Search results

  2023

  • Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations

    Boito, F., Brandt, J., Cardellini, V., Carns, P., Ciorba, F. M., Egan, H., Eleliemy, A., Gentile, A., Gruber, T., Hanson, J., Haus, U. U., Huck, K., Ilsche, T., Jakobsche, T., Jones, T., Karlsson, S., Mueen, A., Ott, M., Patki, T., Raghavan, K., Simms, S., Shoga, K., Showerman, M., Tiwari, D., Wilde, T. & Yamamoto, K., 2023, Proceedings of 2023 IEEE International Conference on Cluster Computing Workshops and Posters. IEEE, p. 37-43

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Challenges in HPCQC Integration

    Elsharkawy, A., To, X-T. M., Seitz, P., Chen, Y., Stade, Y., Geiger, M., Huang, Q., Guo, X., Ansari, M. A., Ruefenacht, M., Schulz, L., Karlsson, S., Mendl, C. B., Kranzlmüller, D. & Schulz, M., 22 Sept 2023, Proceedings of 2023 IEEE International Conference on Quantum Computing and Engineering. IEEE, p. 405-406 2 p. 10313875

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Design of an Application-specific VLIW Vector Processor for ORB Feature Extraction

    Ferreira, L., Malkowsky, S., Persson, P., Karlsson, S., Åström, K. & Liu, L., 2023, In: Journal of Signal Processing Systems. 95, p. 863–875

    Research output: Contribution to journal, Journal article, Research, peer-review

    Open Access
  • Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies

    Rydahl, A. & Karlsson, S., 2023, OpenMP: Advanced Task-Based, Device and Compiler Programming. McIntosh-Smith, S., Deakin, T., Klemm, M., de Supinski, B. R. & Klinkenberg, J. (eds.). Springer, p. 129-143 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14114 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Modeling of Errors in Quantum Computers with Generated Structural Circuits

    Schneider, J., Gammelmark, M. & Karlsson, S., 2023, Proceedings of 2023 IEEE International Conference on Quantum Computing and Engineering (QCE). IEEE, p. 122-126 5 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • OpenMP Target Offload Utilizing GPU Shared Memory

    Gammelmark, M., Rydahl, A. & Karlsson, S., 2023, 19th International Workshop on OpenMP. Springer, Vol. 14114. p. 114-128

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Simple and efficient GPU accelerated topology optimisation: Codes and applications

    Träff, E. A., Rydahl, A., Karlsson, S., Sigmund, O. & Aage, N., 2023, In: Computer Methods in Applied Mechanics and Engineering. 410, 26 p., 116043.

    Research output: Contribution to journal, Journal article, Research, peer-review

    Open Access

  2022

  • Feasibility Studies in Multi-GPU Target Offloading

    Rydahl, A., Gammelmark, M. & Karlsson, S., 2022, OpenMP in a Modern World: From Multi-device Support to Meta Programming - 18th International Workshop on OpenMP, IWOMP 2022, Proceedings. Klemm, M., de Supinski, B. R., Klinkenberg, J. & Neth, B. (eds.). Springer, p. 81-93 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 13527 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  2021

  • Energy-Efficient Application-Specific Instruction-Set Processor for Feature Extraction in Smart Vision Systems

    Ferreira, L., Malkowsky, S., Persson, P., Karlsson, S., Åström, K. & Liu, L., 2021, Proceedings of 55th Asilomar Conference on Signals, Systems, and Computers. IEEE, p. 324-328

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  2017

  • Improving Loop Dependence Analysis

    Jensen, N. B. & Karlsson, S., 2017, In: ACM Transactions on Architecture and Code Optimization. 14, 3, p. 1-24 24 p.

    Research output: Contribution to journal, Journal article, Research, peer-review

  2016

  • A scalable lock-free hash table with open addressing

    Nielsen, J. P. & Karlsson, S., 2016, Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. Association for Computing Machinery, p. 1-2 2 p. 33. (ACM SIGPLAN Notices).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Towards Unifying OpenMP Under the Task-Parallel Paradigm: Implementation and Performance of the taskloop Construct

    Podobas, A. & Karlsson, S., 2016, OpenMP: Memory, Devices, and Tasks. Springer, Vol. 9903. p. 116-129 (Lecture Notes in Computer Science, Vol. 9903).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  2015

  • A Scalable Prescriptive Parallel Debugging Model

    Jensen, N. B., Quarfot Nielsen, N., Lee, G. L., Karlsson, S., Legendre, M., Schulz, M. & Ahn, D. H., 2015, Proceedings of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015). IEEE, p. 473-483

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Experiences with Compiler Support for Processors with Exposed Pipelines

    Jensen, N. B., Schleuniger, P., Hindborg, A. E., Walter, M. & Karlsson, S., 2015, Proceedings of the 29th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2015). IEEE, p. 137-143

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Hardware Transactional Memory Optimization Guidelines, Applied to Ordered Maps

    Bonnichsen, L. F., Probst, C. W. & Karlsson, S., 2015, Proceedings of the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2015). IEEE, Vol. 3. p. 124-131

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

    Open Access

  2014

  • A Synthesizable Multicore Platform for Microwave Imaging

    Schleuniger, P. & Karlsson, S., 2014, Reconfigurable Computing: Architectures, Tools, and Applications. Proceedings. Springer, p. 197-204 (Lecture Notes in Computer Science, Vol. 8405).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Automatic generation of application specific FPGA multicore accelerators

    Hindborg, A. E., Schleuniger, P., Jensen, N. B., Walter, M., Brock-Nannestad, L., Bonnichsen, L. F., Probst, C. W. & Karlsson, S., 2014, Conference Record of the 48th Asilomar Conference on Signals, Systems & Computers. Matthews, M. B. (ed.). IEEE, p. 1440-1444

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Code Commentary and Automatic Refactorings using Feedback from Multiple Compilers

    Jensen, N. B., Probst, C. W. & Karlsson, S., 2014, Proceedings of the 7th Swedish Workshop on Multicore Computing (MCC'14). 4 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Collaborative Compiler Vectorization

    Jensen, N. B., Probst, C. W. & Karlsson, S., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research

    Open Access
  • Compiler Feedback using Continuous Dynamic Compilation during Development

    Jensen, N. B., Karlsson, S. & Probst, C. W., 2014, Proceedings - Workshop on Dynamic Compilation Everywhere. 12 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • DySectAPI: Scalable Prescriptive Debugging

    Jensen, N. B., Karlsson, S., Quarfot Nielsen, N., Lee, G. L., Ahn, D. H., Legendre, M. & Schulz, M., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research, peer-review

    Open Access
  • DySectAPI: Scalable Prescriptive Debugging

    Jensen, N. B., Karlsson, S., Quarfot Nielsen, N., Lee, G. L., Ahn, D. H., Legendre, M. & Schulz, M., 2014. 2 p.

    Research output: Contribution to conference, Conference abstract for conference, Research, peer-review

    Open Access
  • ELB-trees - Efficient Lock-free B+trees

    Bonnichsen, L. F., Karlsson, S. & Probst, C. W., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research

    Open Access
  • ELB-trees - Efficient Lock-free B+trees

    Bonnichsen, L. F., Karlsson, S. & Probst, C. W., 2014, Proceedings of the 10th International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES 2014). 4 p.

    Research output: Chapter in Book/Report/Conference proceeding, Conference abstract in proceedings, Research, peer-review

  • Exposing MPI Objects for Debugging

    Brock-Nannestad, L., DelSignore, J., Squyres, J. M., Karlsson, S. & Mohror, K., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research, peer-review

    Open Access
  • Exposing MPI Objects for Debugging

    Brock-Nannestad, L., DelSignore, J., Squyres, J. M., Karlsson, S. & Mohror, K., 2014. 2 p.

    Research output: Contribution to conference, Conference abstract for conference, Research, peer-review

    Open Access
  • Hardware Realization of an FPGA Processor - Operating System Call Offload and Experiences

    Hindborg, A. E. & Karlsson, S., 2014, Proceedings of the 10th International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES 2014). 4 p.

    Research output: Chapter in Book/Report/Conference proceeding, Conference abstract in proceedings, Research, peer-review

  • Hardware Realization of an FPGA Processor - Operating System Call Offload and Experiences

    Hindborg, A. E. & Karlsson, S., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research

    Open Access
  • Hardware Realization of an FPGA Processor – Operating System Call Offload and Experiences

    Hindborg, A. E., Schleuniger, P., Jensen, N. B. & Karlsson, S., 2014, Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing (DASIP). Morawiec, A. & Hinderscheit, J. (eds.). IEEE, 8 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Library Support for Resource Constrained Accelerators

    Brock-Nannestad, L. & Karlsson, S., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research, peer-review

    Open Access
  • Library Support for Resource Constrained Accelerators

    Brock-Nannestad, L. & Karlsson, S., 2014, Using and Improving OpenMP for Devices, Tasks, and More: Proceedings of the 10th International Workshop on OpenMP, IWOMP 2014. DeRose, L., Supinski, B. R. D., Olivier, S. L., Chapman, B. M. & Müller, M. S. (eds.). Springer, p. 187-201 (Lecture Notes in Computer Science; No. 8766).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • MPI Debugging with Handle Introspection

    Brock-Nannestad, L., DelSignore, J., Squyres, J. M., Karlsson, S. & Mohror, K., 2014. 4 p.

    Research output: Contribution to conference, Paper, Research, peer-review

    Open Access
  • Safe Asynchronous System Calls

    Brock-Nannestad, L. & Karlsson, S., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research, peer-review

    Open Access
  • Safe Asynchronous System Calls - extended abstract

    Brock-Nannestad, L. & Karlsson, S., 2014. 4 p.

    Research output: Contribution to conference, Conference abstract for conference, Research, peer-review

    Open Access
  • Smart Multicore Embedded Systems

    Torquati, M. (ed.), Bertels, K. (ed.), Karlsson, S. (ed.) & Pacull, F. (ed.), 2014, Springer. 175 p.

    Research output: Book/Report, Book, Research, peer-review

  • Testing Infrastructure for Operating System Kernel Development

    Walter, M. & Karlsson, S., 2014, Proceedings of the 7th Swedish Workshop on Multicore Computing (MCC'14). 4 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Unit Testing Framework for Operating System Kernels

    Walter, M. & Karlsson, S., 2014. 1 p.

    Research output: Contribution to conference, Conference abstract for conference, Research, peer-review

    Open Access
  • Unit Testing Framework for Operating System Kernels

    Walter, M. & Karlsson, S., 2014. 1 p.

    Research output: Contribution to conference, Poster, Research, peer-review

    Open Access

  2013

  • ELB-trees an efficient and lock-free B-tree derivative

    Bonnichsen, L. F., Karlsson, S. & Probst, C. W., 2013, 2013 IEEE 6th International Workshop on Multi-/Many-core Computing Systems (MuCoCoS). IEEE, 10 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • ELB-trees - Efficient Lock-free B+trees

    Bonnichsen, L. F., Karlsson, S. & Probst, C. W., 2013. 1 p.

    Research output: Contribution to conference, Poster, Research

    Open Access
  • Synthetic Aperture Radar Data Processing on an FPGA Multi-Core System

    Schleuniger, P., Kusk, A., Dall, J. & Karlsson, S., 2013, Architecture of Computing Systems – ARCS 2013: 26th International Conference, Prague, Czech Republic, February 19-22, 2013. Proceedings. Springer, p. 74-85 (Lecture Notes in Computer Science, Vol. 7767).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Task Based Programming on Embedded Multicores

    Schleuniger, P. & Karlsson, S., 2013. 1 p.

    Research output: Contribution to conference, Poster, Research

    Open Access

  2012

  • Design Principles for Synthesizable Processor Cores

    Schleuniger, P., McKee, S. A. & Karlsson, S., 2012, Architecture of Computing Systems – ARCS 2012: 25th International Conference Munich, Germany, February 28 – March 2, 2012 Proceedings. Springer, p. 111-122 (Lecture Notes in Computer Science, Vol. 7179).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Guiding Programmers to Higher Memory Performance

    Jensen, N. B., Larsen, P., Ladelsky, R., Zaks, A. & Karlsson, S., 2012, Proceedings of 5th Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG-12). 12 p.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Parallelizing More Loops with Compiler Guided Refactoring

    Larsen, P., Ladelsky, R., Lidman, J., McKee, S. A., Karlsson, S. & Zaks, A., 2012, 2012 41st International Conference on Parallel Processing (ICPP). IEEE, p. 410-419 (International Conference on Parallel Processing. Proceedings).

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Software Managed Cache for Parallel Systems

    Schleuniger, P. & Karlsson, S., 2012. 1 p.

    Research output: Contribution to conference, Poster, Research

    Open Access

  2011

  • Adapt or Become Extinct! The Case for a Unified Framework for Deployment-Time Optimization

    Goumas, G., McKee, S. A., Själander, M., Gross, T. R., Karlsson, S., Probst, C. W. & Zhang, L., 2011, EXADAPT '11 Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era. University of Strathclyde

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Automatic Loop Parallelization via Compiler Guided Refactoring

    Larsen, P., Ladelsky, R., Lidman, J., McKee, S. A., Karlsson, S. & Zaks, A., 2011, Kgs. Lyngby, Denmark: Technical University of Denmark. (IMM-Technical Report-2011; No. 12).

    Research output: Book/Report, Report, Research

  • Comparing the Overhead of Lock-based and Lock-free Implementations of Priority Queues

    Passas, S. & Karlsson, S., 2011, Proceedings of Fourth Workshop on Programmability Issues for Heterogeneous Multicores.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

  • Compiler Driven Code Comments and Refactoring

    Larsen, P., Ladelsky, R., Karlsson, S. & Zaks, A., 2011, Proceedings of Fourth Workshop on Programmability Issues for Heterogeneous Multicores.

    Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review