Towards Unifying OpenMP Under the Task-Parallel Paradigm: Implementation and Performance of the taskloop Construct

Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review. Annual report year: 2017

OpenMP 4.5 introduced a task-parallel version of the classical thread-parallel for-loop construct: the taskloop construct. With this new construct, programmers are given the opportunity to choose between the two parallel paradigms to parallelize their for loops. However, it is unclear where and when the two approaches should be used when writing efficient parallel applications. In this paper, we explore the taskloop construct. We study performance differences between traditional thread-parallel for loops and the new taskloop directive. We introduce an efficient implementation and compare our implementation to other taskloop implementations using micro- and kernel-benchmarks, as well as an application. We show that our taskloop implementation on average results in a 3.2% increase in peak performance when compared against corresponding parallel-for loops.
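To make the two paradigms concrete, the following is a minimal sketch of the same loop written once with the thread-parallel `parallel for` directive and once with the task-parallel `taskloop` directive. It is an illustration only, not the implementation or any benchmark from the paper: the function names `axpy_parallel_for` and `axpy_taskloop`, the `grain` parameter, and the choice of an AXPY-style loop body are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Thread-parallel version: the iteration space is divided among the
 * threads of the team created by the parallel region.
 * (Hypothetical example function, not taken from the paper.) */
void axpy_parallel_for(size_t n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    #pragma omp parallel for
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Task-parallel version: one thread encounters the taskloop construct
 * and the iterations are packaged into tasks that any thread of the
 * team may execute. grainsize(grain) requests at least `grain`
 * iterations per task; `grain` is an illustrative tuning parameter.
 * (Hypothetical example function, not taken from the paper.) */
void axpy_taskloop(size_t n, double a, const double *x, double *y,
                   size_t grain)
{
    assert(x != NULL && y != NULL);
    assert(grain > 0);
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop grainsize(grain)
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result. Built with an OpenMP 4.5 compiler (for example with `-fopenmp` under GCC or Clang) the loops run in parallel; without OpenMP support the pragmas are ignored and the loops run serially. Which variant is faster for a given loop is the question the paper investigates, so this sketch makes no performance claim.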
Original language: English
Title of host publication: OpenMP: Memory, Devices, and Tasks
Volume: 9903
Publisher: Springer
Publication date: 2016
Pages: 116-129
ISBN (Print): 978-3-319-45549-5
Publication status: Published - 2016
Event: 12th International Workshop on OpenMP - Nara, Japan
Duration: 5 Oct 2016 to 7 Oct 2016

Conference

Conference: 12th International Workshop on OpenMP
Country: Japan
City: Nara
Period: 05/10/2016 to 07/10/2016
Series: Lecture Notes in Computer Science
Volume: 9903
ISSN: 0302-9743

ID: 134981577