Abstract
Original language | English |
---|---|
Title of host publication | Proceedings of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015) |
Publisher | IEEE |
Publication date | 2015 |
Pages | 473-483 |
ISBN (Print) | 978-1-4799-8648-4 |
DOIs | 10.1109/IPDPS.2015.15 |
Publication status | Published - 2015 |
Event | 29th IEEE International Parallel and Distributed Processing Symposium, Hyderabad, India. Duration: 25 May 2015 → 29 May 2015. Conference number: 29. http://www.ipdps.org/ipdps2015/2015_advance_program.html, http://www.ipdps.org/ |
Conference
Conference | 29th IEEE International Parallel and Distributed Processing Symposium |
---|---|
Number | 29 |
Country | India |
City | Hyderabad |
Period | 25/05/2015 → 29/05/2015 |
Cite this
A Scalable Prescriptive Parallel Debugging Model. / Jensen, Nicklas Bo; Quarfot Nielsen, Niklas ; Lee, Gregory L.; Karlsson, Sven ; Legendre, Matthew ; Schulz, Martin; Ahn, Dong H.
Proceedings of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015). IEEE, 2015. p. 473-483.
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
TY - GEN
T1 - A Scalable Prescriptive Parallel Debugging Model
AU - Jensen, Nicklas Bo
AU - Quarfot Nielsen, Niklas
AU - Lee, Gregory L.
AU - Karlsson, Sven
AU - Legendre, Matthew
AU - Schulz, Martin
AU - Ahn, Dong H.
PY - 2015
Y1 - 2015
N2 - Debugging is a critical step in the development of any parallel program. However, the traditional interactive debugging model, where users manually step through code and inspect their application, does not scale well even for current supercomputers due to its centralized nature. While lightweight debugging models, which have been proposed as an alternative, scale well, they can currently only debug a subset of bug classes. We therefore propose a new model, which we call prescriptive debugging, to fill the gap between these two approaches. This user-guided model allows programmers to express and test their debugging intuition in a way that helps to reduce the error space. Based on this model, we introduce a prototype implementation, the DySectAPI, which allows programmers to construct probe trees for automatic, event-driven debugging at scale. In this paper we introduce the concepts behind DySectAPI and, using both experimental results and analytical modelling, show that the DySectAPI implementation can run with low overhead on current systems. The prototype achieves logarithmic scaling, and our predictions indicate that the overhead of the prescriptive debugging model will remain small even on large systems.
U2 - 10.1109/IPDPS.2015.15
DO - 10.1109/IPDPS.2015.15
M3 - Article in proceedings
SN - 978-1-4799-8648-4
SP - 473
EP - 483
BT - Proceedings of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015)
PB - IEEE
ER -