https://www.dontknow.de

Publications

Journals

  1. Bronis R. de Supinski, Thomas R.W. Scogland, Alejandro Duran, Michael Klemm, Sergi Mateo, Stephen L. Olivier, Christian Terboven, and Timothy Mattson. The Ongoing Evolution of OpenMP. Proceedings of the IEEE, 106(11):2004-2019, November 2018.
  2. Bo Peng, Niranjan Govinda, Edoardo Aprà, Michael Klemm, Jeff R. Hammond, and Karol Kowalski. Coupled Cluster Studies of Electron Affinities and Ionization Potentials of Single Walled Carbon Nanotubes. Journal of Physical Chemistry A, 121(6):1328-1335, February 2017.
  3. Alexander Heinecke, Michael Klemm, and Hans-Joachim Bungartz. From GPGPUs to Many-Core: NVIDIA Fermi* and Intel® Many Integrated Core Architecture. Computing in Science and Engineering, 14(2):78-83, March-April 2012.
  4. Jean Christophe Beyler, Michael Klemm, Philippe Clauss, and Michael Philippsen. A Meta-predictor Framework for Prefetching in Object-based DSMs. Concurrency and Computation: Practice and Experience, 21(14):1789-1803, September 2009.
  5. Michael Klemm, Matthias Bezold, Stefan Gabriel, Ronald Veldema, and Michael Philippsen. Reparallelization Techniques for Migrating OpenMP Codes in Computational Grids. Concurrency and Computation: Practice and Experience, 21(3):281-299, March 2009.
  6. Michael Klemm, Ronald Veldema, and Michael Philippsen. Cluster Research at the Programming Systems Group. High Performance Computing at RRZE, pages 30-31, 2008.
  7. Michael Klemm, Matthias Bezold, Ronald Veldema, and Michael Philippsen. JaMP: An Implementation of OpenMP for a Java DSM. Concurrency and Computation: Practice and Experience, 18(19):2333-2352, April 2007.

 

Conferences and Workshops

  1. Vishakha Agrawal, Michael J. Voss, Pablo Reble, Vasanth Tovinkere, Jeff Hammond, and Michael Klemm. Visualization of OpenMP Task Dependencies using Intel Advisor Flow Graph Analyzer. In Bronis R. de Supinski, Pedro Valero-Lara, Xavier Martorell, Sergi Mateo Bellido, and Jesus Labarta, editors, Evolving OpenMP for Evolving Architectures, pages 175-188, Barcelona, Spain, September 2018. LNCS 11128.
  2. Jannis Klinkenberg, Philipp Samfass, Christian Terboven, Alejandro Duran, Michael Klemm, Xavier Teruel, Sergi Mateo, Stephen L. Olivier, and Matthias S. Müller. Assessing Task-to-Data Affinity in LLVM OpenMP. In Bronis R. de Supinski, Pedro Valero-Lara, Xavier Martorell, Sergi Mateo Bellido, and Jesus Labarta, editors, Evolving OpenMP for Evolving Architectures, pages 236-251, Barcelona, Spain, September 2018. LNCS 11128.
  3. Josef Weidendorfer, Carsten Trinitis, Sebastian Rückerl, and Michael Klemm. Cache-Partitionierung im Kontext von Co-Scheduling. In Gesellschaft für Informatik e.V., editor, Parallel-Algorithmen und Rechnerstrukturen, number 24 in Mitteilungen, January 2018. ISSN 0177-0454.
  4. Ravi Ojha, Prasad Pawar, Sonia Gupta, Michael Klemm, and Manoj Nambiar. Performance Optimization of OpenFOAM on Clusters of Intel Xeon Phi Processors. In Proceedings of the 24th IEEE International Conference on High Performance Computing Workshops, pages 51-59, Jaipur, India, December 2017.
  5. Mikko Byckling, Juhani Kataja, Michael Klemm, and Thomas Zwinger. OpenMP SIMD Vectorization and Threading of the Elmer Finite Element Software. In Bronis R. de Supinski, Stephen L. Olivier, Christian Terboven, Barbara M. Chapman, and Matthias S. Müller, editors, Scaling OpenMP for Exascale Performance and Portability, pages 123-137, September 2017. LNCS 10468.
  6. Jonas Hahnfeld, Tim Cramer, Michael Klemm, Christian Terboven, and Matthias S. Müller. A Pattern for Overlapping Communication and Computation with OpenMP Target Directives. In Bronis R. de Supinski, Stephen L. Olivier, Christian Terboven, Barbara M. Chapman, and Matthias S. Müller, editors, International Workshop on OpenMP, pages 325-337, September 2017. LNCS 10468.
  7. Eric J. Bylaska, Mathias Jacquelin, Wibe A. de Jong, Jeff R. Hammond, and Michael Klemm. Performance Evaluation of NWChem Ab-Initio Molecular Dynamics (AIMD) Simulations on the Intel Xeon Phi Processor. In High Performance Computing, pages 404-418, Frankfurt, Germany, June 2017. Best paper, LNCS 10524.
  8. Matthias Noack, Florian Wende, Georg Zitzlsberger, Michael Klemm, and Thomas Steinke. KART - A Runtime Compilation Library for Improving HPC Application Performance. In High Performance Computing, pages 389-403, Frankfurt, Germany, June 2017. Best paper.
  9. Christian Terboven, Jonas Hahnfeld, Xavier Teruel, Sergi Mateo, Alejandro Duran, Michael Klemm, Stephen L. Olivier, and Bronis R. de Supinski. Approaches for Task Affinity in OpenMP. In Naoya Maruyama, Bronis R. de Supinski, and Mohamed Wahib, editors, Proceedings of the 12th International Workshop on OpenMP, pages 102-115, October 2016. LNCS 9903.
  10. Florian Wende, Matthias Noack, Thomas Steinke, Michael Klemm, Chris J. Newburn, and Georg Zitzlsberger. Portable SIMD Performance with OpenMP* 4.x Compiler Directives. In Euro-Par 2016: Parallel Processing, pages 264-277, Grenoble, France, August 2016. LNCS 9833.
  11. Michael Klemm, Freddie Witherden, and Peter Vincent. Using the pyMIC Offload Module in PyFR. In Proceedings of the 8th European Conference on Python in Science, Cambridge, UK, July 2016. arXiv:1607.00844 [cs.MS].
  12. Nishant Agrawal, Ambuj Pandey, Ravi Ojha, Rihab Abdul Razak, Paul Edwards, and Michael Klemm. Performance Evaluation of OpenFOAM with MPI-3 RMA Routines on Intel Xeon Processors and Intel Xeon Phi Coprocessors. Bordeaux, France, September 2015. Poster at Euro-MPI 2015.
  13. Barna L. Bihari, Hansang Bae, James Cownie, Michael Klemm, Christian Terboven, and Lori Diachin. On the Algorithmic Aspects of Using OpenMP Synchronization Mechanisms II: User-Guided Speculative Locks. In Christian Terboven, Bronis R. de Supinski, Pablo Reble, and Matthias S. Müller, editors, Proceedings of the 11th International Workshop on OpenMP, pages 133-148, Aachen, Germany, September 2015. LNCS 9342.
  14. Gaurav Bansal, Anand Deshpande, Paul Edwards, Alexander Heinecke, Michael Klemm, Dheevatsa Mudigere, Elmoustapha Ould ahmed vall, Mikhail Smelyanskiy, Michael Steyer, Nishant Agrawal, Ravi Ojha, Ambuj Pandey, Rihab Abdul Razak, Juan J. Alonso, Thomas D. Economon, Francisco Palacios, and David Keyes. Accelerating Computational Fluid Dynamics Codes on Multi-/Many-Core Intel Platforms. In 27th International Conference on Parallel Computational Fluid Dynamics, Montreal, Quebec, Canada, May 2015. Extended abstract.
  15. Bernd Hentschel, Jens Henrik Göbbert, Paul Springer, Andrea Schnorr, and Torsten W. Kuhlen. Packet-Oriented Streamline Tracing on Modern SIMD Architectures. In Eurographics Symposium on Parallel Graphics and Visualization, pages 43-52, Cagliari, Sardinia, Italy, May 2015.
  16. Edoardo Aprà, Michael Klemm, and Karol Kowalski. Efficient Implementation of Many-body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, New Orleans, LA, November 2014.
  17. Michael Klemm and Jussi Enkovaara. pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor. In 4th Workshop on Python for High Performance and Scientific Computing, New Orleans, LA, November 2014. Available online at http://www.dlr.de/sc/Portaldata/15/Resources/dokumente/pyhpc2014/submissions/pyhpc2014_submission_8.pdf.
  18. Hansang Bae, James H. Cownie, Michael Klemm, and Christian Terboven. A User-guided Locking API for the OpenMP* Application Program Interface. In Luiz DeRose, Bronis R. de Supinski, Stephen L. Olivier, Barbara M. Chapman, and Matthias S. Müller, editors, Using and Improving OpenMP for Devices, Tasks, and More, pages 173-186, Salvador, Brazil, September 2014. LNCS 8766.
  19. Xavier Teruel, Michael Klemm, Kelvin Li, Xavier Martorell, Stephen L. Olivier, and Christian Terboven. A Proposal for Task-Generating Loops in OpenMP. In A.P. Rendell et al., editor, Proceedings of the 9th International Workshop on OpenMP, pages 1-14, Canberra, Australia, September 2013. LNCS 8122.
  20. Alexander Heinecke, Dirk Pflüger, Dmitry Budnikov, Michael Klemm, Arik Narkis, Maxim Shevtsov, and Ayal Zaks. Demonstrating Performance Portability of a Custom OpenCL Data Mining Application to the Intel Xeon Phi Coprocessor. In Proceedings of the International Workshop on OpenCL 2013 & 2014, Atlanta, GA, May 2013.
  21. Tim Cramer, Dirk Schmidl, Michael Klemm, and Dieter an Mey. OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison. In MARC 2012, pages 38-44, Aachen, Germany, November 2012. Available online at http://nbn-resolving.de/urn/resolver.pl?urn=urn:nbn:de:hbz:82-opus-43835.
  22. Michael Klemm, Alejandro Duran, Xinmin Tian, Hideki Saito, Diego Caballero, and Xavier Martorell. Extending OpenMP with Vector Constructs for Modern Multicore SIMD Architectures. In B.M. Chapman et al., editor, Proceedings of the 8th International Workshop on OpenMP, pages 59-72, Rome, Italy, June 2012. LNCS 7312.
  23. Hans Pabst, Bev Bachmayer, and Michael Klemm. Performance of a Structure-detecting SpMV using the CSR Matrix Representation. In Proceedings of the 11th Symposium Parallel and Distributed Computing, pages 3-10, Munich, Germany, June 2012.
  24. Alexander Heinecke, Michael Klemm, Hans Pabst, and Dirk Pflüger. Towards High-performance Implementations of a Custom HPC Kernel using Intel® Array Building Blocks. In Facing the Multicore Challenge II, pages 36-47, Karlsruhe, Germany, May 2012. Springer. LNCS 7174.
  25. Alexander Heinecke, Michael Klemm, Dirk Pflüger, Arndt Bode, and Hans-Joachim Bungartz. Extending a Highly Parallel Data Mining Algorithm to the Intel® Many Integrated Core Architecture. In Euro-Par 2011: Parallel Processing Workshops, pages 375-384, Bordeaux, France, August 2011. LNCS 7156.
  26. Alejandro Duran, Roger Ferrer, Michael Klemm, Bronis R. de Supinski, and Eduard Ayguade. A Proposal for User-defined Reductions in OpenMP. In Proceedings of the 6th International Workshop on OpenMP, pages 43-55, Tsukuba, Japan, June 2010. LNCS 6132.
  27. Michael Wong, Michael Klemm, Alejandro Duran, Tim Mattson, Grant Haab, Bronis R. de Supinski, and Andrey Churbanov. Towards an Error Model for OpenMP. In Proceedings of the 6th International Workshop on OpenMP, pages 70-82, Tsukuba, Japan, June 2010. LNCS 6132.
  28. Georg Dotzler, Ronald Veldema, and Michael Klemm. JCudaMP: OpenMP/Java on CUDA. In Victor Pankratius, Michael Philippsen, and ACM/IEEE, editors, Proceedings of the Third International Workshop on Multicore Software Engineering, pages 10-17, Cape Town, South Africa, May 2010. No page numbers.
  29. Tobias Werth, Tobias Floßmann, Michael Klemm, Dominic Schell, Ulrich Weigand, and Michael Philippsen. Dynamic Code Footprint Optimization for the IBM Cell Broadband Engine. In Adam Porter, Lawrence Votta, and Victor Pankratius, editors, Proceedings of the ICSE Workshop on Multicore Software Engineering, pages 64-72, Vancouver, Canada, May 2009.
  30. Michael Klemm, Ronald Veldema, and Michael Philippsen. An Automatic Cost-based Framework for Seamless Application Migration in Grid Environments. In Teofilo Gonzalez, editor, Proceedings of the 20th IASTED International Conference on Parallel and Distributed Computing and Systems, pages 219-224, November 2008.
  31. Jean Christophe Beyler, Michael Klemm, Philippe Clauss, and Michael Philippsen. Automatic Prefetching with Binary Code Rewriting in Object-based DSMs. In Emilio Luque, Tomàs Margalef, and Domingo Benítez, editors, Proceedings of the Euro-Par 2008 Conference, pages 643-653, Heidelberg, Germany, August 2008. Springer.
  32. Michael Klemm, Jean Christophe Beyler, Ronny T. Lampert, Michael Philippsen, and Philippe Clauss. Esodyp+: Prefetching in the Jackal Software DSM. In Anne-Marie Kermarrec, Luc Bougé, and Thierry Priol, editors, Proceedings of the Euro-Par 2007 Conference, pages 563-573, New York, NY, USA, August 2007. Springer.
  33. Michael Klemm, Matthias Bezold, Stefan Gabriel, Ronald Veldema, and Michael Philippsen. Reparallelization and Migration of OpenMP Programs. In Bruno Schulze, Rajkumar Buyya, Philippe Navaux, Walfredo Cirne, and Vinod Rebello, editors, Proceedings of the 7th International Symposium on Cluster Computing and the Grid, pages 529-537, New York, NY, USA, May 2007. IEEE Computer Society.
  34. Michael Klemm and Michael Philippsen. Reparallelisierung und Migration von OpenMP-Applikationen. In Gesellschaft für Informatik e.V., editor, Parallel-Algorithmen und Rechnerstrukturen, number 24 in Mitteilungen, pages 65-76, May 2007. ISSN 0177-0454.
  35. Michael Klemm, Ronald Veldema, Matthias Bezold, and Michael Philippsen. A Proposal for OpenMP for Java. In Université de Reims, editor, Proceedings of the 2nd International Workshop on OpenMP, June 2006. Published on CD-ROM.
  36. Michael Klemm, Matthias Bezold, Ronald Veldema, and Michael Philippsen. JaMP: An Implementation of OpenMP for a Java DSM. In Manuel Arenaz, Ramón Doallo, Basilio B. Fraguela, and Juan Touriño, editors, Proceedings of the 12th Workshop on Compilers for Parallel Computers, pages 242-255, January 2006.
  37. Michael Klemm, Ronald Veldema, and Michael Philippsen. Latency Reduction in Software-DSMs by Means of Dynamic Function Splicing. In Teofilo Gonzalez and IASTED, editors, Proceedings of the 16th IASTED International Conference on Parallel and Distributed Computing and Systems, pages 362-367, Cambridge, MA, USA, November 2004.

 

Books and Book Chapters

  1. Michael Klemm and Bronis R. de Supinski, editors. OpenMP Application Programming Interface Specification Version 5.0. OpenMP Architecture Review Board, February 2019. ISBN 978-1-79575988-5.
  2. Eric J. Bylaska, Edoardo Aprà, Karol Kowalski, Mathias Jacquelin, Wibe A. de Jong, Abhinav Vishnu, Bruce Palmer, Tjerk P. Straatsma, Jeff R. Hammond, and Michael Klemm. Exascale Scientific Applications: Scalability and Performance Portability, chapter Transitioning NWChem to the Next Generation of Many Core Machines. November 2017. ISBN 978-1-138-19754-1.
  3. Michael Klemm and Christopher Dahnken. Co-scheduling of HPC Applications, chapter Recent Processor Technologies and Co-Scheduling. January 2017. ISBN 978-1-61499-729-0 (paperback), ISBN 978-1-61499-730-6 (electronic).
  4. Jussi Enkovaara, Michael Klemm, and Freddie Witherden. High Performance Parallelism Pearls Volume Two: Multicore and Many-core Programming Approaches, chapter High Performance Python Offloading, pages 243-269. Morgan Kaufmann, Elsevier, Inc., August 2015. ISBN 978-0-12-803819-2.
  5. Edoardo Aprà, Jeff R. Hammond, Michael Klemm, and Karol Kowalski. High Performance Parallelism Pearls Volume One: Multicore and Many-core Programming Approaches, chapter NWChem: Quantum Chemistry Simulations at Scale, pages 201-223. Morgan Kaufmann, Elsevier, Inc., November 2014. ISBN 978-0-12-802118-7.
  6. Florian Wende, Michael Klemm, Thomas Steinke, and Alexander Reinefeld. High Performance Parallelism Pearls, chapter Concurrent Kernel Offloading, pages 287-306. Morgan Kaufmann, Elsevier, Inc., November 2014. ISBN 978-0-12-802118-7.
  7. Alexander Supalov, Andrey Semin, Michael Klemm, and Christopher Dahnken. Optimizing HPC Applications with Intel® Cluster Tools. Apress Media, October 2014. ISBN 978-1-4302-6496-5 (paperback), ISBN 978-1-4302-6497-2 (electronic).
  8. Michael Klemm. Reparallelization and Migration of OpenMP Applications in Grid Environments, March 2009. Shaker Verlag, Aachen. Doctoral Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany.

 

Technical Reports

  1. Matthias Noack, Florian Wende, Georg Zitzlsberger, Michael Klemm, and Thomas Steinke. KART - A Runtime Compilation Library for Improving HPC Application Performance. Technical report, ZIB, Takustr.7, 14195 Berlin, October 2016. ZIB Report 16-48. Available at https://opus4.kobv.de/opus4-zib/frontdoor/index/index/docId/6073.

 

Miscellaneous Publications

  1. Michael Klemm and Christian Terboven. From Task ’til Dawn. Heise Developer, June 2018. Available online at https://www.heise.de/developer/artikel/From-Task-til-Dawn-Tasks-versus-Threads-4075780.html.
  2. Michael Klemm, Matthijs van Waveren, and James Cownie. Twenty Years of the OpenMP API. Scientific Computing World, 159:16-18, April/May 2018. Available online at https://www.scientific-computing.com/feature/twenty-years-openmp-api.
  3. Michael Klemm and Dirk Schmidl. Parallele Performance: So geht’s. Informatik Aktuell, June 2017. Available online at https://www.informatik-aktuell.de/entwicklung/methoden/parallele-performance-so-gehts.html.
  4. Michael Klemm, Alejandro Duran, Ravi Narayanaswamy, Xinmin Tian, and Terry Wilmarth. The Present and Future of the OpenMP API Specification. Intel Parallel Universe Magazine, 27:5-16, January 2017. Available online at https://software.intel.com/sites/default/files/managed/75/b0/parallel-universe-issue-27.pdf.
  5. Michael Klemm and Christian Terboven. OpenMP API Version 4.5 - A Standard Evolves. Intel Parallel Universe Magazine, 24:23-31, March 2016. Available online at https://software.intel.com/sites/default/files/managed/65/20/v3-theparalleluniverse-issue-24.pdf.
  6. Michael Klemm and Christian Terboven. OpenMP 4.5: Eine kompakte Übersicht zu den Neuerungen. Heise Developer, November 2015. Available online at http://www.heise.de/developer/artikel/OpenMP-4-5-Eine-kompakte-Uebersicht-zu-den-Neuerungen-3020235.html.
  7. Michael Klemm. Berechnung abladen erlaubt. Entwickler, 2015(1):74-79, January/February 2015.
  8. James H. Cownie, Alejandro Duran, Michael Klemm, and Luke Lin. An OpenMP Timeline. Intel Parallel Universe Magazine, 18:41, June 2014. Available online at https://software.intel.com/sites/default/files/managed/6a/78/parallel_mag_issue18.pdf.
  9. Michael Klemm and Christian Terboven. Full Throttle: OpenMP 4.0. Intel Parallel Universe Magazine, 16:6-16, November 2013. Available online at http://download-software.intel.com/sites/default/files/managed/64/cc/parallel_mag_issue16.pdf.
  10. Michael Klemm and Christian Terboven. Gas gegeben - Die wichtigsten Neuerungen von OpenMP 4.0. Heise Developer, July 2013. Available online at http://www.heise.de/developer/artikel/Die-wichtigsten-Neuerungen-von-OpenMP-4-0-1915844.html.
  11. Michael Klemm and Maxim Shevtsov. Für alle – Heterogene Programmierung mit OpenCL. iX, 1(2013):145-152, December 2012.
  12. Michael Klemm. Bremse oder Gaspedal? – Performanceanalysetools im Einsatz. Entwickler, 6(2012):83-89, November/December 2012.

Contact

 

e-mail:

michael@dontknow.de

 

Matrix:

@michael.klemm:matrix.org

 

LinkedIn:

http://bit.ly/mklemm

 

orcid:

0000-0002-8634-4634

 

GPG public key

 

S/MIME public key

© Dr.-Ing. Michael Klemm