Performance Analysis of OpenSHMEM Applications with TAU Commander

The TAU Performance System (TAU) is a powerful and highly versatile ecosystem of profiling and tracing tools for performance engineering of parallel programs. Developed over the last twenty years, TAU has evolved with each new generation of HPC systems and scales efficiently to hundreds of thousands of cores. This organic growth has produced a loosely coupled software toolbox whose complexity and vast array of features often intimidate and frustrate novice users. To lower the barrier to entry, ParaTools and the US Department of Energy have developed “TAU Commander,” a performance engineering workflow manager that facilitates a systematic approach to performance engineering, guides users through common profiling and tracing workflows, and offers constructive feedback in case of error. This work compares TAU and TAU Commander workflows for common performance engineering tasks in OpenSHMEM applications. It demonstrates workflows that target two different SHMEM implementations, Intel Xeon “Haswell” and “Knights Landing” processors, and direct and indirect measurement methods, and that collect callsite information, profiles, and traces.