Improving Application Performance Using the TAU Performance System

John Linford and Sameer Shende, ParaTools, Inc.

Business and Engineering Complex (BEC), University of Alabama at Birmingham, Tuesday, July 16, 2013, 10am - 11:30am.


The TAU Performance System is a powerful toolkit for performance measurement and analysis of software written in a variety of languages and executing at all scales. This presentation will introduce TAU's profiling, tracing, and debugging support, with a focus on performance data collection, analysis, and program optimization. We will cover performance evaluation of parallel programs written in Python, Fortran, C++, C, and UPC that use MPI and other runtime layers such as CUDA, OpenCL, SHMEM, and OpenMP. We will also demonstrate TAU's techniques for program instrumentation, including automatic instrumentation of source code, compiler-based instrumentation, binary rewriting, library preloading for CUDA instrumentation, and native and offloading modes for Intel Xeon Phi coprocessors. We will show how to gather performance data covering MPI timings, GPGPU transfers, runtime bounds checking, I/O and memory usage, and hardware performance counters from PAPI. Finally, we will demonstrate TAU's support for memory debugging, I/O evaluation, and tracking call stacks at the point of program failure to isolate runtime faults.