This is a demonstration of how to use MinGW and the TAU Performance System to cross-compile a parallel Linux application for 64-bit Windows with Microsoft MPI.

This work was sponsored by the Microsoft Developer and Platform Evangelism Team.

Download

Compile

These instructions will guide you through cross-compiling Tachyon on a Linux machine for 64-bit Windows with Microsoft MPI. $TACHYON_BUILD is the folder on your Linux machine in which Tachyon will be built.

  1. Install TAU and the MinGW-w64 cross compiler for Windows 64-bit.

  2. Download the Tachyon source code to $TACHYON_BUILD.

  3. Download this patch to $TACHYON_BUILD.

  4. Unpack the source code:

  5. Patch the source code:

  6. Specify your MPI installation directory (e.g. $HOME/software/ms-hpc-2008-sp2):

  7. Specify your TAU Makefile:

  8. Cross-compile Tachyon:

  9. Package Tachyon for transfer to Windows:

Install

You will need both the Tachyon executable file and the supporting MinGW-w64 and TAU libraries. The patch file creates a script in the tachyon folder that automatically collects all the necessary files into a zip file called “tachyon.zip”. %TACHYON_HOME% in these instructions is the Tachyon installation directory on the Windows machine.

  1. Transfer $TACHYON_BUILD/tachyon.zip (or one of the pre-built versions) to your Windows cluster.

  2. Extract the zip file to %TACHYON_HOME%.

  3. Share %TACHYON_HOME% to all cluster nodes with read/write privileges.
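
The packaging script created by the patch is not reproduced here. As a rough illustration of what such a collection step involves, the following Python sketch gathers executables and libraries from a build folder into a single zip file. It is not the script from the patch: the function name `package_build` and the `*.exe` / `*.dll` patterns are assumptions made for this example, and the real script may collect other files as well.

```python
import zipfile
from pathlib import Path


def package_build(build_dir, zip_path, patterns=("*.exe", "*.dll")):
    """Zip every file under build_dir that matches one of the patterns.

    The default patterns are assumptions for illustration only; adjust
    them to whatever your build actually produces.
    Returns the archive-relative names that were added, in sorted order.
    """
    build_dir = Path(build_dir)
    zip_path = Path(zip_path)

    # Collect matches into a set so a file matched twice is stored once.
    matches = sorted(
        {p for pattern in patterns for p in build_dir.rglob(pattern) if p.is_file()}
    )

    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in matches:
            # Never add the archive to itself if it lives in build_dir.
            if path.resolve() == zip_path.resolve():
                continue
            name = path.relative_to(build_dir).as_posix()
            archive.write(path, name)
            added.append(name)
    return added
```

Storing paths relative to the build folder means the archive can be extracted directly into %TACHYON_HOME% without carrying over Linux-side directory names.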

Examples

%TACHYON_HOME%\scenes contains many example scenes you can use to test your installation. Use the job command to submit new jobs to your cluster. Each of the following examples shows the rendered image and the command executed to produce it. These examples were run on a 32-node Cray cluster. Each node has dual-socket, quad-core 3.0 GHz Harpertown CPUs and 16 GB of RAM. The interconnect fabric is 20 Gbps InfiniBand. In these examples, %TACHYON_HOME% is \\cray03\tachyon.

STMVAO-WHITE

stmvao-white

 

Mean Function Time Over All Cluster Nodes

 


DNA

dna

  • Performance results in Paraprof Packed Profile (PPK) format: dna.ppk

 

Mean Function Time Over All Cluster Nodes

 


FOG

fog

  • Performance results in Paraprof Packed Profile (PPK) format: fog.ppk

 

Mean Function Time Over All Cluster Nodes

 


Questions

  • Why does instrumentation make Tachyon larger?
    • These distributions use compiler-based instrumentation, which relies on debugging symbols to discover the name and source code location of a function when it is called. Debugging symbols contribute significantly to the program size.
  • Why does instrumentation make Tachyon slower?
    • These distributions use compiler-based instrumentation, which parses the program’s debug symbols at runtime to determine the name of a function when it is called. TAU implements many clever tricks to make this as fast as possible, but the overhead is still significant. TAU’s source-based instrumentation avoids this runtime lookup and so would remove most of this overhead.
  • Why did you use compiler-based instrumentation instead of source-based instrumentation?
    • Ideally we would have used source-based instrumentation. Source-based instrumentation works by parsing the program source code before it is compiled and inserting calls to the TAU profiling API. This reduces runtime overhead because the function name and call site are resolved at compile time. However, the Windows API headers contain Microsoft-specific syntax that TAU is unable to parse. MinGW implements workarounds in its preprocessor to cope with this syntax, and we are working on porting these workarounds to TAU’s parser.
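
The difference between the two approaches can be illustrated with a small analogy. The sketch below is in Python rather than C and does not use TAU at all; every name in it (`entry_hook`, `instrumented`, `render`, `shade`) is hypothetical. The first half mimics compiler-based instrumentation: a hook fires on every function entry and has to work out the function's name at that moment, much as GCC's `-finstrument-functions` option passes a function address to `__cyg_profile_func_enter`. The second half mimics source-based instrumentation: the name is written into the code once, so nothing has to be looked up when the function runs.

```python
import sys
from collections import Counter

runtime_names = Counter()  # names discovered inside the hook, on every call
static_names = Counter()   # names fixed once, when the function is defined


def shade(x):
    return x * 2


def render(n):
    total = 0
    for i in range(n):
        total += shade(i)
    return total


def entry_hook(frame, event, arg):
    # Compiler-style analogue: the hook receives only a frame and must
    # resolve the function name at runtime, on every single call.
    if event == "call":
        runtime_names[frame.f_code.co_name] += 1


def run_with_hook(func, *args):
    # Install the hook for the duration of one call, then remove it.
    sys.setprofile(entry_hook)
    try:
        return func(*args)
    finally:
        sys.setprofile(None)


def instrumented(name):
    # Source-style analogue: the name is a literal bound at definition
    # time, so the wrapper performs no lookup when the function is called.
    def decorate(func):
        def wrapper(*args, **kwargs):
            static_names[name] += 1
            return func(*args, **kwargs)
        return wrapper
    return decorate


@instrumented("shade")
def shade_src(x):
    return x * 2


@instrumented("render")
def render_src(n):
    total = 0
    for i in range(n):
        total += shade_src(i)
    return total
```

Calling `run_with_hook(render, 5)` and `render_src(5)` records the same call counts in both counters. What differs is where the name comes from: the hook resolves it on each call, while the decorated functions carry it with them from the moment they are defined.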