LAM MPI
The LAM FAQ has a very good section on debugging. This section simply repeats its advice and adds a few hints about making it work on local machines.
You need to make a simple shell script that launches an xterm running
your debugger. The following will do:
#!/bin/sh
xterm -e gdb "$@"
Adjust as necessary to use the correct debugger. Save it as something
like run_gdb.sh and make it executable (chmod +x run_gdb.sh).
You then start your LAM environment with lamboot in the usual way
(see LAM docs) and do
mpirun C run_gdb.sh /path/to/mpi/program
A window will appear for each MPI process, and you can now drive the
debugger in each window in the usual way.
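If the program being debugged takes command-line arguments, the simple wrapper above hands them to gdb itself rather than to the program. A slightly more defensive variant is sketched here, assuming gdb is the debugger (the --args option is gdb-specific) and that xterm is installed. It also checks that an X display is available before trying to open a window. The function name launch_in_debugger is illustrative only.

```sh
#!/bin/sh
# Sketch of a more defensive run_gdb.sh; assumes gdb and xterm are installed.

launch_in_debugger() {
    # Without a display, xterm cannot open a window, so fail early.
    if [ -z "${DISPLAY:-}" ]; then
        echo "run_gdb.sh: DISPLAY is not set; cannot open an xterm" >&2
        return 1
    fi
    # --args tells gdb that everything after the program name is an
    # argument for the program, not a core file or process ID for gdb.
    xterm -T "gdb: $1" -e gdb --args "$@"
}

# Run only when a program was named on the command line.
if [ "$#" -gt 0 ]; then
    launch_in_debugger "$@"
fi
```

Used in place of the two-line script, this lets you append program arguments after the program path on the mpirun command line.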
Getting this to work on the clusters is slightly trickier. First, be
sure to log into the cluster with X forwarding enabled. Second, you need
to be on a cluster with a new enough version of qsub that it
supports the -X flag (check the manpage). Start an interactive
job in the appropriate queue. Something like this will do:
qsub -X -I -qt2
You can't use lamwrapper with the debugger, so you need to
start LAM up by hand (see LAM docs, and look in the file named by the
variable $PBS_NODEFILE to see which nodes you have been
assigned). Once lamboot has succeeded, launch the debugger as
described above.
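Starting LAM by hand inside the interactive job might look like the sketch below. It assumes that PBS writes one hostname per line to $PBS_NODEFILE, repeating a host once for each CPU assigned on it, and that your LAM version accepts "hostname cpu=N" lines in its boot schema. The function name nodefile_to_bhost and the schema file name are illustrative only, so check the LAM docs for your installation before relying on it.

```sh
#!/bin/sh
# Sketch: build a LAM boot schema from the PBS node file, then boot LAM.

# Collapse repeated hostnames into "hostname cpu=N" lines.
nodefile_to_bhost() {
    sort "$1" | uniq -c | awk '{ print $2 " cpu=" $1 }'
}

# Only try to boot when inside a PBS job and lamboot is on the PATH.
if [ -n "${PBS_NODEFILE:-}" ] && command -v lamboot >/dev/null 2>&1; then
    schema="$HOME/lam_bhost.$$"        # hypothetical file name
    nodefile_to_bhost "$PBS_NODEFILE" > "$schema"
    lamboot -v "$schema"               # start the LAM daemons
    lamnodes                           # list the nodes LAM now knows about
fi
```

When you have finished debugging, lamhalt shuts the LAM daemons down again before you leave the interactive job.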
Intel MPI
This is documented in the Intel Cluster
Toolkit documentation.
Others
MPICH supports debugging; see the User's
Guide.