When you use mpirun in stand-alone mode, you give it the names of the hosts the MPI job will use. For better resource utilization, you can have LSF manage host allocation and coordinate the start-up phase with mpirun. To do so, precede the regular HP MPI mpirun command with:
% bsub pam -mpi
Example: To run a single-host job and have the LSF Batch system select the host, the command:
% mpirun -np 14 a.out
is entered as:
% bsub pam -mpi mpirun -np 14 a.out
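The prefixing rule can be sketched as a small shell helper. The function name to_lsf is hypothetical and is not part of LSF or HP MPI; it only prints the command line that would be submitted and does not call bsub itself.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF command): print the LSF form of a
# stand-alone HP MPI command by prefixing it with "bsub pam -mpi".
# It only prints the resulting command line; nothing is submitted.
to_lsf() {
    echo "bsub pam -mpi $*"
}

to_lsf mpirun -np 14 a.out
```

Printing rather than executing keeps the sketch safe to try on a machine without LSF installed.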
Example: To run a multi-host job and have the LSF Batch system select the hosts, the command:
% mpirun -f appfile
is entered as:
% bsub pam -mpi mpirun -f appfile
where appfile contains the following entries:
-h foo -np 8 a.out
-h bar -np 4 b.out
-h foo -np 2 c.out
In this example, the hosts foo and bar are treated as symbolic names and refer to the actual hosts that the LSF Batch system allocates to the job. The a.out and c.out processes are guaranteed to run on the same host. The b.out processes may run on a different host, depending on the resources available and the LSF Batch system scheduling algorithms.
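To see how the appfile entries group by symbolic host name, the following sketch writes the example appfile to the current directory and totals the -np values per name. It assumes one entry per line and uses only standard sh and awk; it does not contact LSF.

```shell
#!/bin/sh
# Write the appfile from the example, one entry per line (assumed layout).
cat > appfile <<'EOF'
-h foo -np 8 a.out
-h bar -np 4 b.out
-h foo -np 2 c.out
EOF

# Sum the -np values for each symbolic host name given with -h.
summary=$(awk '{
    host = ""; np = 0
    for (i = 1; i < NF; i++) {
        if ($i == "-h")  host = $(i + 1)
        if ($i == "-np") np   = $(i + 1)
    }
    total[host] += np
}
END { for (h in total) print h, total[h] }' appfile | sort)

echo "$summary"
```

Entries that share a symbolic name (here a.out and c.out under foo) are counted together, which matches the statement that they run on the same allocated host.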
For a complete list of mpirun options and environment variable controls, refer to the mpirun man page and the HP MPI User's Guide version 1.4.
The -mpi argument on the bsub and pam command line replaces mpirun in the HP environment. Everything after -mpi must appear exactly as it would if mpirun were being used.
Example: To run the a.out job and have the LSF Batch system select the host, the command:
% mpirun -np 4 a.out
is entered as:
% bsub pam -mpi -np 4 a.out
Example: To run a multi-host job and have the LSF Batch system select the hosts, the command:
% mpirun -f appfile
is entered as:
% bsub pam -mpi -f appfile
where appfile contains the following entries:
foo -np 4 a.out
bar -np 4 b.out
foo -np 2 c.out
For a complete list of mpirun options and environment variable controls, refer to the mpirun man page.
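When -mpi takes the place of mpirun, the translation amounts to dropping the leading mpirun word and keeping the rest unchanged. A minimal sketch follows; the function name to_lsf_mpi is hypothetical, and the function only prints the command line instead of submitting it.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF command): build the LSF form of an
# mpirun command for the case where -mpi replaces mpirun.
to_lsf_mpi() {
    # Drop the leading "mpirun" word, if present; keep everything after it.
    if [ "$1" = "mpirun" ]; then
        shift
    fi
    echo "bsub pam -mpi $*"
}

to_lsf_mpi mpirun -f appfile
```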
When running LSF Batch jobs on Sun platforms, you can include the Sun-specific argument -sunhpc on the bsub command line, after any other bsub arguments. The following arguments to -sunhpc provide additional control over bsub behavior in a Sun HPC environment.
Specify the number of processes to run, using -n after -sunhpc. Note that the bsub -n argument, given before -sunhpc, specifies the number of CPUs to be used for the job.
Example: To start a 48-process interactive job on PAM-enabled queue hpc that will wrap over at least 4, and as many as 16, CPUs:
% bsub -I -n 4,16 -q hpc -sunhpc -n 48 jobname
If you set the minimum number of CPUs higher than 1 and fewer CPUs than that minimum are available, the job may fail to start. In this example, if fewer than 4 CPUs are available, the job will not start. You can avoid this problem by setting the minimum number of CPUs to 1, at a potential cost to performance, because the processes may then be wrapped over a smaller number of CPUs.
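The performance trade-off can be made concrete with a little arithmetic. The sketch below computes how many of the 48 processes would share each CPU for a few CPU counts; spreading the processes evenly is an assumption made for illustration, and the function name procs_per_cpu is hypothetical.

```shell
#!/bin/sh
# Processes per CPU if nprocs processes are spread evenly over ncpus CPUs
# (ceiling division). Even spreading is an assumption for illustration.
procs_per_cpu() {
    nprocs=$1
    ncpus=$2
    echo $(( (nprocs + ncpus - 1) / ncpus ))
}

# Compare the minimum of 1, the minimum of 4, and the maximum of 16.
for ncpus in 1 4 16; do
    echo "$ncpus CPUs: $(procs_per_cpu 48 "$ncpus") processes per CPU"
done
```

With a minimum of 1, the job can always start once a single CPU is free, but in the worst case all 48 processes share that one CPU.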
Specify, with -P, the PAM address of another job with which the new job should colocate. The PAM address is the TCP socket used for communications between the job and PAM.
Example: To start a 4-CPU interactive job on PAM-enabled queue hpc:
% bsub -I -n 4 -q hpc -sunhpc -P Athos:123 jobname
The new job is colocated with the job whose PAM is running on host Athos, using port 123.
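Judging from this example, a PAM address is written as a host name and a port separated by a colon. Under that assumption, a script can split the address with standard sh parameter expansion; the variable names below are illustrative only.

```shell
#!/bin/sh
# Split a PAM address of the assumed form host:port into its two parts.
pam_addr="Athos:123"
pam_host=${pam_addr%%:*}   # text before the first colon
pam_port=${pam_addr##*:}   # text after the last colon
echo "host=$pam_host port=$pam_port"
```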
Specify the job ID of another job with which the new job should colocate.
-J job_name
Specify the job name of another job with which the new job should colocate.
Specify, with -s, that the job is to be spawned in the STOPPED state.
To identify processes in the STOPPED state, issue the ps command with the -el argument:
orpheus 215 => ps -el
 F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
Here, the sched command is in the STOPPED state, as indicated by the T entry in the S (State) column.
Note that, when spawning a process in the STOPPED state under LSF, the name of your program will not appear in the ps output. Instead, the stopped process will be identified as a RES daemon.
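On a busy host the full ps listing can be long. A minimal sketch for narrowing it to stopped processes follows, assuming the S column is the second field of the ps -el output, as in the header shown above. The function name stopped_only is hypothetical.

```shell
#!/bin/sh
# Keep the header line plus every row whose S (state) column, assumed to
# be the second field of "ps -el" output, is T (stopped).
stopped_only() {
    awk 'NR == 1 || $2 == "T"'
}

ps -el | stopped_only
```

Because a job spawned in the STOPPED state under LSF is listed as a RES daemon, look for that entry in the CMD column of the filtered output and not for the program name.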
Example: To start a 1-CPU interactive job on PAM-enabled queue hpc, in the STOPPED state:
% bsub -I -n 1 -q hpc -sunhpc -s jobname