7. Run A Program
Once all of the tests shown above pass, you should be able to run
a program. From here on, the instructions are LAM-specific.
Go back to the head node, log in as wolf, and enter the following
commands:
cat > /mnt/wolf/lamhosts
wolf01
wolf02
wolf03
wolf04
<control d>
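For a larger cluster, typing the host names by hand gets tedious. As a
minimal sketch, assuming your nodes share a common prefix followed by a
two-digit number as they do here, a small shell function can print the
list. The name gen_hosts is my own invention and is not part of LAM.

```shell
# Print host names <prefix>NN for NN = first..last, one per line.
# $1 = prefix, $2 = first number, $3 = last number (plain decimal numbers)
gen_hosts() {
    i=$2
    while [ "$i" -le "$3" ]; do
        # %02d pads the number to two digits: 1 becomes 01
        printf '%s%02d\n' "$1" "$i"
        i=$((i + 1))
    done
}
```

Calling "gen_hosts wolf 1 4" prints wolf01 through wolf04; redirect the
output into the lamhosts file instead of typing the names in.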
Go to the LAM examples directory and compile "hello.c":
mpicc -o hello hello.c
cp hello /mnt/wolf
Then, as shown in the LAM documentation, start up LAM:
[wolf@wolf00 wolf]$ lamboot -v lamhosts
LAM 7.0/MPI 2 C++/ROMIO - Indiana University
n0<2572> ssi:boot:base:linear: booting n0 (wolf00)
n0<2572> ssi:boot:base:linear: booting n1 (wolf01)
n0<2572> ssi:boot:base:linear: booting n2 (wolf02)
n0<2572> ssi:boot:base:linear: booting n3 (wolf04)
n0<2572> ssi:boot:base:linear: finished
We are now finally ready to run an application. [Remember, I am using
LAM; your message passing interface may have different syntax.]
[wolf@wolf00 wolf]$ mpirun n0-3 /mnt/wolf/hello
Hello, world! I am 0 of 4
Hello, world! I am 3 of 4
Hello, world! I am 2 of 4
Hello, world! I am 1 of 4
[wolf@wolf00 wolf]$
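Note that the ranks in the output above do not come back in numerical
order. That is normal, since the processes run independently. If you
want to confirm that every process reported in, the following sketch
checks that each rank appears exactly once. It assumes the output format
of the hello example shown above; check_ranks is a hypothetical helper
name, not a LAM command.

```shell
# Read the hello output on stdin and succeed only if each rank
# 0..N-1 appears exactly once, in any order.
# $1 = expected number of processes (N)
check_ranks() {
    n=$1
    out=$(cat)
    i=0
    while [ "$i" -lt "$n" ]; do
        # grep -c prints the number of matching lines; "|| true" keeps a
        # count of zero from being treated as a failed command.
        count=$(printf '%s\n' "$out" | grep -c "^Hello, world! I am $i of $n\$" || true)
        if [ "$count" -ne 1 ]; then
            echo "rank $i appeared $count times" >&2
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

With the function defined, "mpirun n0-3 /mnt/wolf/hello | check_ranks 4"
should exit with status 0 when all four processes answered.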
Recall that I mentioned the use of NFS above. Here every node loads the
executable from the NFS shared directory, which becomes a bottleneck as
the number of boxes grows. You could instead copy the executable to each
box and give mpirun a node-local path: mpirun n0-3 /home/wolf/hello.
The prerequisite is that all the files the program needs are available
locally on every node. I have in fact done this, and it worked better
than running the NFS shared executable. Of course, this approach breaks
down if the cluster application needs to modify a file shared across the
cluster.
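
Copying the executable to each box can be scripted. What follows is only
a sketch, assuming scp is installed and the wolf user can reach every
node without a password prompt. The helper name gen_copy_cmds is
hypothetical. It prints the copy commands rather than running them, so
you can inspect them first.

```shell
# Print one scp command per host listed in a host file (dry run).
# $1 = host file, $2 = executable to copy, $3 = destination directory
gen_copy_cmds() {
    # Strip "#" comments; the first word on each remaining line is taken
    # as the host name and anything after it is ignored.
    sed 's/#.*//' "$1" | while read -r host _rest || [ -n "$host" ]; do
        if [ -n "$host" ]; then
            echo "scp $2 $host:$3/"
        fi
    done
}
```

Run "gen_copy_cmds lamhosts /mnt/wolf/hello /home/wolf" to see the
commands, and pipe the output to sh once they look right. Every node
that mpirun starts a process on needs its own copy, including the head
node if it takes part in the run.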