7. Programming systems for GNU/Linux
This section links to tutorials and documents on installing Linux on a
PC, getting started with Linux, and then going a step further: optimizing
your PC for processing power by using multiple processors (Symmetric
Multiprocessing, SMP), building a cheap, upgradeable Linux supercomputing
cluster, and finally software for parallel programming on Linux.
7.2. Parallel Processing and Symmetric Multiprocessing: Supercomputing
It is possible to do large-volume number crunching without
spending millions of rupees on a supercomputer. You only need
to link together (over some high-speed network) the requisite
number of CPUs, with GNU/Linux as the underlying OS. Add
some freely available message passing software and you have an
effective parallel number-crunching machine. Such
clusters are called "Beowulf clusters". Besides the low building
cost, another advantage of such a cluster is that upgrade
costs are minimal. The two best resources for Linux cluster
builders are
These sites are updated frequently with useful information
for cluster builders.
7.2.1. Parallel computing document links
You will also want to read this excellent article on Linux Clustering
Software (and the large variety of links it provides) by Joe Greenseid.
I hope to go through the links and include them subsequently in this
HOWTO.
Other free document links for parallel processing are:
7.2.2. Parallel processing software for Linux
Having read the documents above, you should have an idea of what parallel
processing involves. Parallel programming libraries are the core of parallel
processing on a Linux cluster, and several free implementations are
available. Since parallel processing is all about performance, these
libraries come with useful tools for analyzing the performance of your
parallel programs. Given below is a set of links to these parallel
programming libraries and tools.
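The analysis tools report far more detail than this, but as a rough sketch of the two most basic performance figures, speedup and parallel efficiency can be computed from wall-clock timings. The function names and the timings used here are hypothetical and chosen only for illustration; they do not come from any of the libraries listed.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-CPU run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Speedup divided by the number of processors used (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_procs


if __name__ == "__main__":
    # Hypothetical timings: 120 s on one CPU, 20 s on an 8-CPU cluster.
    print("speedup:   ", speedup(120.0, 20.0))
    print("efficiency:", efficiency(120.0, 20.0, 8))
```

An efficiency well below 1.0 usually means the nodes spend much of their time communicating or waiting rather than computing.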
Message Passing Interface:
MPI is a standard specification for message passing libraries. The above
document gives a lot of links to documents on the standard, among other
things. An MPI implementation for Linux, mpich, is also available at that
site. There are a lot of documents for Learning to use MPI.
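To give a feel for the message passing style, here is a minimal sketch of the common scatter, compute and reduce pattern. It does not use MPI: Python threads and queues stand in for cluster nodes and the network, and the names worker and parallel_sum_of_squares are hypothetical. A real MPI program would use the send, receive and reduce calls of an MPI implementation instead.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers, sum their squares, send the result back."""
    chunk = inbox.get()                    # blocking receive from the master
    partial = sum(x * x for x in chunk)
    outbox.put((rank, partial))            # send the partial result to the master


def parallel_sum_of_squares(numbers, n_workers=4):
    """Master: scatter the data, then gather and reduce the partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Scatter: worker `rank` gets every n_workers-th element, starting at `rank`.
    for rank in range(n_workers):
        inboxes[rank].put(numbers[rank::n_workers])

    # Gather and reduce: one message per worker, in whatever order they arrive.
    total = 0
    for _ in range(n_workers):
        _rank, partial = results.get()
        total += partial

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), n_workers=3))
```

Each worker only ever sees the data sent to it and shares nothing else with the others, which is what lets the same program structure run across separate machines on a network.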
Local Area Multicomputer - LAM:
LAM (Local Area Multicomputer) is an MPI programming environment and
development system for heterogeneous computers on a network.
With LAM, a dedicated cluster or an existing network computing
infrastructure can act as one parallel computer solving one problem.
LAM features extensive debugging support in the application development
cycle and peak performance for production applications. LAM features a
full implementation of the MPI communication standard.
You can download the sources (tar-zipped, rpm) or binaries from here.
A host of MPI tutorial links and a `getting started with LAM'
tutorial are also available here.
Parallel Virtual Machine:
As the PVM home page describes, it is a software package that permits
a heterogeneous collection of Unix and/or NT computers hooked together
by a network to be used as a single large parallel computer. Thus large
computational problems can be solved more cost-effectively by using the
aggregate power and memory of many computers. The software is very
portable. The source, which is available free through netlib, has been
compiled on everything from laptops to CRAYs.
Ganglia:
Ganglia is an open source cluster monitoring and execution environment
developed at the University of California, Berkeley Computer Science
Division. As the above link describes it, "Ganglia is as simple to
install and use on a 16-node cluster as it is to use on a 512-node
cluster as has been proven by its use on multiple 500+ node clusters".
It can link not only the nodes in a cluster, but also clusters to other
clusters.