4.4.1. Kernel Compilation
Always use pure vanilla kernel-sources from http://www.kernel.org/ to compile an
openMosix kernel! Please be kind enough to download the kernel from a
mirror near you, and try to download patches against the latest
kernel sources you already have instead of downloading the whole
thing. This is going to be much appreciated by the Linux community and
will greatly increase your geeky Karma ;-)
Be sure to use the right openMosix patch for your
kernel-version. At the time of writing, the latest 2.4 kernel is
2.4.20, so you should download the openMosix-2.4.20-x.gz patch, where
the "x" stands for the patch revision (i.e. the greater the revision
number, the more recent it is).
Do not use the kernel that comes with a Linux distribution: it won't
work. These kernel sources get heavily patched by the
distribution-makers, so applying the openMosix patch to such a kernel
is bound to fail! Been there, done that: trust me ;-)
Download the current version of the openMosix patch and move it into your
kernel-source directory (e.g. /usr/src/linux-2.4.20). If your
kernel sources live somewhere other than
"/usr/src/linux-[version_number]", at least create a symbolic
link from "/usr/src/linux-[version_number]" to them.
Supposing you're the root user and you've downloaded the gzipped patch
file in your home directory, apply the patch using (guess what?) the
patch utility:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
zcat openMosix-2.4.20-2.gz | patch -Np1 |
In the rare case you don't have "zcat" on your system, do:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
gunzip openMosix-2.4.20-2.gz
cat openMosix-2.4.20-2 | patch -Np1 |
In the even more unlikely case you don't have "cat" on your system (!),
do:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
gunzip openMosix-2.4.20-2.gz
patch -Np1 < openMosix-2.4.20-2 |
The "patch" command should now display a list of patched files from
the kernel-sources.
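If you want to check first whether a patch would apply cleanly, patch's "--dry-run" option is handy. The following self-contained sketch rehearses the zcat | patch pipeline on a throw-away tree (the file and patch names are made up for the demo; on the real kernel tree you would be sitting in /usr/src/linux-2.4.20 instead):

```shell
#!/bin/sh
# Rehearse the zcat | patch pipeline on a tiny throw-away tree.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Build a toy patch: "a" is the old tree, "b" the patched one.
mkdir -p a b
echo "old line" > a/file.c
echo "new line" > b/file.c
diff -Nur a b > demo.patch || true   # diff exits 1 when files differ
gzip demo.patch

# Apply it to a fresh copy of the old tree, dry-run first.
mkdir tree && echo "old line" > tree/file.c
cd tree
zcat ../demo.patch.gz | patch --dry-run -Np1   # would it apply cleanly?
zcat ../demo.patch.gz | patch -Np1             # apply for real
```

The -p1 strips the leading "a/" and "b/" path components, just as it does for the real openMosix patch.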
If you feel adventurous enough, enable the openMosix related options
in the kernel-configuration file, e.g.
...
CONFIG_MOSIX=y
# CONFIG_MOSIX_TOPOLOGY is not set
CONFIG_MOSIX_UDB=y
# CONFIG_MOSIX_DEBUG is not set
# CONFIG_MOSIX_CHEAT_MIGSELF is not set
CONFIG_MOSIX_WEEEEEEEEE=y
CONFIG_MOSIX_DIAG=y
CONFIG_MOSIX_SECUREPORTS=y
CONFIG_MOSIX_DISCLOSURE=3
CONFIG_QKERNEL_EXT=y
CONFIG_MOSIX_DFSA=y
CONFIG_MOSIX_FS=y
CONFIG_MOSIX_PIPE_EXCEPTIONS=y
CONFIG_QOS_JID=y
... |
However, it is much easier to configure the
above options using one of the Linux kernel configuration tools:
make config | menuconfig | xconfig |
The above means you have to choose one of "config", "menuconfig" and
"xconfig"; it's a matter of taste. By the way, "config" works
on any system; "menuconfig" needs the curses libraries installed,
while "xconfig" needs an installed X Window environment plus the
Tcl/Tk libraries and interpreters.
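Before compiling, it is worth checking that the essential openMosix options really ended up in .config. Here is a small sketch of such a check; it uses a stand-in .config written on the spot for illustration, whereas on a real system you would run it against /usr/src/linux-2.4.20/.config:

```shell
#!/bin/sh
# Sanity-check that the openMosix options are enabled in .config.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in .config for the demo; use the real one on your system.
cat > .config <<'EOF'
CONFIG_MOSIX=y
CONFIG_MOSIX_FS=y
# CONFIG_MOSIX_DFSA is not set
EOF

for opt in CONFIG_MOSIX CONFIG_MOSIX_FS; do
    if grep -q "^$opt=y" .config; then
        echo "$opt: enabled"
    else
        echo "$opt: MISSING - reconfigure before compiling" >&2
    fi
done
```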
Now compile it with:
make dep bzImage modules modules_install |
After compilation, install the new kernel with the openMosix options
in your boot-loader configuration; e.g. insert an entry for the new kernel in
/etc/lilo.conf and run lilo after that.
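As an illustration, a lilo.conf entry for the new kernel might look like the fragment below. The image path, label and root device are assumptions; adapt them to where you installed the kernel and where your root file-system lives, and don't forget to run lilo afterwards:

```
image = /boot/vmlinuz-2.4.20-openmosix
  label = openmosix
  root = /dev/hda1
  read-only
```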
Reboot and your openMosix-cluster-node is up!
4.4.2. Syntax of the /etc/openmosix.map file
Before starting openMosix, there has to be an /etc/openmosix.map
configuration file, which must be the same on each node.
The standard location is now /etc/openmosix.map; /etc/mosix.map and /etc/hpc.map
are older standards, but the CVS-version of the
tools is backwards compatible and looks for /etc/openmosix.map,
/etc/mosix.map and /etc/hpc.map (in that order).
The
openmosix.map file contains three space separated fields:
openMosix-Node_ID IP-Address(or hostname) Range-size |
An example openmosix.map file could look like this:
1 node1 1
2 node2 1
3 node3 1
4 node4 1 |
or
1 192.168.1.1 1
2 192.168.1.2 1
3 192.168.1.3 1
4 192.168.1.4 1 |
or, with the help of the range-size, both of the above examples are equivalent to:
1 192.168.1.1 4 |
openMosix "counts up" the last byte of the ip-address of the node
according to its openMosix-Node_ID. Of course, if you use a
range-size greater than 1 you have to use ip-addresses instead of
hostnames.
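To make the counting-up concrete, here is a hypothetical little helper (not part of the openMosix tools) that expands a map line with a range-size greater than 1 into the equivalent one-address-per-line form. It assumes IPv4 addresses and that only the last byte is counted up, as openMosix does:

```shell
#!/bin/sh
# Expand "Node_ID IP Range-size" into one map line per node.
expand_map_line() {
    id=$1; ip=$2; range=$3
    base=${ip%.*}     # e.g. 192.168.1
    last=${ip##*.}    # e.g. 1
    i=0
    while [ "$i" -lt "$range" ]; do
        echo "$((id + i)) $base.$((last + i)) 1"
        i=$((i + 1))
    done
}

# "1 192.168.1.1 4" covers nodes 1-4 on 192.168.1.1-192.168.1.4:
expand_map_line 1 192.168.1.1 4
```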
If a node has more than one network-interface it can be configured
with the ALIAS option in the range-size field (which is equivalent to setting
the range-size to 0), e.g.
1 192.168.1.1 1
2 192.168.1.2 1
3 192.168.1.3 1
4 192.168.1.4 1
4 192.168.10.10 ALIAS |
Here the node with the openMosix-Node_ID 4 has two network-interfaces
(192.168.1.4 + 192.168.10.10) which are both visible to openMosix.
Always be sure to run the same openMosix version AND configuration on each
of your Cluster's nodes!
Start openMosix with the "setpe" utility on each node:
setpe -w -f /etc/openmosix.map |
Execute this command (which will be described later on in this HOWTO)
on every node in your openMosix cluster.
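Since the map must be identical everywhere, a small loop can save some typing on larger clusters. The sketch below is purely a convenience wrapper, not part of openMosix itself; the node names and password-less root ssh access are assumptions about your setup. It supports a DRYRUN mode that only prints what would be done:

```shell
#!/bin/sh
# Hypothetical helper: push the shared map to each node and run setpe there.
NODES="node1 node2 node3 node4"

start_node() {
    if [ "${DRYRUN:-0}" = "1" ]; then
        # Rehearsal mode: print the commands instead of running them.
        echo "scp /etc/openmosix.map root@$1:/etc/openmosix.map"
        echo "ssh root@$1 setpe -w -f /etc/openmosix.map"
    else
        scp /etc/openmosix.map "root@$1:/etc/openmosix.map"
        ssh "root@$1" setpe -w -f /etc/openmosix.map
    fi
}

DRYRUN=1   # drop this line (or set DRYRUN=0) to really run the commands
for n in $NODES; do
    start_node "$n"
done
```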
Alternatively, you can grab the "openmosix" script which can be found
in the scripts directory of the userspace-tools, copy it to the
/etc/init.d directory, chmod 0755 it, then use the following commands
as root:
/etc/init.d/openmosix stop
/etc/init.d/openmosix start
/etc/init.d/openmosix restart |
Installation is finished now: the cluster is up and running :)
4.4.3. oMFS
First of all, the CONFIG_MOSIX_FS option in the kernel configuration
has to be enabled. If the current kernel was compiled without this
option, then recompilation with this option enabled is required.
Also, the UIDs (User IDs) and GIDs (Group IDs) on each of the
cluster nodes' file-systems must be the same. You might want to accomplish this
using openldap. The CONFIG_MOSIX_DFSA option in
the kernel is optional, but of course required if DFSA should be used.
To mount oMFS on the cluster there has to be an additional entry
in each node's /etc/fstab.
In order to have DFSA enabled:
mfs_mnt /mfs mfs dfsa=1 0 0 |
In order to have DFSA disabled:
mfs_mnt /mfs mfs dfsa=0 0 0 |
The syntax of this fstab-entry is:
[device_name] [mount_point] mfs defaults 0 0 |
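If "mount /mfs" fails, a first thing to check is whether the mfs entry actually made it into fstab. The sketch below does exactly that; it parses a stand-in fstab written for the demo, whereas on a real node you would point it at /etc/fstab:

```shell
#!/bin/sh
# Check that an mfs entry (third fstab field) is present.
set -e
tmp=$(mktemp -d)

# Stand-in fstab for the demo; use /etc/fstab on a real node.
cat > "$tmp/fstab" <<'EOF'
/dev/hda1  /     ext3  defaults  0 0
mfs_mnt    /mfs  mfs   dfsa=1    0 0
EOF

if awk '$3 == "mfs"' "$tmp/fstab" | grep -q .; then
    echo "mfs entry found"
else
    echo "no mfs entry - add one before mounting /mfs" >&2
fi
```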
After mounting the /mfs mount-point on each node, each node's
file-system is going to be accessible through the
/mfs/[openMosix-Node_ID]/ directories.
With the help of some symbolic links all cluster-nodes can access the same
data e.g. /work on node1
on node2 : ln -s /mfs/1/work /work
on node3 : ln -s /mfs/1/work /work
on node4 : ln -s /mfs/1/work /work
... |
Now every node can read+write from and to /work !
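The symlink scheme can be rehearsed locally with throw-away directories before touching the cluster. In the sketch below a temporary directory stands in for the real /mfs mount, so none of the paths are the real oMFS ones:

```shell
#!/bin/sh
# Rehearse the /work symlink scheme with a stand-in /mfs tree.
set -e
tmp=$(mktemp -d)

# node1's /work, as it would appear through oMFS on every node:
mkdir -p "$tmp/mfs/1/work"
echo "shared data" > "$tmp/mfs/1/work/data.txt"

# On node2, node3, ...: point the local /work at node1's copy.
ln -s "$tmp/mfs/1/work" "$tmp/work"

cat "$tmp/work/data.txt"
```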
The following special files are excluded from oMFS:
the /proc directory
special files which are not regular-files, directories or symbolic links (e.g. /dev/hda1)
Creating links like:
ln -s /mfs/1/mfs/1/usr |
or
ln -s /mfs/1/mfs/3/usr |
is invalid.
The following system calls are supported without sending the migrated
process (which executes the call on its remote node) back to its
home node:
read, readv, write, writev, readahead, lseek, llseek, open, creat,
close, dup, dup2, fcntl/fcntl64, getdents, getdents64, old_readdir,
fsync, fdatasync, chdir, fchdir, getcwd, stat, stat64, newstat, lstat,
lstat64, newlstat, fstat, fstat64, newfstat, access, truncate,
truncate64, ftruncate, ftruncate64, chmod, chown, chown16, lchown,
lchown16, fchmod, fchown, fchown16, utime, utimes, symlink, readlink,
mkdir, rmdir, link, unlink, rename
Here are situations in which system calls on DFSA-mounted file-systems may not
work:
- different mfs/dfsa configuration on the cluster-nodes
- dup2 if the second file-pointer is non-DFSA
- chdir/fchdir if the parent dir is non-DFSA
- pathnames that leave the DFSA-filesystem
- when the process which executes the system-call is being traced
- if there are pending requests for the process which executes the system-call
Next to the /mfs/1/, /mfs/2/ and so on directories you will find some other
directories as well.
Table 4-1. Other Directories
/mfs/here | The current node where your process runs |
/mfs/home | Your home node |
/mfs/magic | The current node when used by the "creat" system call (or
an "open" with the "O_CREAT" option) - otherwise, the last node on
which an oMFS magical file was successfully created (this is very
useful for creating temporary-files, then immediately unlinking
them)
|
/mfs/lastexec | The node on which the process last issued a successful
"execve" system-call.
|
/mfs/selected | The node you selected, by either your process itself or one
of its ancestors (before forking this process) writing a number
into "/proc/self/selected".
|
Note that these magic files are all ``per process''. That is, their
content depends upon which process opens them.
A last note about oMFS: there are versions around that return
faulty results when you run "df" on those file-systems.
Don't be surprised if you suddenly have about 1.3 TB available on those
systems.