4.4.1. Kernel Compilation
Always use pure vanilla kernel-sources from http://www.kernel.org/ to compile an
openMosix kernel! Please be kind enough to download the kernel from a
mirror near you, and try to download patches against the latest
kernel sources you already have instead of downloading the whole
thing. This is going to be much appreciated by the Linux community and
will greatly increase your geeky Karma ;-)
Be sure to use the right openMosix patch for your
kernel-version. At the time of writing, the latest 2.4 kernel is
2.4.20, so you should download the openMosix-2.4.20-x.gz patch, where
the "x" stands for the patch revision (i.e. the greater the revision
number, the more recent it is).
Do not use the kernel that comes with a Linux distribution: it won't
work. These kernel sources get heavily patched by the
distribution-makers, so applying the openMosix patch to such a kernel
is bound to fail! Been there, done that: trust me ;-)
Download the current version of the openMosix patch and move it into your
kernel-source directory (e.g. /usr/src/linux-2.4.20). If your
kernel sources live somewhere other than
"/usr/src/linux-[version_number]", at least create a symbolic
link from "/usr/src/linux-[version_number]" to them.
Supposing you're the root user and you've downloaded the gzipped patch
file in your home directory, apply the patch using (guess what?) the
patch utility:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
zcat openMosix-2.4.20-2.gz | patch -Np1 |
In the rare case you don't have "zcat" on your system, do:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
gunzip openMosix-2.4.20-2.gz
cat openMosix-2.4.20-2 | patch -Np1 |
In the even more unlikely case you don't have "cat" on your system (!),
do:
mv /root/openMosix-2.4.20-2.gz /usr/src/linux-2.4.20
cd /usr/src/linux-2.4.20
gunzip openMosix-2.4.20-2.gz
patch -Np1 < openMosix-2.4.20-2 |
The "patch" command should now display a list of patched files from
the kernel-sources.
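If you want to check first whether a patch would apply cleanly, patch's "--dry-run" option is handy. The following self-contained sketch rehearses the zcat | patch pipeline on a throw-away tree (the file and patch names are made up for the demo; on the real kernel tree you would be sitting in /usr/src/linux-2.4.20 instead):

```shell
#!/bin/sh
# Rehearse the zcat | patch pipeline on a tiny throw-away tree.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Build a toy patch: "a" is the old tree, "b" the patched one.
mkdir -p a b
echo "old line" > a/file.c
echo "new line" > b/file.c
diff -Nur a b > demo.patch || true   # diff exits 1 when files differ
gzip demo.patch

# Apply it to a fresh copy of the old tree, dry-run first.
mkdir tree && echo "old line" > tree/file.c
cd tree
zcat ../demo.patch.gz | patch --dry-run -Np1   # would it apply cleanly?
zcat ../demo.patch.gz | patch -Np1             # apply for real
```

The -p1 strips the leading "a/" and "b/" path components, just as it does for the real openMosix patch.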
If you feel adventurous enough, enable the openMosix related options
in the kernel-configuration file, e.g.
...
CONFIG_MOSIX=y
# CONFIG_MOSIX_TOPOLOGY is not set
CONFIG_MOSIX_UDB=y
# CONFIG_MOSIX_DEBUG is not set
# CONFIG_MOSIX_CHEAT_MIGSELF is not set
CONFIG_MOSIX_WEEEEEEEEE=y
CONFIG_MOSIX_DIAG=y
CONFIG_MOSIX_SECUREPORTS=y
CONFIG_MOSIX_DISCLOSURE=3
CONFIG_QKERNEL_EXT=y
CONFIG_MOSIX_DFSA=y
CONFIG_MOSIX_FS=y
CONFIG_MOSIX_PIPE_EXCEPTIONS=y
CONFIG_QOS_JID=y
... |
However, it is much easier to configure the
above options using one of the Linux kernel configuration tools:
make config | menuconfig | xconfig |
The above means you have to choose one of "config", "menuconfig" and
"xconfig"; it's a matter of taste. By the way, "config" works
on any system; "menuconfig" needs the curses libraries installed,
while "xconfig" needs an installed X Window environment plus the
Tcl/Tk libraries and interpreters.
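Before compiling, it is worth checking that the essential openMosix options really ended up in .config. Here is a small sketch of such a check; it uses a stand-in .config written on the spot for illustration, whereas on a real system you would run it against /usr/src/linux-2.4.20/.config:

```shell
#!/bin/sh
# Sanity-check that the openMosix options are enabled in .config.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in .config for the demo; use the real one on your system.
cat > .config <<'EOF'
CONFIG_MOSIX=y
CONFIG_MOSIX_FS=y
# CONFIG_MOSIX_DFSA is not set
EOF

for opt in CONFIG_MOSIX CONFIG_MOSIX_FS; do
    if grep -q "^$opt=y" .config; then
        echo "$opt: enabled"
    else
        echo "$opt: MISSING - reconfigure before compiling" >&2
    fi
done
```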
Now compile it with:
make dep bzImage modules modules_install |
After compilation, install the new kernel with the openMosix options
in your boot-loader configuration; e.g. insert an entry for the new kernel in
/etc/lilo.conf and run lilo after that.
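As an illustration, a lilo.conf entry for the new kernel might look like the fragment below. The image path, label and root device are assumptions; adapt them to where you installed the kernel and where your root file-system lives, and don't forget to run lilo afterwards:

```
image = /boot/vmlinuz-2.4.20-openmosix
  label = openmosix
  root = /dev/hda1
  read-only
```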
Reboot and your openMosix-cluster-node is up!
4.4.2. Syntax of the /etc/openmosix.map file
Before starting openMosix, there has to be an /etc/openmosix.map
configuration file, which must be the same on each node.
The standard location is now /etc/openmosix.map; /etc/mosix.map and /etc/hpc.map
are older standards, but the CVS-version of the
tools is backwards compatible and looks for /etc/openmosix.map,
/etc/mosix.map and /etc/hpc.map (in that order).
The
openmosix.map file contains three space separated fields:
openMosix-Node_ID IP-Address(or hostname) Range-size |
An example openmosix.map file could look like this:
1 node1 1
2 node2 1
3 node3 1
4 node4 1 |
or
1 192.168.1.1 1
2 192.168.1.2 1
3 192.168.1.3 1
4 192.168.1.4 1 |
or, with the help of the range-size, both of the above examples are equivalent to:
1 192.168.1.1 4 |
openMosix "counts up" the last byte of the ip-address of the node
according to its openMosix-Node_ID. Of course, if you use a
range-size greater than 1 you have to use ip-addresses instead of
hostnames.
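To make the counting-up concrete, here is a hypothetical little helper (not part of the openMosix tools) that expands a map line with a range-size greater than 1 into the equivalent one-address-per-line form. It assumes IPv4 addresses and that only the last byte is counted up, as openMosix does:

```shell
#!/bin/sh
# Expand "Node_ID IP Range-size" into one map line per node.
expand_map_line() {
    id=$1; ip=$2; range=$3
    base=${ip%.*}     # e.g. 192.168.1
    last=${ip##*.}    # e.g. 1
    i=0
    while [ "$i" -lt "$range" ]; do
        echo "$((id + i)) $base.$((last + i)) 1"
        i=$((i + 1))
    done
}

# "1 192.168.1.1 4" covers nodes 1-4 on 192.168.1.1-192.168.1.4:
expand_map_line 1 192.168.1.1 4
```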
If a node has more than one network-interface it can be configured
with the ALIAS option in the range-size field (which is equivalent to setting
the range-size to 0), e.g.
1 192.168.1.1 1
2 192.168.1.2 1
3 192.168.1.3 1
4 192.168.1.4 1
4 192.168.10.10 ALIAS |
Here the node with the openMosix-Node_ID 4 has two network-interfaces
(192.168.1.4 + 192.168.10.10) which are both visible to openMosix.
Always be sure to run the same openMosix version AND configuration on each
of your Cluster's nodes!
Start openMosix with the "setpe" utility on each node:
setpe -w -f /etc/openmosix.map |
Execute this command (which will be described later on in this HOWTO)
on every node in your openMosix cluster.
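Since the map must be identical everywhere, a small loop can save some typing on larger clusters. The sketch below is purely a convenience wrapper, not part of openMosix itself; the node names and password-less root ssh access are assumptions about your setup. It supports a DRYRUN mode that only prints what would be done:

```shell
#!/bin/sh
# Hypothetical helper: push the shared map to each node and run setpe there.
NODES="node1 node2 node3 node4"

start_node() {
    if [ "${DRYRUN:-0}" = "1" ]; then
        # Rehearsal mode: print the commands instead of running them.
        echo "scp /etc/openmosix.map root@$1:/etc/openmosix.map"
        echo "ssh root@$1 setpe -w -f /etc/openmosix.map"
    else
        scp /etc/openmosix.map "root@$1:/etc/openmosix.map"
        ssh "root@$1" setpe -w -f /etc/openmosix.map
    fi
}

DRYRUN=1   # drop this line (or set DRYRUN=0) to really run the commands
for n in $NODES; do
    start_node "$n"
done
```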
Alternatively, you can grab the "openmosix" script which can be found
in the scripts directory of the userspace-tools, copy it to the
/etc/init.d directory, chmod 0755 it, then use the following commands
as root:
/etc/init.d/openmosix stop
/etc/init.d/openmosix start
/etc/init.d/openmosix restart |
Installation is finished now: the cluster is up and running :)
4.4.3. oMFS
First of all, the CONFIG_MOSIX_FS option in the kernel configuration
has to be enabled. If the current kernel was compiled without this
option, then recompilation with this option enabled is required.
Also, the UIDs (User IDs) and GIDs (Group IDs) on each of the
cluster nodes' file-systems must be the same. You might want to accomplish this
using openldap. The CONFIG_MOSIX_DFSA option in
the kernel is optional, but of course required if DFSA should be used.
To mount oMFS on the cluster there has to be an additional entry
in each node's /etc/fstab.
In order to have DFSA enabled:
mfs_mnt /mfs mfs dfsa=1 0 0 |
In order to have DFSA disabled:
mfs_mnt /mfs mfs dfsa=0 0 0 |
The syntax of this fstab-entry is:
[device_name] [mount_point] mfs defaults 0 0 |
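If "mount /mfs" fails, a first thing to check is whether the mfs entry actually made it into fstab. The sketch below does exactly that; it parses a stand-in fstab written for the demo, whereas on a real node you would point it at /etc/fstab:

```shell
#!/bin/sh
# Check that an mfs entry (third fstab field) is present.
set -e
tmp=$(mktemp -d)

# Stand-in fstab for the demo; use /etc/fstab on a real node.
cat > "$tmp/fstab" <<'EOF'
/dev/hda1  /     ext3  defaults  0 0
mfs_mnt    /mfs  mfs   dfsa=1    0 0
EOF

if awk '$3 == "mfs"' "$tmp/fstab" | grep -q .; then
    echo "mfs entry found"
else
    echo "no mfs entry - add one before mounting /mfs" >&2
fi
```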
After mounting the /mfs mount-point on each node, each node's
file-system is going to be accessible through the
/mfs/[openMosix-Node_ID]/ directories.
With the help of some symbolic links all cluster-nodes can access the same
data e.g. /work on node1
on node2 : ln -s /mfs/1/work /work
on node3 : ln -s /mfs/1/work /work
on node4 : ln -s /mfs/1/work /work
... |
Now every node can read+write from and to /work !
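The symlink scheme can be rehearsed locally with throw-away directories before touching the cluster. In the sketch below a temporary directory stands in for the real /mfs mount, so none of the paths are the real oMFS ones:

```shell
#!/bin/sh
# Rehearse the /work symlink scheme with a stand-in /mfs tree.
set -e
tmp=$(mktemp -d)

# node1's /work, as it would appear through oMFS on every node:
mkdir -p "$tmp/mfs/1/work"
echo "shared data" > "$tmp/mfs/1/work/data.txt"

# On node2, node3, ...: point the local /work at node1's copy.
ln -s "$tmp/mfs/1/work" "$tmp/work"

cat "$tmp/work/data.txt"
```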
The following special files are excluded from oMFS:
the /proc directory
special files which are not regular-files, directories or symbolic links (e.g. /dev/hda1)
Creating links like:
ln -s /mfs/1/mfs/1/usr |
or
ln -s /mfs/1/mfs/3/usr |
is invalid.
The following system calls are supported without sending the migrated
process (which executes the call on its remote node) back to its
home node:
read, readv, write, writev, readahead, lseek, llseek, open, creat,
close, dup, dup2, fcntl/fcntl64, getdents, getdents64, old_readdir,
fsync, fdatasync, chdir, fchdir, getcwd, stat, stat64, newstat, lstat,
lstat64, newlstat, fstat, fstat64, newfstat, access, truncate,
truncate64, ftruncate, ftruncate64, chmod, chown, chown16, lchown,
lchown16, fchmod, fchown, fchown16, utime, utimes, symlink, readlink,
mkdir, rmdir, link, unlink, rename
Here are situations in which system calls on DFSA-mounted file-systems may not
work:
- different mfs/dfsa configuration on the cluster-nodes
- dup2 if the second file-pointer is non-DFSA
- chdir/fchdir if the parent dir is non-DFSA
- pathnames that leave the DFSA-filesystem
- when the process which executes the system-call is being traced
- if there are pending requests for the process which executes the system-call
Next to the /mfs/1/, /mfs/2/ and so on directories you will find some other
directories as well.
Table 4-1. Other Directories
/mfs/here | The current node where your process runs |
/mfs/home | Your home node |
/mfs/magic | The current node when used by the "creat" system call (or
an "open" with the "O_CREAT" option) - otherwise, the last node on
which an oMFS magical file was successfully created (this is very
useful for creating temporary-files, then immediately unlinking
them)
|
/mfs/lastexec | The node on which the process last issued a successful
"execve" system-call.
|
/mfs/selected | The node you selected, by either your process itself or one
of its ancestors (before forking this process) writing a number
into "/proc/self/selected".
|
Note that these magic files are all ``per process''. That is, their
content depends upon which process opens them.
A last note about oMFS: there are versions around that return
faulty results when you run "df" on those file-systems.
Don't be surprised if you suddenly have about 1.3 TB available on those
systems.