When you look at a hard disk under Unix, you see a tree of named
directories and files. Normally you won't need to look any deeper than
that, but it does become useful to know what's going on underneath if you
have a disk crash and need to try to salvage files. Unfortunately, there's
no good way to describe disk organization from the file level downwards, so
I'll have to describe it from the hardware up.
10.3. Mount points
In the simplest case, your entire Unix file system lives in just one
disk partition. While you'll see this arrangement on some small personal
Unix systems, it's unusual. More typical is for it to be spread across
several disk partitions, possibly on different physical disks. So, for
example, your system may have one small partition where the kernel lives, a
slightly larger one where OS utilities live, and a much bigger one where
user home directories live.
The only partition you'll have access to immediately after system
boot is your root partition,
which is (almost always) the one you booted from. It holds the root
directory of the file system, the top node from which everything else
hangs.
The other partitions in the system have to be attached to this root
in order for your entire, multiple-partition file system to be accessible.
About midway through the boot process, your Unix will make these non-root
partitions accessible. It will
mount
each one onto a directory on the root partition.
For example, if you have a Unix directory called
/usr, it is probably a mount point to a partition that
contains many programs installed with your Unix but not required during
initial boot.
10.5. File ownership, permissions and security
To keep programs from accidentally or
maliciously stepping on data they shouldn't, Unix has
permission
features. These were originally designed to support timesharing by
protecting multiple users on the same machine from each other, back in the
days when Unix ran mainly on expensive shared minicomputers.
In order to understand file permissions, you need to recall the
description of users and groups in the section
What happens when you log in?. Each file has an owning user and an
owning group. These are initially those of the file's creator; they can be
changed with the programs
chown(1) and
chgrp(1).
The basic permissions that can be associated with a file are
‘read’ (permission to read data from it), ‘write’
(permission to modify it) and ‘execute’ (permission to run it
as a program). Each file has three sets of permissions; one for its owning
user, one for any user in its owning group, and one for everyone else. The
‘privileges’ you get when you log in are just the ability to do
read, write, and execute on those files for which the permission bits match
your user ID or one of the groups you are in, or files that have been made
accessible to the world.
To see how these may interact and how Unix displays them, let's look
at some file listings on a hypothetical Unix system. Here's one:
snark:~$ ls -l notes
-rw-r--r-- 1 esr users 2993 Jun 17 11:00 notes
|
This is an ordinary data file. The listing tells us that it's owned
by the user ‘esr’ and was created with the owning group
‘users’. Probably the machine we're on puts every ordinary user in
this group by default; other groups you commonly see on timesharing
machines are ‘staff’, ‘admin’, or
‘wheel’ (for obvious reasons, groups are not very important on
single-user workstations or PCs). Your Unix may use a different default
group, perhaps one named after your user ID.
The string ‘-rw-r--r--’ represents the permission bits
for the file. The very first dash is the position for the directory bit;
it would show ‘d’ if the file were a directory, or would show
‘l’ if the file were a symbolic link. After that, the first
three places are user permissions, the second three group permissions, and
the third are permissions for others (often called ‘world’
permissions). On this file, the owning user ‘esr’ may read or
write the file, other people in the ‘users’ group may read it,
and everybody else in the world may read it. This is a pretty typical set
of permissions for an ordinary data file.
Now let's look at a file with very different permissions. This file
is GCC, the GNU C compiler.
snark:~$ ls -l /usr/bin/gcc
-rwxr-xr-x 3 root bin 64796 Mar 21 16:41 /usr/bin/gcc
|
This file belongs to a user called ‘root’ and a group
called ‘bin’; it can be written (modified) only by root, but
read or executed by anyone. This is a typical ownership and set of
permissions for a pre-installed system command. The ‘bin’
group exists on some Unixes to group together system commands (the name is
a historical relic, short for ‘binary’). Your Unix might use a
‘root’ group instead (not quite the same as the ‘root'
user!).
The ‘root’ user is the conventional name for numeric user
ID 0, a special, privileged account that can override all privileges. Root
access is useful but dangerous; a typing mistake while you're logged in as
root can clobber critical system files that the same command executed from
an ordinary user account could not touch.
Because the root account is so powerful, access to it should be guarded
very carefully. Your root password is the single most critical piece of
security information on your system, and it is what any crackers and
intruders who ever come after you will be trying to get.
About passwords: Don't write them down — and don't pick a
passwords that can easily be guessed, like the first name of your
girlfriend/boyfriend/spouse. This is an astonishingly common bad practice
that helps crackers no end. In general, don't pick any word in the
dictionary; there are programs called dictionary
crackers that look for likely passwords by running through word
lists of common choices. A good technique is to pick a combination
consisting of a word, a digit, and another word, such as
‘shark6cider’ or ‘jump3joy’; that will make the search
space too large for a dictionary cracker. Don't use these examples, though
— crackers might expect that after reading this document and put them
in their dictionaries.
Now let's look at a third case:
snark:~$ ls -ld ~
drwxr-xr-x 89 esr users 9216 Jun 27 11:29 /home2/esr
snark:~$
|
This file is a directory (note the ‘d’ in the first
permissions slot). We see that it can be written only by esr, but read and
executed by anybody else.
Read permission gives you the ability to list the directory — that
is, to see the names of files and directories it contains. Write permission
gives you the ability to create and delete files in the directory. If you
remember that the directory includes a list of the names of the files and
subdirectories it contains, these rules will make sense.
Execute permission on a directory means you can get through the
directory to open the files and directories below it. In effect, it gives
you permission to access the i-nodes in the directory. A directory with
execute completely turned off would be useless.
Occasionally you'll see a directory that is world-executable but not
world-readable; this means a random user can get to files and directories
beneath it, but only by knowing their exact names (the directory cannot be
listed).
It's important to remember that read, write, or execute permission on a
directory is independent of the permissions on the files and directories
beneath. In particular, write access on a directory means you can
create new files or delete existing files there, but does not
automatically give you write access to existing files.
Finally, let's look at the permissions of the login program
itself.
snark:~$ ls -l /bin/login
-rwsr-xr-x 1 root bin 20164 Apr 17 12:57 /bin/login
|
This has the permissions we'd expect for a system command —
except for that ‘s’ where the owner-execute bit ought to be.
This is the visible manifestation of a special permission called the
‘set-user-id’ or setuid
bit.
The setuid bit is normally attached to programs that need to give
ordinary users the privileges of root, but in a controlled way. When it is
set on an executable program, you get the privileges of the owner of that
program file while the program is running on your behalf, whether or not
they match your own.
Like the root account itself, setuid programs are useful but
dangerous. Anyone who can subvert or modify a setuid program owned by root
can use it to spawn a shell with root privileges. For this reason, opening
a file to write it automatically turns off its setuid bit on most Unixes.
Many attacks on Unix security try to exploit bugs in setuid programs in
order to subvert them. Security-conscious system administrators are
therefore extra-careful about these programs and reluctant to install new
ones.
There are a couple of important details we glossed over when
discussing permissions above; namely, how the owning group and permissions
are assigned when a file or directory is first created. The group is an
issue because users can be members of multiple groups, but one of them
(specified in the user's /etc/passwd entry) is the
user's default group and will normally own files created by the
user.
The story with initial permission bits is a little more complicated.
A program that creates a file will normally specify the permissions it is
to start with. But these will be modified by a variable in the user's
environment called the
umask.
The umask specifies which permission bits to turn off
when creating a file; the most common value, and the default on most
systems, is -------w- or 002, which turns off the world-write bit. See the
documentation of the umask command on your shell's manual page for
details.
Initial directory group is also a bit complicated. On some Unixes a new
directory gets the default group of the creating user (this in the System V
convention); on others, it gets the owning group of the parent directory
in which it's created (this is the BSD convention). On some modern Unixes,
including Linux, the latter behavior can be selected by setting the
set-group-ID on the directory (chmod g+s).
10.6. How things can go wrong
Earlier it was hinted that file systems can be fragile things.
Now we know that to get to a file you have to hopscotch through what may be
an arbitrarily long chain of directory and i-node references. Now suppose
your hard disk develops a bad spot?
If you're lucky, it will only trash some file data. If you're
unlucky, it could corrupt a directory structure or i-node number and leave
an entire subtree of your system hanging in limbo — or, worse, result
in a corrupted structure that points multiple ways at the same disk block
or i-node. Such corruption can be spread by normal file operations,
trashing data that was not in the original bad spot.
Fortunately, this kind of contingency has become quite uncommon as disk
hardware has become more reliable. Still, it means that your Unix will
want to integrity-check the file system periodically to make sure nothing
is amiss. Modern Unixes do a fast integrity check on each partition at
boot time, just before mounting it. Every few reboots they'll do a much
more thorough check that takes a few minutes longer.
If all of this sounds like Unix is terribly complex and
failure-prone, it may be reassuring to know that these boot-time checks
typically catch and correct normal problems before
they become really disastrous. Other operating systems don't have these
facilities, which speeds up booting a bit but can leave you much more
seriously screwed when attempting to recover by hand (and that's assuming
you have a copy of Norton Utilities or whatever in the first
place...).
One of the trends in current Unix designs is journalling
file systems. These arrange traffic to the disk so that
it's guaranteed to be in a consistent state that can be recovered when the
system comes back up. This will speed up the boot-time integrity check a
lot.