Next
Previous
Contents
Software RAID devices are so-called "block" devices, like ordinary
disks or disk partitions. A RAID device is "built" from a number of
other block devices - for example, a RAID-1 could be built from two
ordinary disks, or from two disk partitions (on separate disks -
please see the description of RAID-1 for details on this).
There are no other special requirements to the devices from which you
build your RAID devices - this gives you a lot of freedom in designing
your RAID solution. For example, you can build a RAID from a mix of
IDE and SCSI devices, and you can even build a RAID from other RAID
devices (this is useful for RAID-0+1, where you simply construct two
RAID-1 devices from ordinary disks, and finally construct a RAID-0
device from those two RAID-1 devices).
Therefore, in the following text, we will use the word "device" as
meaning "disk", "partition", or even "RAID device". A "device" in the
following text simply refers to a "Linux block device". It could be
anything from a SCSI disk to a network block device. We will commonly
refer to these "devices" simply as "disks", because that is what they
will be in the common case.
However, there are several roles that devices can play in your
arrays. A device could be a "spare disk", it could have failed and
thus be a "faulty disk", or it could be a normally working and fully
functional device actively used by the array.
In the following we describe two special types of devices; namely the
"spare disks" and the "faulty disks".
Spare disks are disks that do not take part in the RAID set until one
of the active disks fail. When a device failure is detected, that
device is marked as "bad" and reconstruction is immediately started
on the first spare-disk available.
Thus, spare disks add a nice extra safety to especially RAID-5 systems
that perhaps are hard to get to (physically). One can allow the system
to run for some time, with a faulty device, since all redundancy is
preserved by means of the spare disk.
You cannot be sure that your system will keep running after a disk
crash though. The RAID layer should handle device failures just fine,
but SCSI drivers could be broken on error handling, or the IDE chipset
could lock up, or a lot of other things could happen.
Also, once reconstruction to a hot-spare begins, the RAID layer will
start reading from all the other disks to re-create the redundant
information. If multiple disks have built up bad blocks over time, the
reconstruction itself can actually trigger a failure on one of the
"good" disks. This will lead to a complete RAID failure. If you do
frequent backups of the entire filesystem on the RAID array, then it
is highly unlikely that you would ever get in this situation - this is
another very good reason for taking frequent backups. Remember, RAID
is not a substitute for backups.
When the RAID layer handles device failures just fine, crashed disks
are marked as faulty, and reconstruction is immediately started
on the first spare-disk available.
Faulty disks still appear and behave as members of the array. The RAID
layer just treats crashed devices as inactive parts of the filesystem.
Next
Previous
Contents
|