Raid (re)build progress

This is about the Linux md software Raid system. If you have a hardware Raid controller, ask the vendor how to find out what it's doing.

Software Raid

I've been using Linux software Raid for many many years - I think it's a splendid feature to have as default in every kernel, and in fact there are few machines which I use without some sort of Raid involved.

Hard disks die. So do SSDs. So, for that matter, do CPUs and PSUs, but those don't contain any data which, if you lost it, could make you cry, lose you your job, or destroy your business (and possibly all three).

So, don't ever have a single copy of anything you care about on a single disk (or other storage device).

My approach is to use Raid (level 5, mostly) on the machine I store my data on, and then I have two other machines, also using Raid, and in different places, which have backups of that data. Some people might consider that over the top, but it's my data, and I know how difficult my life would be if I lost it. Your data may vary.

I've been using HP Microservers (N36L and N54L) for several years, and these have four hot-swappable (although HP doesn't like admitting to that) 3.5" SATA slots in them, so in these I install four disks using Raid 5, get the capacity of three, and any one of them can fail without it being an immediate problem.

More recently, I've started using some Fantec 8-drive external USB3 cabinets, and with this number of drives I tend to be more cautious / paranoid, and use Raid 6 (which means that up to two disks can fail without losing any data). I also add a hot spare (which means that a disk failure is automatically reacted to, and the failed disk replaced with the spare, reducing the time in which the array is not up to full redundancy). This means that out of 8 disks in one cabinet, I get the capacity of 5, but I regard the cost of the extra 3 disks as being significantly less than the value of the data I'm storing.
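
For reference, creating an array along those lines with mdadm looks something like this - the device and array names here are only examples, so substitute your own:

# Raid 6 across 7 members plus 1 hot spare - usable capacity of 5 disks out of 8
mdadm --create /dev/md0 --level=6 --raid-devices=7 --spare-devices=1 /dev/sd[b-i]1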

The main thing I do not like about the Fantec cabinets is that they do not turn themselves back on when power is restored after an outage. Connecting one to a server and powering the whole lot from a UPS is still better than not using a UPS, but if the mains fails for long enough that the UPS tells the server to shut down, and then cuts its power, the Fantec cabinet will not start up again when the server does. This is bad. If I can work out how to get the control panel PCB out of the housing, I intend to see whether an electrolytic capacitor across the power switch terminals can work around this design defect.

Backups

I commented above that I have machines (running Raid) with my data on them, and other machines with backups of that data.

Note that Raid is not a backup. The two are quite different, serve different purposes, and are not interchangeable.

Raid will protect you from physical disk failures, but it will probably not protect you against file system corruption if your computer suddenly loses power.

Similarly, a Raid array will not protect your data from your own stupidity. When you delete or over-write a valuable file, that's it - the file system did what you told it, and running with Raid underneath makes no difference - your data is gone.

Backups are a very good idea, but taking daily (or more likely nightly) backups of your data helps not in the slightest if the (non-Raid) disk you put the data on goes bad between when you wrote the data and when the backup was going to run.

Note that backups will also not protect you from your own stupidity - it depends on when you are stupid, and when you realise it. Assuming you run nightly backups: if you create or update an important file one day, and then delete or over-write it on some later day, you can recover that file from backup provided you realise what you've done before the next backup runs. If you wait until the day after your mistake, the backup copy will have been destroyed as well, and you've still lost your data.

For this reason (I like to think I'm not as stupid as the average person, but I still know that I am stupid on occasion), I use backups and archives.

Archives are snapshots of your data at some point in time, and never get over-written, so you can always go back to the way some file was in the past, and recover it.

Of course, this costs extra disk space, but not as much as you might think.

The Linux EXT file systems (and others, but those are the ones I use) have the concept of "hard links", which are pointers from a directory entry to a file on disk. Each extra link takes up only the space of the directory entry (basically the filename plus the inode number it points to - just a few bytes), with no extra space needed for multiple copies of the file. You can have as many hard links to one file as you like, but there is only one copy of the file taking up space on the disk.

The rsync command quite happily understands hard links, and will create multiple copies of your data which together only take up the combined space of the unique files (so, ten days' backups of 100 files, where you changed 5 of those files during the ten days, would in total take up the space of 105 files, not 1000, for example).
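
As a sketch of how that works in practice, rsync's --link-dest option creates today's archive as hard links against yesterday's, copying only the files which have changed (the paths and dates here are purely illustrative):

rsync -a --link-dest=/archive/2022-01-28 /data/ /archive/2022-01-29/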

Summary: Raid is a good idea, backups are another good idea. They do different things, so use both. Archives are better than backups, so use those too if you can.

One day you, too, will be stupid, and will be glad that you weren't stupid when you planned your data recovery process.

(Re)build time

With modern multi-terabyte drives, building a Raid array, or replacing a disk in a working array, can take quite some time (from hours to days, depending on the sizes of your disks, the speed of the interface they're connected to, and the Raid level you're building).
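
As a rough rule of thumb, the minimum time is the size of one member disk divided by the sustained rebuild speed: for example, a 4 TB member resyncing at 150 MB/sec needs roughly 4,000,000 MB / 150 MB/sec ≈ 26,700 seconds, or about 7.5 hours, and real-world rebuild speeds are often a good deal lower than that.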

The pseudo-file /proc/mdstat tells you about the status of your Raid arrays, including how fast a (re)build is happening, and how long it's currently predicted to take to complete.

Here's an example:

md5 : active raid5 sda2[1] sde2[0] sdd2[5] sdb2[3] sdc2[2]
      1835409408 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      [=======>.............]  resync = 39.4% (723151308/1835409412) finish=104.3min speed=17900K/sec
      bitmap: 2/4 pages [8KB], 65536KB chunk

This tells me that I have a Raid 5 array named /dev/md5 comprising 5 disk partitions (/dev/sda2 to /dev/sde2), and it is currently 39.4% through a "resyncing" process, which is running at 17.9 Mbytes/sec and is therefore predicted to finish in 104 minutes' time.
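
The same information is also available per array from mdadm itself, for example:

mdadm --detail /dev/md5

which, while a (re)build is running, includes a percentage-complete status line.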

However, I quite like knowing when the process will finish, without having to do the mental arithmetic involved in working out "when is 104 minutes from now?" That may not be too hard an example, but I built another rather larger array recently, and that is reporting 4616 minutes to go at the 38% stage. Adding 4616 minutes to "now" is not trivial :)
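
GNU date can do that arithmetic for me - for example, using the figure above:

date --date='+4616 minutes'

prints the date and time 4616 minutes from now.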

So, I use the following bash script (which can, of course, be run on one line, with the addition of a few semi-colons) to get a continuous update on progress and completion time:

while grep -q finish /proc/mdstat
do
  echo $(date; grep finish /proc/mdstat; date -d +$(grep finish /proc/mdstat | sed 's/.*finish=\([0-9]*\).*/\1/')minutes)
  sleep 5000
done

That gives me an output like:

Sat 29 Jan 14:36:40 CET 2022 [=======>.............] resync = 38.4% (3000743688/7813894144) finish=4615.9min speed=17378K/sec Tue 1 Feb 19:31:40 CET 2022

every 5000 seconds (just under an hour and a half). I adjust the number 5000 for different arrays depending on how long the whole process is going to take - about 1% progress per new line of output is about right for me.

No need to wait

One of the impressive features of Raid arrays is that you do not need to wait for them to be fully synced before you can start using them.

You can create a brand-new Raid 5 (or 6) array from fresh disks, and then, whilst it is still doing its first synchronisation:

  • create a file system on it and mount it
  • prepare it as an LVM Physical Volume, create a Volume Group on it, and then create Logical Volumes (which can in turn then be formatted and mounted)
  • export it as an iSCSI target and make it available to a remote computer as a networked block device
  • etc.
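
For instance, putting a file system (or LVM) on a brand-new array straight away looks no different from doing it on any other block device - the array name, mount point and LVM names below are only illustrative:

mkfs.ext4 /dev/md5
mkdir -p /mnt/data && mount /dev/md5 /mnt/data

# or the LVM route:
pvcreate /dev/md5
vgcreate vg_data /dev/md5
lvcreate -L 500G -n lv_backup vg_data
mkfs.ext4 /dev/vg_data/lv_backup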

The only downside is that if you start writing data to the array while it is still synchronising, both operations will be slower than if you had waited for the sync to complete and then written the data. However, this may be entirely acceptable if you simply want to get on with putting data on the disks and are not too bothered about how soon the Raid sync completes.

Interruptible

Another impressive feature of Linux software Raid is that a (re)build can be interrupted and resumed later, without having to start again from the beginning.

You can stop the process, disable the array, or reboot the machine, and when you put the array back together again, the (re)build will simply carry on from where it left off.
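
As a sketch, using the example array from above (substitute your own array and member names):

mdadm --stop /dev/md5
# ... later, perhaps after a reboot ...
mdadm --assemble /dev/md5 /dev/sd[a-e]2

When the array is assembled again, the resync picks up from its last recorded position rather than starting from zero.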

