Thursday, March 8, 2012

Raid 5, mdadm, Ubuntu: lessons learnt

This article documents my experiences setting up a raid 5 array using Ubuntu and mdadm. It is mainly for my own reference. If you decide to follow any of the steps herein, the risk is all yours etc etc bla bla bla.


Tearing down an array 

While setting up an array and performing experiments you will, no doubt, need to tear one down at some stage.

"Superblock" information on a drive flags to the OS / mdadm that the disk is part of a raid array. To tear down an array you need to stop and remove the array, then delete the superblock information from all raid member drives.

Execute the following:

mdadm --stop /dev/md0  # to halt the array
mdadm --remove /dev/md0  # to remove the array
mdadm --zero-superblock /dev/sd[abc]1  # delete the superblock from all drives in the array
(edit /etc/mdadm/mdadm.conf to delete any rows related to the deleted array)

Note: I have found attempting to tear down an array while the OS is running problematic. The easiest way to do it is from the live CD.
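
If you want to double-check that the superblocks really are gone, mdadm can examine each member partition and should report that no md superblock was found (the device names below are just the ones from my setup):

sudo mdadm --examine /dev/sd[abc]1  # should now complain that there is no md superblock on each drive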


Preparing  Advanced Format Drives (Western Digital Green 2TB) for a raid array 


Advanced format drives have a different sector size to previous generation disks. As a result, the primary partition on these disks needs to start at a sector that is divisible by 8. I initially configured my array to start at sector 64 but I got a misalignment error: go figure. After several attempts, I successfully configured my drives to start at sector 2048.

Here are the commands I used to partition my drives and prepare them for my raid array:

$ sudo parted -a optimal /dev/[disk]
(parted) mklabel gpt #creates a GUID partition table
(parted) unit s #makes the units sectors
(parted) mkpart primary ext4 2048 -1 #makes a new partition starting on sector 2048 and ending as close to the end of the hard drive as possible. If it tells you it can't use the last sector and offers another value, just accept that.
(parted) quit
#Note that you may type the letter p at any time to print the current details of the drive.
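
If you are worried about alignment, parted can also check it for you. Assuming the disk is /dev/sdb and the partition number is 1, something like:

sudo parted /dev/sdb align-check optimal 1  # should report that partition 1 is aligned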


Create a raid 5 array using mdadm

this is the easiest part:

sudo mdadm --create /dev/md0 --level=5 --raid-devices=5 --chunk=64 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

A few things to note here:

/dev/md0 is the device path to your raid array. You should be able to find this info in Disk Utility if your array has been set up properly.

/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 are the member drives that make up your array. Again, you can find these drives in the Disk Utility program.

Chunk is the size of the stripes on your array. As I understand it, if your array is going to store large files then you probably want a largish chunk size; for small files you want a small chunk size. For video, for example, 512 KiB or above is recommended.

The other options should be self evident.

Depending on the number of disks and the size of the array, the build will take quite a while. I have five 2TB drives and it took around 6 hours.
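
You can keep an eye on the build progress (and the estimated time remaining) with:

watch cat /proc/mdstat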


Format your array

After creating the array it needs to be formatted. This took about 30 minutes for me.

To do this I went into Disk Utility, clicked on my raid array and then clicked on Format Volume.

On my first attempt I formatted my array using ext4. After mounting the volume, however, I found that jbd2, an ext4 journalling process, was constantly writing to my drives. jbd2 is only supposed to write to disk every few seconds (I think), but of course with 6 disks (including the OS) the amount of IO is ridiculous. So I decided to go with ext3 instead and hey presto, problem solved.

There may be a way around this; you may be able to configure ext4 to play nice, but hey, who has the time.
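
For what it is worth, ext4's journal commit interval can be tuned with the commit mount option, which might have been enough to quieten jbd2 down. Something like the line below (untested by me, and 60 seconds is just an example value):

sudo mount -o commit=60 /dev/md0 /mnt/yourstoragefolder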

At this stage you can mount your drive and start using it. However, in order for the array to be mounted and usable after booting, you need to add an entry to fstab and to mdadm.conf.


Edit fstab

Open fstab as root. From the terminal type:


sudo gedit /etc/fstab

Add the following line to the bottom of the file.

UUID=yourraiddeviceuuid /mnt/yourstoragefolder ext3    defaults        0       2

Note: you will need to replace yourraiddeviceuuid with your array's UUID, which you can find with the blkid command. You also need to create a folder under /mnt/ with appropriate permissions; /mnt/yourstoragefolder will need to point to this directory. There are plenty of guides on mounting devices so I won't go into details here.
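
For example, to get the UUID (assuming your array is /dev/md0):

sudo blkid /dev/md0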

Save and close the text editor.
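
You can then test the new entry without rebooting (the mount point has to exist first):

sudo mkdir /mnt/yourstoragefolder
sudo mount -a  # mounts everything in fstab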


Edit mdadm.conf

In the terminal, type:

sudo mdadm --detail --scan

This will print information about your array. Copy the text that is produced. In my case this was:

ARRAY /dev/md0 metadata=1.2 name=NAS:0 UUID=150099eb:37d941df:fa50f43b:c8d6903b

Open mdadm.conf. Type:

sudo gedit /etc/mdadm/mdadm.conf


Paste the text copied above under the heading "DEVICE partitions".

Save and close the text editor.
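
If you would rather not copy and paste by hand, the same thing can be done in one line. Note that tee -a appends, so check the file for duplicate ARRAY lines afterwards:

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf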

That's it. You should now have a raid array that is configured to assemble and mount at boot.

Testing the array

Initially I tested the array by pulling the data cable from one of my hard drives and attempting to boot. I was unable to boot as a result. Every time I attempted to boot I ended up at a console (busybox). It appears that the OS does not allow you to boot if you have a degraded array, even if that array is only used for storage and does not have a bootable partition. Go figure.

I eventually got around this problem by commenting out GRUB_CMDLINE_LINUX="" in /etc/default/grub and adding the line below:


#GRUB_CMDLINE_LINUX=""
GRUB_CMDLINE_LINUX="bootdegraded=true"


After editing the file, update grub with:


sudo update-grub


OK, so, on to testing the array. First you need to fail a drive in the array. There are two ways to do this:

1) Pull the data cable on a hard drive. If you have modified grub as above, the system should still boot with the array degraded. (Note: I did not test this successfully; after my initial problems I ended up resetting my entire array.)

2) set the drive to failed using mdadm:


mdadm --manage /dev/md0 --fail /dev/yourdriveid


After you fail the drive you need to remove it with:


mdadm --manage /dev/md0 --remove /dev/yourdriveid


Note: I did not need to remove the drive. I restarted my PC straight after failing it, and after restarting, the drive had already been removed.

Finally, you need to add the new drive. (For my test I simply added the same drive back to the array.)


mdadm --manage /dev/md0 --add /dev/yourdriveid
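
You can check the state of the array (degraded, rebuilding, clean) at any point during this with:

sudo mdadm --detail /dev/md0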



7 months later

Just found a minor issue: the device name of my array, md0, has mysteriously changed to md127. This does not appear to have had any impact on the working of the array but is still not ideal. After googling for many hours I could not find any reliable information about why this might have occurred. It is particularly mysterious when you consider that I have auto update turned off on my NAS.

Anywho, one solution which worked for me was to remove the name parameter from this line

ARRAY /dev/md0 metadata=1.2 name=NAS:0 UUID=150099eb:37d941df:fa50f43b:c8d6903b

in mdadm.conf. Additionally, I had to run

sudo update-initramfs -u
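
For reference, with the name parameter removed the ARRAY line ended up looking like this:

ARRAY /dev/md0 metadata=1.2 UUID=150099eb:37d941df:fa50f43b:c8d6903b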

Another wonderful Ubuntu mystery :-\

More than a year later

I am converting my raid 5 to a raid 10. Below are the steps I followed.

1) I copied all content to another drive so I could blow away my raid 5 array.

2) Tear down the array: see "Tearing down an array" above.

3) Repartition the drives. See "Preparing Advanced Format Drives (Western Digital Green 2TB) for a raid array" above. In addition to following all those steps, I also added the following at the end:

(parted) set 1 raid on

Note: I later unset this flag. As far as I can see it makes no difference to the operation of the array.

4) Create the raid 10 array with:

sudo mdadm -v --create /dev/md0 --level=raid10 --chunk=64 --raid-devices=6 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

Note: I probably should have given the array a bigger chunk size.
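
If you ever want to check what chunk size an existing array is using, mdadm will tell you:

sudo mdadm --detail /dev/md0 | grep -i chunk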

5) Monitor the creation with:

watch cat /proc/mdstat

6) I added

ARRAY /dev/md0 UUID=d94c27f0:d4e3ab95:e5dd0caa:ea902d29

to /etc/mdadm/mdadm.conf

7) I then updated initramfs with:

 sudo update-initramfs -u

8) I added

/dev/md0p1 /mnt/storage ext3    defaults        0       2

to /etc/fstab
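
And, as before, a quick way to test the new fstab entry without rebooting (the mount point needs to exist first):

sudo mkdir -p /mnt/storage
sudo mount -a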