Last modified: 2023-12-21 15:13
The installation processes of most Linux distributions now provide a way to set up a RAID array using the Linux kernel's multiple devices driver (md). After installation, during the boot process, the array gets assembled by the init script in the initrd image (the "initial ramdisk") before the root file system is mounted.
It is possible to put the root file system on md RAID without an initial ramdisk using a kernel feature known as RAID autodetect. When RAID autodetect is enabled, the kernel proactively and automatically assembles partitions of type fd (Linux raid autodetect) into a RAID array without being told to do so. The root file system is then already available at init time. Very convenient!
Unfortunately, in recent kernels, RAID autodetect has been deprecated and undermined to the point that it is no longer usable with RAID-0. It still works with linear RAID, also known as JBOD = Just a Bunch of Disks. This article shows how to use it with linear RAID and then explores how RAID-0 got broken.
RAID autodetect requires the relevant options to be built statically into the kernel, not as separate loadable modules. As of kernel 6.6.7, the relevant options appear under Device Drivers as follows:
"RAID 1/4/5/6/10 target" (CONFIG_DM_RAID) under "Device mapper support" (CONFIG_BLK_DEV_DM) is not necessary for RAID autodetect to work. However, you will need to build in everything needed to access your HDDs/SSDs (Serial ATA and Parallel ATA drivers, AHCI SATA support, ...).
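A quick way to confirm that the md options are built in rather than modular is to grep the kernel's .config. This is only a sketch: the function name check_md_builtin is made up for illustration, and the three symbols it checks (CONFIG_BLK_DEV_MD, CONFIG_MD_AUTODETECT, CONFIG_MD_LINEAR) are my assumption of the relevant mainline Kconfig names for a linear array, not a list taken from this article. Adjust them to your RAID level.

```shell
#!/bin/sh
# Report whether each md option is built in (=y) in a kernel .config file.
# The option list below is an assumption; edit it to match your setup.
check_md_builtin() {
    config_file=$1
    status=0
    for opt in CONFIG_BLK_DEV_MD CONFIG_MD_AUTODETECT CONFIG_MD_LINEAR; do
        # "=m" (module) or a missing line both count as not built in.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt: built in"
        else
            echo "$opt: NOT built in"
            status=1
        fi
    done
    return $status
}
```

It would be called as check_md_builtin /usr/src/linux/.config (the path is an example) and returns non-zero if any of the options is missing or modular.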
GRUB (Grub2) is capable of pulling a kernel image out of an md RAID array except in the case of linear RAID, where it says "error: unsupported RAID level: -1." LILO is capable of pulling a kernel image out of an md RAID array only if it is RAID-1 or if you do something sketchy. But if you create a small non-RAID partition to hold the kernel and bootloader stuff that normally goes in /boot, either bootloader will work reliably.
Due to a special feature of the old superblock format that RAID autodetect requires, it is necessary to exclude the last 64 kB (128 sectors) of a device from every partition being RAIDed. The problem is that if the signature appears at the very end of the disk, it's ambiguous whether it is meant for the partition or the whole device.
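The arithmetic for the last partition on a disk can be sketched as follows, assuming 512-byte sectors (which is what "64 kB = 128 sectors" implies) and sectors numbered from 0. The helper name last_partition_sector is hypothetical. Partitioning tools may round the end down further for alignment, which is harmless.

```shell
#!/bin/sh
# Print the highest sector a partition may end on so that the final
# 64 kB (128 sectors of 512 bytes) of the disk stay outside every partition.
# $1 = total disk size in 512-byte sectors,
#      e.g. the output of: blockdev --getsz /dev/sda
last_partition_sector() {
    total_sectors=$1
    reserved_sectors=128
    # Sectors run from 0 to total-1; the reserved tail starts at total-128.
    echo $(( total_sectors - reserved_sectors - 1 ))
}
```

For a disk of 1000000 sectors, last_partition_sector 1000000 prints 999871, which is the end sector to give the partitioning tool.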
Set the types of the partitions to be RAIDed to fd (Linux raid autodetect). The separate boot partition is plain old type 83 (Linux).
Use mdadm to create a non-partitioned array with the old 0.90 superblock format:
mdadm -C /dev/md0 -l linear -e 0.90 -n 2 /dev/sda2 /dev/sdb1
If mdadm prints errors about disks already in use, you can get rid of an existing autodetected array with mdadm -S /dev/md0 and then proceed as above.
After creating the array, you can test autodetection by booting an appropriate kernel and verifying that the newly created array is automatically assembled:
[ 2.707106] md: Waiting for all devices to be available before autodetect
[ 2.709392] md: If you don't use raid, use raid=noautodetect
[ 2.711606] md: Autodetecting RAID arrays.
[ 2.715337] md: autorun ...
[ 2.718248] md: running: <sdb1><sda2>
[ 2.721132] md0: detected capacity change from 0 to 78596864
[ 2.723410] md: ... autorun DONE.
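Besides reading the boot log, the result can be checked at run time through /proc/mdstat, where an assembled array shows up as a line beginning with its name followed by " : active". A minimal sketch of such a check, with a hypothetical helper name md_is_active and an optional second argument so it can be pointed at a saved copy of the file:

```shell
#!/bin/sh
# Succeed if the named md array is listed as active.
# $1 = array name without /dev/, e.g. md0
# $2 = file to read (defaults to /proc/mdstat)
md_is_active() {
    md_name=$1
    mdstat_file=${2:-/proc/mdstat}
    grep -q "^${md_name} : active" "$mdstat_file"
}
```

For example, md_is_active md0 && echo "md0 is up" prints the message only if the array was assembled.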
For a kernel running as a guest in QEMU, I have found that the autodetection code hangs when the disks are attached with -drive if=virtio but not with if=ide. This won't be fixed since autodetection is deprecated.
Ensure that the array has been assembled. Specify /dev/md0 (or whatever md number it decided to use) as the target for the root file system and format it. Format the small boot partition, label it boot, and mount it on /boot. Proceed to install as usual.
The bootloader needs to load the kernel from the non-RAIDed boot partition and tell the kernel to look on the md device for the root filesystem. Example grub.cfg:
menuentry "linux" {
    search --no-floppy --label --set=root boot
    linux /vmlinuz root=/dev/md0
}
GRUB will complain "unsupported RAID level: -1" when it is installed, but it won't matter since GRUB will not be accessing the RAID array.
Example lilo.conf:
boot = /dev/sda
image = /boot/vmlinuz
  root = /dev/md0
  label = linux
  read-only
With RAID-0, the same approach now fails at array creation:
# mdadm -C /dev/md0 -l 0 -e 0.90 -n 2 /dev/sda1 /dev/sdb1
mdadm: 0.90 metadata does not support layouts for RAID0
The mdadm commit that introduced this point of failure refers to breaking changes in the kernel:
commit 329dfc28debb58ffe7bd1967cea00fc583139aca
Date: Mon Nov 4 14:27:49 2019 +1100
Create: add support for RAID0 layouts.
Since Linux 5.4 a layout is needed for RAID0 arrays with varying device sizes. This patch makes the layout of an array visible (via --examine) and sets the layout on newly created arrays. --layout=dangerous can be used to avoid setting a layout so that they array can be used on older kernels.
With mdadm v4.2, I found the error to be implacable even with devices of the same size and with --layout=dangerous:
# mdadm -C /dev/md0 -l 0 -e 0.90 -n 2 /dev/sda1 /dev/sdb1
mdadm: 0.90 metadata does not support layouts for RAID0
# mdadm -p dangerous -C /dev/md0 -l 0 -e 0.90 -n 2 /dev/sda1 /dev/sdb1
mdadm: -p does not set the mode, and so cannot be the first option.
# mdadm -C /dev/md0 -p dangerous -l 0 -e 0.90 -n 2 /dev/sda1 /dev/sdb1
mdadm: raid level must be given before layout.
# mdadm -C /dev/md0 -l 0 -p dangerous -e 0.90 -n 2 /dev/sda1 /dev/sdb1
mdadm: 0.90 metadata does not support layouts for RAID0
The mdadm man page for option -p, --layout says:
This option configures the fine details of data layout for RAID5, RAID6, and RAID10 arrays, and controls the failure modes for faulty. It can also be used for working around a kernel bug with RAID0, but generally doesn't need to be used explicitly.
...
A bug introduced in Linux 3.14 means that RAID0 arrays with devices of differing sizes started using a different layout. This could lead to data corruption. Since Linux 5.4 (and various stable releases that received backports), the kernel will not accept such an array unless a layout is explictly set. It can be set to 'original' or 'alternate'. When creating a new array, mdadm will select 'original' by default, so the layout does not normally need to be set. An array created for either 'original' or 'alternate' will not be recognized by an (unpatched) kernel prior to 5.4. To create a RAID0 array with devices of differing sizes that can be used on an older kernel, you can set the layout to 'dangerous'. This will use whichever layout the running kernel supports, so the data on the array may become corrupt when changing kernel from pre-3.14 to a later kernel.
So, it seems that RAID-0 autodetect now requires playing games with the versions of kernel and mdadm or patching them to restore the old behavior. Higher RAID levels might work, but I have not tested them.