Setting up ZFS with 3 out of 4 disks (raidz)

Hey,

The usual problem: you want to set up, for example, a software RAID 5, but you only have 3 disks available, so you first create the array with 2 disks plus one marked as failed, and add the third disk later. Something similar is also possible with ZFS, using a sparse file. In the following article I'll explain how.

Important: Create a backup first! You might lose all your data. If you continue, you do so at your own risk. You've been warned!

The first step is creating a sparse file. A sparse file doesn't eat any real space. Some years ago, techniques like this were used in various attacks: for example, you could make a sparse file with a size of 20 TB, zip it down to just a few KB (if that), and someone who tries to unzip it will end up with a full hard disk (don't do that). However, back to how to create that sparse file. My disks have a size of 500 GB, so I need to create a sparse file which is somewhat bigger, for example 512 GB. In an array, every member is sized to the smallest device, so if I created a 450 GB sparse file, I'd lose 50 GB of each disk; that's why it's important that the sparse file is bigger than the smallest real disk in your array. You can create the sparse file like this:

dd if=/dev/zero of=/zfs1 bs=1 count=1 seek=512G
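
To double-check that the file is really sparse, compare its apparent size with the space it actually allocates; the exact numbers shown are just what I'd expect, yours may differ slightly:

ls -lh /zfs1 # apparent size: about 512 GB
du -h /zfs1 # actually allocated: next to nothing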

Now you can create the raidz. Later I'll use /dev/sda, /dev/sdb, /dev/sdc and /dev/sdd (4×500 GB), but for now, as all my data is on sda, I can't use that disk, so I put the sparse file in its place. So to create the raidz I'd use:

zpool create zfspool raidz /dev/sdb /dev/sdc /dev/sdd /zfs1
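
A side note: if one of the disks still carries an old partition table or filesystem signature, zpool may refuse to use it; in that case the -f flag forces creation. Only do that if you're sure the data on those disks is expendable:

zpool create -f zfspool raidz /dev/sdb /dev/sdc /dev/sdd /zfs1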

Now, before we do anything with the pool, it's really important to take the sparse file offline; this is like marking one disk as faulty in a software RAID. We can do this with:

zpool offline zfspool /zfs1

If you don't do this, the next steps might fill /zfs1, leaving you with a full disk and/or other problems. Let's take a look at our pool now:

# zpool status
  pool: zfspool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zfspool     DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            /zfs1   OFFLINE      0     0     0

errors: No known data errors

and:

# zpool list
NAME      SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
zfspool  1.81T  2.47M  1.81T     0%  1.00x  DEGRADED  -

Looks good, doesn't it? 🙂 However, as I'm using the Linux variant (ZFS on Linux), I need to create a zvol. Find out how much usable space your array offers (here roughly 1.33 TiB, since with raidz1 one disk's worth of space goes to parity) and pass that size to -V, with a suffix such as M for megabytes. You might also use another block size (volblocksize); the default is 8K, but ext4, which I'm using, picks 4K, so I'd guess it can't harm if I take 4K everywhere. Also, I'd like to enable gzip compression:

# create the zvol with 4K block size
zfs create -V 1316545M -b 4K zfspool/wdp
# enable gzip compression
zfs set compression=gzip zfspool/wdp
# take a look at it
zfs list
NAME          USED  AVAIL  REFER  MOUNTPOINT
zfspool      1.33T  3.25M  31.4K  /zfspool
zfspool/wdp  1.33T  1.33T  23.9K  -
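
Once some data lands on the zvol, you can check how well the gzip compression is doing via the compressratio property:

zfs get compressratio zfspool/wdp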

Now make your favorite filesystem on top of the zvol. I used mkfs.ext4. You might also create a partition table first, for example with cfdisk /dev/zfspool/wdp. If you run cfdisk on it, don't enable compression, otherwise your box will crash; at least mine does.
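
If you'd rather script that step than click through cfdisk, a non-interactive tool like parted can write the label in one go; consider this a sketch of the same step, with the same compression caveat presumably applying:

parted -s /dev/zfspool/wdp mklabel gpt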

Important: Currently there's a bug in ZFS on Linux, so make sure you disable the cache for your device by issuing:

zfs set primarycache=none zfspool/wdp
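
To confirm that this setting (and the compression from earlier) actually took effect, you can query the properties:

zfs get primarycache,compression zfspool/wdp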

Also, you should insmod/modprobe the module with the cache limited to 1 GB:

modprobe zfs zfs_arc_max=1073741824
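
To make that limit survive reboots, put the option into a modprobe configuration file instead of passing it by hand every time; the file name here is my choice, any file under /etc/modprobe.d/ will do:

echo "options zfs zfs_arc_max=1073741824" > /etc/modprobe.d/zfs.conf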

If you don't do these two steps, you might end up with a hanging system that doesn't respond at all anymore. Now, on to the filesystem:

mkfs.ext4 /dev/zfspool/wdp
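
If you want to be explicit about matching the 4K volblocksize from above, mkfs.ext4 lets you pin the block size; on most systems 4K is the default anyway, so this is optional:

mkfs.ext4 -b 4096 /dev/zfspool/wdp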

Finally, mount it and copy over your files, for example:

mount /dev/zfspool/wdp /mnt

At this point you should reboot and check whether everything works as expected. Make sure you have backups of your data so you don't lose anything important, then copy your files from sda over to /mnt/ (where you mounted your device again, I hope). As soon as that's done, check everything once more, and finally you can replace /zfs1 with the real disk it stands in for.
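
For the copy itself I'd suggest something like rsync, which preserves permissions and can be resumed if interrupted; /data/ below is just a stand-in for wherever your files from sda are reachable:

rsync -aHAX /data/ /mnt/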

Now unmount the old disk if it's still mounted, and the following command should do the trick:

zpool replace zfspool /zfs1 /dev/newdisk

Check with zpool status whether it's doing everything correctly. It should resilver automatically; once that's done, you can issue a scrub, just to make sure that everything's correct.
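
The scrub itself is a one-liner, and zpool status shows its progress; wait for it to finish before you trust the pool:

zpool scrub zfspool
zpool status zfspool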
