Replacing a broken disc in ZFS (on Linux)

I knew that this would happen: one of the discs in my NAS died. Sadly it was one of those discs I had bought just to give it a try, a Toshiba DT01ACA200. It died after just 10 months, even though the NAS does not run 24/7 and sees neither high IOPS nor many writes.

ZFS itself copes pretty well even with the missing disc. I haven’t benchmarked it, but it does not feel any slower. zpool status shows:

root@janice:~# zpool status
  pool: storage
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub canceled on Wed Nov 11 00:50:26 2015
config:
 
        NAME                                              STATE     READ WRITE CKSUM
        storage                                           DEGRADED     0     0     0
          mirror-0                                        ONLINE       0     0     0
            scsi-3600050e086e0080099480000acb70000-part1  ONLINE       0     0     0
            scsi-3600050e086e3640034360000c9810000-part1  ONLINE       0     0     0
          mirror-1                                        DEGRADED     0     0     0
            15444594169386502563                          UNAVAIL      0     0     0  
                                                was /dev/disk/by-id/ata-TOSHIBA_DT01ACA200_xxx-part1
            ata-ST2000VX000-1ES164_xxx-part1              ONLINE       0     0     0
          mirror-2                                        ONLINE       0     0     0
            ata-TOSHIBA_DT01ACA200_xxx-part1              ONLINE       0     0     0
            ata-ST2000VX000-1ES164_xxx-part1              ONLINE       0     0     0
        logs
          ata-ADATA_SP900_xxx-part4                       ONLINE       0     0     0
        cache
          ata-ADATA_SP900_xxx-part5                       ONLINE       0     0     0
 
errors: No known data errors

Admittedly this is a pretty mixed setup, and not how one "should" do it. I am mixing different manufacturers, two discs sit on a 3ware RAID controller (doing JBOD), and I am using a single partitioned SSD without redundancy as both cache and ZIL. No, you really shouldn’t do that. Anyway, that’s not what this post is about. This post is about replacing the broken disc.

Identifying the broken disc

As you can see in mirror-1, the Toshiba died. I partitioned the discs because that let me give each partition a custom label containing the disc’s model and serial number, which makes it quick and easy to identify the broken disc. That way zpool status gives me all I need.
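
To pull just the unhealthy entries out of a longer zpool status listing, a small filter along these lines might help. It is only a sketch: failed_vdevs is a made-up helper name, and the set of states it matches (UNAVAIL, FAULTED, REMOVED, OFFLINE) is an assumption about what should count as broken.

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints
# every device whose STATE column marks it as unusable.
failed_vdevs() {
    awk '$2 ~ /^(UNAVAIL|FAULTED|REMOVED|OFFLINE)$/ { print $1, $2 }'
}

# Usage on a live system (not run here):
#   zpool status storage | failed_vdevs
```

Fed the output above, it would print the numeric GUID of the dead Toshiba together with UNAVAIL.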

Replacing the broken disc

There is no hotswap here, so I powered off the machine for the physical replacement. With hotswap you’d need a few additional steps. The new drive has to be partitioned first:

gdisk /dev/sdc
n (new partition)
1 (partition 1)
2048 (first sector)
3907029134 (last sector, just take the default here)
BF00 (partition type code, Solaris root)
c (change partition name)
1 (first partition)
MODEL_SERIAL (for example ST2000VX000-1ES164_xxx)
w (write and exit)
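
If you would rather script this than type it into gdisk, sgdisk (the non-interactive sibling from the same gptfdisk package) can do the same steps in one call. The following is a sketch, not something taken from the session above: partition_cmd is a hypothetical helper, and it deliberately only prints the command so nothing gets written by accident.

```shell
# Hypothetical helper: print (not run) an sgdisk call that mirrors the
# gdisk session above. Remove the "echo" only after double-checking the disk.
partition_cmd() {
    disk=$1
    label=$2
    # -n 1:2048:0  new partition 1, first sector 2048, end 0 = default end
    # -t 1:BF00    type code BF00, as in the gdisk session
    # -c 1:LABEL   partition name, used later under /dev/disk/by-partlabel
    echo sgdisk -n 1:2048:0 -t 1:BF00 -c "1:$label" "$disk"
}

partition_cmd /dev/sdc ST2000VX000-1ES164_xxx
```

The device name and label are the ones from this post; on another machine both will differ.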

The new partition label should now show up in /dev/disk/by-partlabel:

root@janice:~# ls /dev/disk/by-partlabel/ -la
total 0
drwxr-xr-x 2 root root 220 Nov 18 20:45 .
drwxr-xr-x 8 root root 160 Nov 18 20:37 ..
lrwxrwxrwx 1 root root  10 Nov 18 20:37 DT01ACA200-xxx -> ../../sdg1
lrwxrwxrwx 1 root root  10 Nov 18 20:37 DT01ACA200-xxx -> ../../sde1
lrwxrwxrwx 1 root root  10 Nov 18 20:37 rootfs -> ../../sda2
lrwxrwxrwx 1 root root  10 Nov 18 20:37 ST2000DM001-1ER164-xxx -> ../../sdf1
lrwxrwxrwx 1 root root  10 Nov 18 20:37 ST2000VX000-1ES164-xxx -> ../../sdb1
lrwxrwxrwx 1 root root  10 Nov 18 20:37 ST2000VX000-1ES164-xxx -> ../../sdd1
lrwxrwxrwx 1 root root  10 Nov 18 20:45 ST2000VX000-1ES164_xxx -> ../../sdc1
lrwxrwxrwx 1 root root  10 Nov 18 20:37 zfs-cache -> ../../sda5
lrwxrwxrwx 1 root root  10 Nov 18 20:37 zfs-log -> ../../sda4
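
Before handing a label to zpool it does not hurt to check which device it really points at. A minimal sketch, assuming GNU readlink is available; label_to_dev is a hypothetical helper, and its optional second argument exists only so the lookup directory can be overridden.

```shell
# Hypothetical helper: resolve a partition label to its device node.
label_to_dev() {
    label=$1
    dir=${2:-/dev/disk/by-partlabel}
    if [ ! -e "$dir/$label" ]; then
        echo "no such label: $label" >&2
        return 1
    fi
    # Follow the symlink to the real device path.
    readlink -f "$dir/$label"
}

# Usage (not run here):
#   label_to_dev ST2000VX000-1ES164_xxx
```

With the listing above, the new label should resolve to /dev/sdc1.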

zpool replace

The syntax for this command is:

zpool replace [-f] [-o property=value] pool device [new_device]

The correct command should therefore be zpool replace -o ashift=12 storage 15444594169386502563 ST2000VX000-1ES164_xxx. I am not sure whether ashift is required when doing a replace, but I guess it doesn’t hurt; I set it because of the 4K drive. However, in that form zpool complains that the device has to be given by its full path, so this is what works:

zpool replace -o ashift=12 storage 15444594169386502563 /dev/disk/by-partlabel/ST2000VX000-1ES164_xxx
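
As for where the 12 comes from: ashift is the base-2 logarithm of the sector size ZFS assumes, so 4096-byte sectors give 12 and 512-byte sectors give 9. A tiny sketch of that arithmetic; ashift_for is a made-up helper name, and reading the sector size from sysfs is just one way to get it.

```shell
# Hypothetical helper: compute log2 of a sector size given in bytes.
ashift_for() {
    size=$1
    bits=0
    # Halve the size until it reaches 1, counting the steps.
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        bits=$((bits + 1))
    done
    echo "$bits"
}

ashift_for 4096   # prints 12
ashift_for 512    # prints 9

# On a live system (not run here):
#   ashift_for "$(cat /sys/block/sdc/queue/physical_block_size)"
```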

Verifying

Let’s see what zpool status shows:

root@janice:~# zpool status
  pool: storage
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Nov 18 20:52:45 2015
        79.7M scanned out of 2.72T at 5.31M/s, 149h15m to go
        26.1M resilvered, 0.00% done
config:
 
        NAME                                              STATE     READ WRITE CKSUM
        storage                                           DEGRADED     0     0     0
          mirror-0                                        ONLINE       0     0     0
            scsi-3600050e086e0080099480000acb70000-part1  ONLINE       0     0     0
            scsi-3600050e086e3640034360000c9810000-part1  ONLINE       0     0     0
          mirror-1                                        DEGRADED     0     0     0
            replacing-0                                   UNAVAIL      0     0     0
              15444594169386502563                        UNAVAIL      0     0     0  
                                               was /dev/disk/by-id/ata-TOSHIBA_DT01ACA200_xxx-part1
              ST2000VX000-1ES164_xxx                      ONLINE       0     0     0  (resilvering)
            ata-ST2000VX000-1ES164_xxx-part1              ONLINE       0     0     0
          mirror-2                                        ONLINE       0     0     0
            ata-TOSHIBA_DT01ACA200_xxx-part1              ONLINE       0     0     0
            ata-ST2000VX000-1ES164_xxx-part1              ONLINE       0     0     0
        logs
          ata-ADATA_SP900_xxx-part4                       ONLINE       0     0     0
        cache
          ata-ADATA_SP900_xxx-part5                       ONLINE       0     0     0
 
errors: No known data errors

I really like the way zpool status displays its output. Looks good, I’d say. Give it a few minutes and re-check with zpool status to get a more realistic bandwidth value:

  scan: resilver in progress since Wed Nov 18 20:52:45 2015
        60.4G scanned out of 2.72T at 313M/s, 2h28m to go
        20.1G resilvered, 2.17% done
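
The "to go" figure is roughly the amount left to scan divided by the current rate. Here is a quick check with the numbers above, as a sketch: eta is a hypothetical helper, and I am assuming that T, G and M are binary units (1T = 1024G = 1024 * 1024M).

```shell
# Hypothetical helper: eta TOTAL_T SCANNED_G RATE_M_PER_S
eta() {
    awk -v t="$1" -v g="$2" -v r="$3" 'BEGIN {
        remaining = t * 1024 * 1024 - g * 1024   # M left to scan
        s = remaining / r                        # seconds at the current rate
        printf "%dh%dm\n", s / 3600, (s % 3600) / 60
    }'
}

eta 2.72 60.4 313   # prints 2h28m, the same estimate zpool status gives
```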

Oh yeah. That’s a little bit faster than the initial 5.31M/s :^) A cigarette break later…

  scan: resilver in progress since Wed Nov 18 20:52:45 2015
        191G scanned out of 2.72T at 365M/s, 2h1m to go
        63.7G resilvered, 6.85% done
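
Instead of re-running zpool status by hand and reading the whole listing, the percentage can be pulled out with a small filter. This is a sketch that assumes the line format shown in this post ("... resilvered, N% done"); other ZFS versions may print that line differently. resilver_progress is a hypothetical helper name.

```shell
# Hypothetical helper: reads `zpool status` output on stdin and prints
# the resilver percentage, or nothing if no resilver is running.
resilver_progress() {
    awk '/resilvered,/ && /done/ { print $(NF-1) }'
}

# Usage (not run here):
#   zpool status storage | resilver_progress
# Or simply keep the full output on screen, refreshed every minute:
#   watch -n 60 zpool status storage
```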

Just in case this might be interesting to someone:

root@janice:~# zpool iostat -v
                                                     capacity     operations    bandwidth
pool                                              alloc   free   read  write   read  write
------------------------------------------------  -----  -----  -----  -----  -----  -----
storage                                           2.72T  2.72T    610     29  47.1M   122K
  mirror                                           929G   927G     19      4  46.7K  27.8K
    scsi-3600050e086e0080099480000acb70000-part1      -      -      8      1  60.3K  37.4K
    scsi-3600050e086e3640034360000c9810000-part1      -      -      8      1  59.2K  37.4K
  mirror                                           929G   927G    571     19  47.0M  65.0K
    replacing                                         -      -      0  1.46K      0   122M
      15444594169386502563                            -      -      0      0      0      0
      ST2000VX000-1ES164_xxxx                         -      -      0  1.02K     43   122M
    ata-ST2000VX000-1ES164_xxxx-part1                 -      -    397      1  47.3M  37.7K
  mirror                                           929G   927G     19      4  46.8K  27.8K
    ata-TOSHIBA_DT01ACA200_xxxx-part1                 -      -      9      2  64.0K  37.4K
    ata-ST2000VX000-1ES164_xxxx-part1                 -      -      7      1  52.1K  37.4K
logs                                                  -      -      -      -      -      -
  ata-ADATA_SP900_xxxx-part4                       536K  7.94G      0      0   1014   1014
cache                                                 -      -      -      -      -      -
  ata-ADATA_SP900_xxxx-part5                       411K  29.7G      0      0     94  1.86K
------------------------------------------------  -----  -----  -----  -----  -----  -----

Once the resilver has finished, zpool status looks healthy again:

root@janice:~# zpool status
  pool: storage
 state: ONLINE
  scan: resilvered 929G in 1h53m with 0 errors on Wed Nov 18 22:46:25 2015
config:
 
        NAME                                              STATE     READ WRITE CKSUM
        storage                                           ONLINE       0     0     0
          mirror-0                                        ONLINE       0     0     0
            scsi-3600050e086e0080099480000acb70000-part1  ONLINE       0     0     0
            scsi-3600050e086e3640034360000c9810000-part1  ONLINE       0     0     0
          mirror-1                                        ONLINE       0     0     0
            ST2000VX000-1ES164_xxxx                       ONLINE       0     0     0
            ata-ST2000VX000-1ES164_xxxx-part1             ONLINE       0     0     0
          mirror-2                                        ONLINE       0     0     0
            ata-TOSHIBA_DT01ACA200_xxxx-part1             ONLINE       0     0     0
            ata-ST2000VX000-1ES164_xxxx-part1             ONLINE       0     0     0
        logs
          ata-ADATA_SP900_xxxx-part4                      ONLINE       0     0     0
        cache
          ata-ADATA_SP900_xxxx-part5                      ONLINE       0     0     0
 
errors: No known data errors

I am not sure why there is no ata- in front of the new disc. It will probably show up after the next reboot, or I originally used something other than /dev/disk/by-partlabel. I’ll take a closer look at that soon.
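
One plausible explanation, untested here: zpool status shows the name of the path a device was added under, and the by-partlabel links carry no ata- prefix while the by-id links do. If that is right, exporting the pool and importing it again with /dev/disk/by-id as the search directory should bring the names in line. The sketch below only prints the two commands; reimport_by_id_cmds is a hypothetical helper name.

```shell
# Hypothetical helper: print (not run) the commands to re-import a pool
# so that device names are taken from /dev/disk/by-id.
reimport_by_id_cmds() {
    pool=$1
    echo "zpool export $pool"
    echo "zpool import -d /dev/disk/by-id $pool"
}

reimport_by_id_cmds storage
```

Exporting takes the pool offline for a moment, so this is something to do when nothing is using it.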
