
Don’t MAZE Yourself. Logically Label your disks for ZFS Pools

Being a sysadmin can be overwhelming. I was working on a 0.5-petabyte system set up as a JBOD ZFS system, and some of the disks had failed. The previous sysadmin did an awesome (yes, that is sarcasm) job of naming things. In a hurry to bring the system online, he did not label anything, neither physically nor logically. There was no way to track down the failed hard drive, short of hiring student workers to physically pull each drive and compare serial numbers. Imagine: 0.5 petabytes usable, 1.5 petabytes of raw storage, and 4-terabyte hard drives. Do the math. (It comes to roughly 375 drives.)

Don’t trap yourself while you are installing a new storage system. Do the right thing. Save yourself and save others!

Here’s what you should do!

Come up with a naming convention.

This is important. Think how easy it would be if you could directly pinpoint which disk has failed. What I generally do is come up with a grid naming system.

This is what a chunk of my ‘zpool status’ output looks like:

	NAME                 STATE     READ WRITE CKSUM
	nemesis-zfs-pool     ONLINE       0     0     0
	  mirror-0           ONLINE       0     0     0
	    J913-4520424821  ONLINE       0     0     0
	    J914-4520422622  ONLINE       0     0     0

Did you notice the label on the disks?

‘J913-4520424821’ What does that mean?

J -> It’s on a JBOD, not a disk embedded in a server.

9 -> The JBOD unit ID. I label each JBOD I use with a single character from 0-9 or A-Z.

13 -> It is in the 1st row from the top and is the 3rd disk from the left.

4520424821 -> The last 10 digits of the disk serial.
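A minimal shell sketch of how a label in this scheme could be assembled from its parts. The function name make_label and the full serial WD-WCC4520424821 are made up for illustration. On Linux the real serial can usually be read with lsblk -o NAME,SIZE,SERIAL.

```shell
#!/bin/sh
# make_label TYPE JBOD ROW COL SERIAL
# Builds TYPE+JBOD+ROW+COL, a dash, then the last 10 characters of the serial.
# Assumes single-character row and column numbers, as in the scheme above.
make_label() {
    kind=$1
    jbod=$2
    row=$3
    col=$4
    serial=$5
    # printf '%s' avoids a trailing newline, so tail -c 10 keeps exactly 10 characters
    suffix=$(printf '%s' "$serial" | tail -c 10)
    printf '%s%s%s%s-%s\n' "$kind" "$jbod" "$row" "$col" "$suffix"
}

# Hypothetical serial, used only to show the truncation
make_label J 9 1 3 WD-WCC4520424821
```

Running it prints J913-4520424821, the first label in the pool shown earlier.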

Now if I need to replace a disk, I can ask someone to go to the server room, pull out the 3rd disk from row 1 of JBOD 9, and insert the new disk I gave them.

Voilà! So simple. So how do we label these disks?
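Going the other way, a label taken from ‘zpool status’ can be turned back into directions for whoever walks to the server room. This is only a sketch: the function name locate_disk is hypothetical, and it assumes the one-character-per-field layout described above.

```shell
#!/bin/sh
# locate_disk LABEL
# Splits a label such as J913-4520424821 into JBOD, row, column and serial suffix.
locate_disk() {
    label=$1
    prefix=${label%%-*}    # text before the first dash, e.g. J913
    serial=${label#*-}     # text after the first dash, e.g. 4520424821
    jbod=$(printf '%s' "$prefix" | cut -c2)
    row=$(printf '%s' "$prefix" | cut -c3)
    col=$(printf '%s' "$prefix" | cut -c4)
    printf 'JBOD %s, row %s from the top, disk %s from the left (serial ends in %s)\n' \
        "$jbod" "$row" "$col" "$serial"
}

locate_disk J913-4520424821
```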

Use GPT Partition to label your disks.

On CentOS I use parted, and on FreeBSD I use gpart. Both achieve the same goal.

For CentOS

Run lsblk to view the disks.

Then:

1. parted /dev/sdx
2. mklabel gpt
3. unit GB
4. mkpart primary 0.00GB 900GB
5. name 1 J913-4520424821

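Step 2, mklabel gpt, creates a GPT partition table on the disk. This wipes any existing partition table, so double-check the device name first.

Step 4, mkpart primary 0.00GB 900GB, creates a partition that runs from the start of the disk to the 900 GB mark.

Step 5, name 1 J913-4520424821, sets the name of partition 1 to the label.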
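The same parted steps can also be passed on one command line with parted’s -s (script) flag, which is handy when there are many disks to label. The sketch below only prints the command instead of running it, since mklabel wipes the partition table. The function name label_cmd is made up for illustration, and /dev/sdx is the same placeholder device as above.

```shell
#!/bin/sh
# label_cmd DEVICE LABEL END
# Prints (does not run) a one-shot parted command equivalent to the numbered steps.
label_cmd() {
    device=$1
    label=$2
    end=$3
    printf 'parted -s %s mklabel gpt unit GB mkpart primary 0.00GB %s name 1 %s\n' \
        "$device" "$end" "$label"
}

# Review the printed line, then run it by hand as root once the device name is confirmed
label_cmd /dev/sdx J913-4520424821 900GB
```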

Here is another stupid mistake to avoid. With ZFS you can only replace a disk with one of the same size or larger. I had a 1TB WD disk with 913 GB of usable space. However, a 1TB disk from another vendor, such as Seagate, might have only 906 GB usable. If I had given ZFS the full 913 GB, I would have locked myself out of every replacement disk with less than 913 GB usable.

Trust me, always leave a few GB unused at the end of the disk. In my case, I partitioned only up to 900GB. Keep those ‘bytes’ free or else someday they will come back to ‘bite’ you. Pun intended!
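The arithmetic behind this is simple enough to check in a couple of lines of shell. The numbers are the ones from the example above (913 GB and 906 GB usable, partition capped at 900 GB). The function name fits is hypothetical.

```shell
#!/bin/sh
# fits USABLE_GB NEEDED_GB
# Prints yes if a replacement disk with USABLE_GB can hold a partition of NEEDED_GB.
fits() {
    if [ "$1" -ge "$2" ]; then
        echo yes
    else
        echo no
    fi
}

fits 906 913   # partition sized to fill the whole 913 GB disk: the 906 GB disk is too small
fits 906 900   # partition capped at 900 GB: the 906 GB disk still fits
```

The first call prints no and the second prints yes, which is the whole argument for leaving some headroom.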

Create ZFS pool using the GPT LABEL

You can access your disks at /dev/disk/by-partlabel/—-

So I would use the following to create a small pool:

zpool create nemesis-zfs-pool mirror /dev/disk/by-partlabel/J913-4520424821 /dev/disk/by-partlabel/J914-4520422622
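With more than a couple of disks, typing the by-partlabel paths by hand gets error-prone. A small sketch that builds the zpool create line for one mirror from a list of labels is shown below. It only prints the command so it can be reviewed first, and the function name mirror_cmd is made up for illustration.

```shell
#!/bin/sh
# mirror_cmd POOL LABEL...
# Prints (does not run) a zpool create command for a single mirror vdev,
# referring to each disk by its GPT partition label.
mirror_cmd() {
    pool=$1
    shift
    cmd="zpool create $pool mirror"
    for label in "$@"; do
        cmd="$cmd /dev/disk/by-partlabel/$label"
    done
    printf '%s\n' "$cmd"
}

mirror_cmd nemesis-zfs-pool J913-4520424821 J914-4520422622
```

The printed line is the same command as the one written out above.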

There are other things to take care of if you are seriously building a big pool, such as disk alignment, 4k sectors etc. This blog post was meant to encourage you to use a disk labeling nomenclature that suits you.
Don’t trap yourself in a maze!
Thank you!
