Setting up a basic Linux ZFS instance

Fri 21 June 2013
By Stephen Cripps

In GNU/Linux

tags: linux, zfs

I will not be talking about setting up ZFS as the root file system; doing so requires some knowledge of your specific distribution, although the site that hosts native ZFS does have a guide here.

Rather, I want to talk about how to set up a quick, simple and reliable data store using ZFS, one that gives you hardware redundancy and snapshots.

This tutorial will walk you through building a file system/storage pool with two hard drives in a mirrored configuration.

Hopefully this will introduce you to enough of the commands so that you can do something cool, and eventually cut your teeth on some of the real documentation.

What is ZFS?

To quote Wikipedia's article:

ZFS is a combined file system and logical volume manager designed by Sun Microsystems.

The features of ZFS include protection against data corruption, support for high storage capacities, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs

ZFS has been around on Sun/Solaris and FreeBSD systems for a long time; it is only recently that a group has been able to write a native Linux port that is stable enough for everyday use. ZFS combines the concepts of volume management and file system management.

ZFS is able to overcome some of the most basic problems with existing file systems. The maximum volume and file size on ZFS is 16 exabytes, where a single exabyte is 1,024 petabytes. That's not to say that modern file systems are bad by any means; it's just that ZFS is capable of some amazing things.

There are some advantages that I really want to stress, especially for those who are rightfully paranoid about their data. Normally when you store something for a prolonged period of time, you don't know whether or not the files have maintained their integrity.

There are ways to check this yourself, but ZFS handles it very gracefully. ZFS protects your data from bit rot, the random flipping of a bit on your hard drive for whatever reason: it detects and corrects it on the fly with minimal setup. Even if you aren't actively using your data, just get ZFS to run a scan every week or so and it will find and fix these errors across the entire data store.

Many more things can be said about ZFS that I won't cover here, but it's definitely worth checking out if you haven't already.

A simple setup

For this you are going to need at least two hard drives. They don't need to be the same size, but the more similar they are, the better.

My setup uses a derivative of Ubuntu 12.04, but this should be possible to set up on any distribution.

First, go to the ZFS on Linux homepage and click on the link for the packages for your system. Installation is pretty straightforward. If you're using Ubuntu, it is a matter of adding the PPA to your system, updating and installing the software, i.e.

> sudo add-apt-repository ppa:zfs-native/stable
> sudo apt-get update
> sudo apt-get install zfs-linux

Next, open up a terminal; you should now be able to run commands like zfs and zpool. These two commands are all we will need to set up and use the file system.
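
If the installation went well, running either of them before any pools exist should print something like the following (the exact wording may vary between versions):

> zpool status
no pools available

> zfs list
no datasets available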

We are going to create a redundant disk array similar to RAID-1, called a mirror. I use the word similar since a mirror is a type of virtual device in ZFS and doesn't need to be initialised the way a RAID-1 array would be in something like mdadm.

Disk Partitioning

Now, this is more of a personal preference, but something I think you should consider. Normally you would just give ZFS an entire hard drive and let it manage things from there; however, if you are just a consumer, you probably don't buy batches of identical hard drives to store your data on. A mirror is limited by the size of its smallest member, so instead of trying to find disks of the same size, we will use partitions of the same size.

Plus, if you ever dual boot, Windows will be very confused by these ZFS-formatted disks (more on dual booting a system with ZFS-formatted disks at the end). So we are going to give the hard drives a partition table and create "unformatted" partitions for ZFS to use. This has the additional benefit of letting you keep a non-ZFS partition on the disks as well, although I wouldn't recommend it.

So with a tool like gparted, you are going to take each disk you want to use and:

  • Make a partition table
  • Create a partition the same size across both disks

For example, once I have done the above using gparted, the disk has the layout illustrated in the following image:

[Image: gparted showing the partition layout of one of the ZFS disks]

Because I wanted to be sure I had the correct drive selected, I started gparted from the terminal using:

gparted /dev/disk/by-id/ata-ST2000DL003-9VT166_6YD19HEP

Note that there are two partitions in the image; your disk should only have one. The first partition was created with the file system option set to "unformatted". The nice thing about this is that it prevents the OS from trying to auto-mount it.

The second partition exists because I have two disks of different sizes and I'm gambling by creating an NTFS partition in the leftover space. The only reason I'm comfortable with this is that the other disk is dedicated to ZFS only. When I boot into Windows on occasion, Windows will offer to reformat the ZFS partitions; you can prevent this by going into Disk Management and removing the drive-letter assignment from the partitions, or by taking the drives offline.

You can actually list multiple disks as arguments, and gparted will only show those drives in the drop down on the top right. It also speeds up the refresh time after every action since it only scans the listed drives. Anyways, I digress.
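
As an aside, if you prefer the command line to gparted, roughly the same partitioning can be done with parted. The device path and the partition end point below are placeholders; substitute your own disk and a size that matches across both drives:

> sudo parted /dev/disk/by-id/ata-EXAMPLE-DISK mklabel gpt
> sudo parted /dev/disk/by-id/ata-EXAMPLE-DISK mkpart zfs 1MiB 1400GiB

parted does not format the new partition, which is exactly what we want here.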

Creating your first "storage zpool"

Yes, a storage pool. This is the logical volume management part of ZFS. We are now going to tell ZFS to group our identical partitions into a mirror, which we will later assign ZFS file systems to.

NOTE: With these commands there is no going back! If you mess this up, you will need to zero the beginning of the disk and repeat the disk partitioning step! ZFS does not have a function to un-add disks; never forget this!

We need a way to reference the disks so that their label never changes. Don't use the standard /dev/sda type labels, since these change depending on what is exposed to the OS at startup. If you ever move a SATA cable to a different port, the label will change and ZFS will notify you of the horrible consequences (they're not that bad, but still, don't do it).

On Ubuntu and similar distributions, use the /dev/disk/by-id device files. For example, on my system my drives are the following:

> ls /dev/disk/by-id

ata-ST2000DL003-9VT166_6YD19HEP@
ata-ST2000DL003-9VT166_6YD19HEP-part1@
ata-ST2000DL003-9VT166_6YD19HEP-part2@
ata-WDC_WD15EARS-00MVWB0_WD-WMAZA2321219@
ata-WDC_WD15EARS-00MVWB0_WD-WMAZA2321219-part1@

These names will not change, and they bear some resemblance to the hardware itself; for example, "WDC" is the Western Digital something-something 1.5 TB drive.

Now, the command to create a pool is as follows:

zpool create tank mirror \
    /dev/disk/by-id/ata-ST2000DL003-9VT166_6YD19HEP-part1 \
    /dev/disk/by-id/ata-WDC_WD15EARS-00MVWB0_WD-WMAZA2321219-part1

Now let us break down the above.

zpool create is the command to create a new storage pool

tank is the name I have assigned to the pool

mirror tells zpool that I want to create a mirror for the drives I list immediately afterwards.

\ is a bash line-continuation character; it just tells the terminal that I want to continue the command on the next line.

And finally comes the list of drives I want to put in a mirrored configuration. Note the suffix -part1, representing the drive partition. This is important; don't forget it, otherwise zpool will just use the entire hard drive. If you forget and only wanted to use the drive's partition, you will need to destroy the pool, zero the beginning of the drive and start again, which is a lot of work.
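
Given the warning earlier about there being no way back, it is worth knowing that zpool create accepts a -n flag, which prints the configuration it would build without actually touching the disks:

zpool create -n tank mirror \
    /dev/disk/by-id/ata-ST2000DL003-9VT166_6YD19HEP-part1 \
    /dev/disk/by-id/ata-WDC_WD15EARS-00MVWB0_WD-WMAZA2321219-part1

Once the output looks right, run the same command again without -n.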

Creating the actual file system

At this point, our disks are being managed by zpool. The pool is the raw storage we have assigned a name to, and we can now create file systems on it.

To create a ZFS file system on our pool, we do:

> zfs create tank/my-filesystem

Which we then want to mount with:

> zfs mount tank/my-filesystem

The last command may be a little confusing: you just performed a mount without actually specifying a mount point. ZFS has mounted the file system for you under the root of your system, i.e. /tank/my-filesystem.
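
If you would rather have the file system mounted somewhere else, the mount point is just a property you can change; the path below is only an example:

> zfs set mountpoint=/srv/my-data tank/my-filesystem

ZFS moves the mount for you and remembers the new location from then on.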

And that's it: you now have a working ZFS file system with hardware redundancy on which you can store valuable data.

Check on your data!

There are some basic commands you will want to use to check on the status of your hardware storage pool and file system.

> zpool status

  pool: tank
 state: ONLINE
  scan: scrub repaired 0 in 1h11m with 0 errors on Thu May 30 22:47:10 2013
config:

    NAME                                                STATE     READ WRITE CKSUM
    tank                                                ONLINE       0     0     0
      mirror-0                                          ONLINE       0     0     0
        ata-WDC_WD15EARS-00MVWB0_WD-WMAZA2321219-part1  ONLINE       0     0     0
        ata-ST2000DL003-9VT166_6YD19HEP-part1           ONLINE       0     0     0

errors: No known data errors

zpool status will show you if any of your disks are offline, and the numbers on the right show the read, write and checksum errors ZFS has encountered.

> zpool iostat

               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         504G  1.77T      0      0     29     21

zpool iostat shows you the amount of data being transferred through your storage pool at the time the command is run. You can put a number at the end, for instance zpool iostat 2, and it will print updated statistics every two seconds until you stop it. This is good for checking up on the speed during transfers. You can also see the activity of individual drives with zpool iostat -v.
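
For a quick capacity summary of the pool as a whole, there is also:

> zpool list

which prints the size, allocated space, free space and health of each pool on a single line.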

> zpool scrub tank

zpool scrub tells ZFS to walk the storage pool and scan it for any errors. If an error is found, it automatically corrects it using the data from the mirrored drive. ZFS is able to check for errors by comparing each block against a checksum that was generated when it was written. Do this periodically; even though ZFS catches errors like this when a file is read, during a long stretch without reads both copies of a block could silently become corrupted.

This is an intensive process, but still allows you to have the file system mounted.
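
To automate the weekly scan mentioned earlier, an entry along these lines in root's crontab (edited with sudo crontab -e) should do the trick; the schedule and the path to zpool are examples, so adjust them for your system:

# scrub the pool "tank" every Sunday at 02:00
0 2 * * 0 /sbin/zpool scrub tank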

Taking snapshots

ZFS includes the ability to take snapshots of the file system very quickly and without any real cost in space. It's a good way to protect data from accidental deletions and can be automated with something like cron.

> zfs snapshot tank/my-filesystem@some-unique-name

Note that I did not put a leading / in front. The command expects "storage pool name"/"zfs file system name" rather than the mounted path.

Keep in mind that if a file is contained in a snapshot and you delete it, it is still taking up space. The file will not actually be freed until all snapshots containing it are also removed.

> zfs list -t snapshot

zfs list is used to show information about your file systems; with -t snapshot it lists all of the snapshots that have been taken.
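
Snapshots are read-only, and their contents can be browsed through the hidden .zfs directory at the root of the file system; once a snapshot is no longer needed, it can be removed with zfs destroy. The snapshot name here is the one from the earlier example:

> ls /tank/my-filesystem/.zfs/snapshot/some-unique-name
> zfs destroy tank/my-filesystem@some-unique-name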

Expanding storage in the future

If you need to make your data store larger in the future, you can add more drives to the storage pool on the fly with no downtime. The command to do so is very similar to the one used to create the pool:

> zpool add tank mirror /dev/... /dev/...

Be extra careful when you're adding drives to an existing pool; as I said before, there is no way to remove the drives from the configuration once they have been added.

There are also other configurations besides mirrored drives. A popular one you will see in other tutorials is raidz, which is similar to RAID-5 except that it closes the "write hole" that can lose data in a traditional RAID-5 setup.
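
For reference, creating a raidz pool looks almost identical to creating a mirror; you would typically use at least three devices, and the device paths below are placeholders:

zpool create tank raidz \
    /dev/disk/by-id/ata-EXAMPLE-DISK-1-part1 \
    /dev/disk/by-id/ata-EXAMPLE-DISK-2-part1 \
    /dev/disk/by-id/ata-EXAMPLE-DISK-3-part1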

Caveats and Discussion

RAID is not Backup!

One thing to note here is that you have hardware redundancy: if a drive fails, the pool keeps running in a degraded state on the remaining drive until you replace the failed disk (I haven't given the commands to do that here).

However, nothing is stopping a software problem or an accidental command from wiping out all of your data. All it takes is a sudo rm -rf / and there is no guarantee things will be okay. A real backup is disconnected from the system, stored in a fire-proof vault in Nunavut surrounded by grey-beards riding polar bears. Snapshots can help a little though.
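
If you do keep a second pool or machine around, snapshots can be shipped to it with zfs send and zfs receive, which makes a reasonable starting point for real backups; the backup pool name here is hypothetical:

> zfs send tank/my-filesystem@some-unique-name | zfs receive backup-pool/my-filesystem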

Technical requirements

Using the simple setup I have provided, your computer should be able to handle the overhead of running ZFS, but make sure you check the recommended system requirements posted around the web. A better computer will transfer files faster and cope better with some of ZFS's more advanced features, should you choose to enable them. Keep this in mind should you ever decide to turn an older computer into a file server.

Speaking of advanced features, ZFS is able to do some things that even enterprise technology companies love to brag about. ZFS can deduplicate data at the block level as it is written to disk.
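
Deduplication is switched on per file system through a property, roughly like this; be aware that it needs a lot of RAM to work well, so read up on it before enabling it:

> zfs set dedup=on tank/my-filesystem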

You can add SSDs as cache drives, or dedicate drives to storing the intent log. This tutorial barely scratches the surface of what ZFS is capable of; make sure you check out the resources below if you want to learn more.
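
As a last taste of those features, adding an SSD as a read cache or as a dedicated log device reuses the zpool add command from earlier; the device paths below are placeholders:

> zpool add tank cache /dev/disk/by-id/ata-EXAMPLE-SSD-part1
> zpool add tank log /dev/disk/by-id/ata-EXAMPLE-SSD-part2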

Resources

http://pthree.org/2012/12/17/zfs-administration-part-x-creating-filesystems/: A very thorough series of articles on ZFS administration.

http://docs.oracle.com/cd/E19253-01/819-5461/: Oracle bought Sun, the original creator of the file system; they have extensive documentation.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide: Interesting wiki of best practices.

https://wiki.gentoo.org/wiki/ZFS: Thanks to Redditor PonderingGrower for pointing out this good guide to ZFS on the Gentoo Linux wiki

I hope someone finds this guide helpful. If you find any errors in these instructions, please post them in the discussion at the end of this post.
