Rebuilding NAS with ZFS & Docker

It all started with a tripped breaker. In fact, it tripped multiple times before I got home. Luckily, the spinning rust drives holding the data survived without any issues, but the flash drive didn't. So with the system FUBAR and some (little) time on my hands, there was only one thing to do - upgrade.

My NAS setup

Let me describe my home NAS setup first. Everything runs on a nice, silent and tiny HP Microserver Gen8. I have the low-spec model with an Intel Celeron G1610T and 10GB of ECC RAM (2GB from HP + 8GB that I bought extra; there's a plan to swap the 2GB module for an 8GB one, but so far there's been no need). There are 4 HDDs used purely for data (using ZFS on Linux), and the system boots from a flash drive, which holds basically just a bare Ubuntu 14.04 installation with ZFS and Docker.

What's sort of unusual (I guess) is that the base system is really just a bare OS installation with as few extra services running on it as possible. That means all the services I actually use on the NAS (Plex, Samba, Graphite, ...) run in Docker containers. One reason is isolation from each other; the other is that I don't have to care about the system on the flash drive that much - in theory I should be able to swap it, upgrade it, etc. without much hassle. Up until today that was just theory. Now that I had a dead flash drive in my hand, I was about to test it.

Let's do it

So I popped a fresh microSD card into the reader (the server has a reader on the motherboard and can also boot from it, which is nice), downloaded the latest Ubuntu 16.04 Server installation ISO and booted it from a freshly prepared flash drive.

Oh, and by the way, I did the whole installation over the iLO console, as connecting a screen would be too much hassle - for less than €280 (~$311) this is truly an outstanding piece of hardware.

If you've ever installed Ubuntu Server, you know it's smooth sailing. The only extra step I had to do was manual partitioning. About 15 minutes later, a fresh installation was booted up.

Setting up

Did I say I love Ansible? Well, I do. I ran it with my playbooks to configure access (SSH keys, passwordless sudo, ...), install some common utilities (Vim and such) and add some personal settings (Vim profile, bash aliases, ...). About 3 minutes later I already felt at home, with all the settings the way I like them. But that's just for convenience; now let's get the services up.

First, get ZFS on board. Previously I used the ZFS on Linux repository, but the latest LTS release ships with ZFS support built in, so it was as easy as:

sudo apt install zfsutils-linux

Now I have all my data accessible - and that includes my Docker images, as I'm using Docker on ZFS. Docker itself was already installed by Ansible, so for future convenience I'm adding the ZFS installation to the playbooks before continuing.
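Depending on how the old system went down, the pool may need an explicit import before the data shows up. Here is a small sketch of that check. It assumes a pool named tank (a placeholder, substitute your own pool name) and uses a made-up helper called ensure_pool:

```shell
#!/bin/sh
# Sketch: make sure a ZFS pool is imported, then show what is on it.
# Assumptions: "tank" is a placeholder pool name and ensure_pool is a
# made-up helper for this example; run as root (or via sudo).

ensure_pool() {
  pool="$1"

  # "zpool list <pool>" exits non-zero when the pool is not imported.
  if ! zpool list "$pool" >/dev/null 2>&1; then
    zpool import "$pool" || return 1
  fi

  zpool status "$pool"    # health of the pool and its disks
  zfs list -r "$pool"     # datasets and their mountpoints
}

# Usage (not run automatically):
#   ensure_pool tank
```

If the import complains that the pool was last used by another system (which can happen after an unclean shutdown or a reinstall), zpool import -f forces it, but only do that when you're sure nothing else is using the disks.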

Systemd

Previously I was using upstart to spin up the Docker containers (and keep them running), and in theory I could do the same here and be done at this stage (systemd tries to be backwards compatible). But hey, it's not even half an hour since I started, so let's spend some time rewriting the upstart configuration for systemd, so we're future-proof (and learn something new).

To be honest, I was a bit worried. There's a lot of hate for systemd out there and I had little to no previous experience with it. But I have to say, the configuration is very short and elegant:

[Unit]
Description=Plex in Docker
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker stop plex
ExecStartPre=-/usr/bin/docker rm plex
ExecStart=/usr/bin/docker run --rm -t \
  -v /data/media/:/media/ \
  -v /data/system/plex/:/config/ \
  --net host \
  --name plex plex
ExecStop=-/usr/bin/docker stop -t 3 plex
ExecStop=-/usr/bin/docker rm plex
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

The way it works is that it always tries to clean up (stop, remove) any leftover container (perhaps after an unexpected shutdown) and then starts a new one with the proper configuration. On stop, it also removes the container (in case --rm failed to do that for some reason). A dash at the beginning of a command tells systemd to ignore its failure, so we're really running most of these just in case - they are expected to fail most of the time.
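The unit still has to be registered with systemd before any of that kicks in. A minimal sketch of that step, assuming the unit is saved as plex.service and goes into /etc/systemd/system (the usual place for locally administered units); the file name and the install_unit helper are made up for this example:

```shell
#!/bin/sh
# Sketch: register a unit file with systemd and start it.
# Assumptions: the unit file name (e.g. plex.service) and the helper name
# install_unit are illustrative, not taken from the actual setup.

install_unit() {
  unit_file="$1"                          # path to the unit file, e.g. ./plex.service
  unit_name="$(basename "$unit_file")"

  sudo cp "$unit_file" "/etc/systemd/system/$unit_name" || return 1
  sudo systemctl daemon-reload || return 1        # make systemd re-read unit files
  sudo systemctl enable "$unit_name" || return 1  # start on boot (WantedBy=multi-user.target)
  sudo systemctl start "$unit_name"               # start it right now
}

# Usage (not run automatically):
#   install_unit ./plex.service
#   systemctl status plex.service
#   journalctl -u plex.service
```

With several services (Plex, Samba, Graphite, ...) a helper like this saves retyping the same four commands for each unit. systemctl status and journalctl -u are the quickest way to see whether the container actually came up.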

And we're done

In the end, about an hour later, I have a fresh new OS (now using the distribution-provided ZFS and Docker) with all the services running. And that includes the time I spent learning enough about systemd to migrate from upstart. Partially that's thanks to the great work the Ubuntu guys did - it's been a great experience, even with the very recently added stuff like ZFS. On the other hand, having the services in Docker, with the configuration in init scripts and the data on separate volumes, made the whole upgrade a non-event. Ansible was just a small automation cherry on top.

For a small home NAS, that's really not bad. It turns out that making some good design decisions in the past can pay off a lot in the future. I still have plans to automate this stuff even further with Ansible, so hopefully next time I'll have my tea while the whole thing rebuilds.

10/10 would reinstall the whole OS again.