Building the Ultimate Linux Home Server - Part 1: Intro, MergerFS, and SnapRAID

Jul 24, 2021 · Guide · Self-Hosting · Linux · 10 min. read

ℹ️ Disclaimer

This guide was written in 2021 and reflects my setup and recommendations at the time. Some tools, software versions, and best practices may have changed since then. Consider checking more recent resources alongside this post.

A couple of months ago, while wasting time browsing Reddit, I discovered r/homelab. After looking at the top posts for a couple of weeks, I decided that I also wanted to join in on the action and build my own self-hosted Linux Server.

I soon ran into a couple of issues. You see, as a university student without a full-time job, I really didn’t have any disposable income to spend on expensive server racks, network equipment, and drive arrays.

I did, however, have my old desktop which was collecting dust under my table after I had replaced it with a new laptop, so I decided to experiment and try to use that as my server.

Some of the requirements I wanted this server to fulfil were the following:


Hardware #

The PC I used by no means contains server-grade hardware, but since I had originally built it as a gaming computer, it did the job well enough. Its specs are as follows:

I could use the system as-is, but I decided to turn off the dedicated GPU in the BIOS since it didn’t have any real purpose, and I didn’t want to use as much power as a small city.

As far as storage goes, I installed the operating system on the SSD and used a bunch of good-ol’ spinning rust hard drives for everything else since most files would remain unchanged.

Operating System #

There is a lot of debate about which Linux distro is best for a server. Most people would recommend something like Ubuntu Server or Debian because of their stability; however, I decided to use Arch (btw) for a couple of very specific reasons:

Despite that, most of what I will talk about is the same or similar regardless of which distro you decide to use.

The OS installation itself is beyond the scope of this article since there are lots of better-written guides that explain the process (The Arch Wiki, for example).

MergerFS #

There are definitely lots of different ways to go about setting up the storage system for your server. You could create a hardware RAID array, try out ZFS, or simply mount all your drives in your home directory and manage them manually.

However, none of these solutions is nearly as flexible, inexpensive, and easy to use as MergerFS.

In the project's own words, "mergerfs is a union filesystem geared towards simplifying storage and management of files across numerous commodity storage devices. It is similar to mhddfs, unionfs, and aufs."

In layman’s terms, MergerFS allows you to combine drives of different sizes and speeds into a single mountpoint, automatically managing how files are stored in the background.

1TB       +      2TB      =       3TB
/disk1           /disk2           /merged
|                |                |
+-- /dir1        +-- /dir1        +-- /dir1
|   |            |   |            |   |
|   +-- file1    |   +-- file2    |   +-- file1
|                |   +-- file3    |   +-- file2
+-- /dir2        |                |   +-- file3
|   |            +-- /dir3        |
|   +-- file4        |            +-- /dir2
|                     +-- file5   |   |
+-- file6                         |   +-- file4
                                  |
                                  +-- /dir3
                                  |   |
                                  |   +-- file5
                                  |
                                  +-- file6

Of course, there is a minor performance overhead when using this approach, but as far as home servers are concerned, the advantages outweigh the disadvantages.

My Setup #

For my system I have 3 drives in total formatted as ext4, excluding the boot drive:

The three main storage drives are then pooled into a new directory called /mnt/storage where all my files can be accessed from. /mnt/storage contains the following subdirectories:

Installation #

Getting MergerFS up and running is pretty simple since you just need to install it using your preferred package manager, edit your fstab file, and reboot.

yay -S mergerfs

After installing it, you need to find the stable disk IDs, since the mapping of a device node (such as /dev/sdb) to a physical drive is not guaranteed to stay the same across reboots, even with the same hardware and configuration. Listing /dev/disk/by-id will return something similar to this:

ls /dev/disk/by-id
ata-Optiarc_DVD_RW_AD-5240S                          wwn-0x50014ee20d526ebe
ata-Samsung_SSD_860_EVO_250GB_S3YJNB0K512940F        wwn-0x50014ee20d526ebe-part1
ata-Samsung_SSD_860_EVO_250GB_S3YJNB0K512940F-part1  wwn-0x50014ee21329b6cf
ata-Samsung_SSD_860_EVO_250GB_S3YJNB0K512940F-part2  wwn-0x50014ee21329b6cf-part1
ata-WDC_WD10EZRZ-00HTKB0_WD-WCC4J2AXKT3R             wwn-0x50014ee2bcff024d
ata-WDC_WD10EZRZ-00HTKB0_WD-WCC4J2AXKT3R-part1       wwn-0x50014ee2bcff024d-part1
ata-WDC_WD40EFAX-68JH4N0_WD-WX12D80N59SR             wwn-0x5002538e403be893
ata-WDC_WD40EFAX-68JH4N0_WD-WX12D80N59SR-part1       wwn-0x5002538e403be893-part1
ata-WDC_WD40EFAX-68JH4N0_WD-WX52D104L73P             wwn-0x5002538e403be893-part2
ata-WDC_WD40EFAX-68JH4N0_WD-WX52D104L73P-part1

What we are interested in are the lines containing the IDs of the partitions themselves instead of the entire drives (the ata-xxx-part1 lines), so in this case:

ata-WDC_WD10EZRZ-00HTKB0_WD-WCC4J2AXKT3R-part1
ata-WDC_WD40EFAX-68JH4N0_WD-WX12D80N59SR-part1
ata-WDC_WD40EFAX-68JH4N0_WD-WX52D104L73P-part1

Next, edit your fstab file, mount the partitions (including the parity drive) and create the MergerFS pool.

...
# hard drives
/dev/disk/by-id/ata-WDC_WD10EZRZ-00HTKB0_WD-WCC4J2AXKT3R-part1 /mnt/disk1 	ext4 defaults 0 0
/dev/disk/by-id/ata-WDC_WD40EFAX-68JH4N0_WD-WX52D104L73P-part1 /mnt/disk2 	ext4 defaults 0 0
/dev/disk/by-id/ata-WDC_WD40EFAX-68JH4N0_WD-WX12D80N59SR-part1 /mnt/parity1 	ext4 defaults 0 0

# mergerfs
/mnt/disk* /mnt/storage fuse.mergerfs defaults,dropcacheonclose=true,allow_other,minfreespace=25G,fsname=mergerfs 0 0
...

You can find a full list of options for your storage pool in the official MergerFS documentation on GitHub.
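One option worth knowing about is the create policy, which decides which drive receives new files. The default (epmfs) prefers drives that already contain the parent path; if you would rather always write to the drive with the most free space, you could add category.create=mfs. This is purely an example of an optional tweak, not something the setup above requires:

```
/mnt/disk* /mnt/storage fuse.mergerfs defaults,dropcacheonclose=true,allow_other,minfreespace=25G,category.create=mfs,fsname=mergerfs 0 0
```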

After editing your fstab file, save and reboot (or run mount -a as root to mount the new entries straight away). If everything went well, you should be able to create a file in /mnt/storage and see that it was actually stored in one of the /mnt/diskX directories.

SnapRAID #

We have now finished setting up our storage pool, but what happens when one of our drives inevitably fails? This is where SnapRAID comes into play. Remember the parity drive we left unused until now? Well, this drive won't actually store any of your data; instead, it will hold parity information used to recover your files if any disk dies.
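To make the recovery path concrete: if a data disk dies, you replace it, mount the new (empty) drive at the same path, and let SnapRAID rebuild it from parity. A rough sketch for disk d1 (run as root; double-check the flags against the SnapRAID manual for your version):

```shell
# Rebuild the contents of data disk d1 onto its empty replacement,
# using the parity stored on /mnt/parity1
snapraid -d d1 -l /var/log/snapraid-fix.log fix
# Afterwards, verify the recovered files against their stored hashes
snapraid -d d1 check
```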

Keep in mind that using SnapRAID has a couple of caveats:

Installation #

yay -S snapraid

Next, create/edit your SnapRAID configuration file:

# Defines the file to use as parity storage
# It must NOT be in a data disk
parity /mnt/parity1/snapraid.parity

# Defines the files to use as content list
# You can use multiple specifications to store more copies.
# You must have at least one copy for each parity file, plus one.
# They can be in the disks used for data, parity or boot,
# but each file must be in a different disk.
content /var/snapraid.content
content /mnt/parity1/.snapraid.content
content /mnt/disk1/.snapraid.content
content /mnt/disk2/.snapraid.content

# Defines the data disks to use
# The order is relevant for parity, do not change it
disk d1 /mnt/disk1
disk d2 /mnt/disk2

# Excludes hidden files and directories (uncomment to enable).
#nohidden

# Defines files and directories to exclude
# Remember that all the paths are relative to the mount points
# Format: "exclude FILE"
# Format: "exclude DIR/"
# Format: "exclude /PATH/FILE"
# Format: "exclude /PATH/DIR/"
exclude /lost+found/

# You might also want to exclude log files or temporary DB files,
# since these change frequently and can mess with the parity file.
exclude *wal

After editing /etc/snapraid.conf, try running snapraid sync as root to check if everything is configured correctly. Just keep in mind that this first sync could take a long time depending on the size of your drives.
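Once the first sync finishes, a couple of read-only commands are handy for day-to-day checks; a quick sketch (see the SnapRAID manual for the full details):

```shell
snapraid diff     # preview what was added/removed/moved since the last sync
snapraid status   # summary of the array: file counts, parity use, scrub age
```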

Automation #

Since one of the requirements at the start of the article was automated backups, we are going to use snapraid-runner to run a parity sync once every week:

git clone https://github.com/Chronial/snapraid-runner.git /opt/snapraid-runner

After that, create your configuration file at /etc/snapraid-runner.conf (just make sure to fill in your email settings):

[snapraid]
; path to the snapraid executable (e.g. /bin/snapraid)
executable = /usr/bin/snapraid
; path to the snapraid config to be used
config = /etc/snapraid.conf
; abort operation if there are more deletes than this, set to -1 to disable
deletethreshold = -1
; set to true to run touch before each sync
touch = false

[logging]
; logfile to write to, leave empty to disable
file = /var/log/snapraid.log
; maximum logfile size in KiB, leave empty for infinite
maxsize = 5000

[email]
; when to send an email, comma-separated list of [success, error]
sendon = success,error
; set to false to get the full program output via email
short = true
subject = [SnapRAID] Status Report
from = {fill in}
to = {fill in}
; maximum email size in KiB
maxsize = 500

[smtp]
host = {fill in}
; leave empty for default port
port = {fill in}
; set to "true" to activate
ssl = {fill in}
tls = {fill in}
user = {fill in}
password = {fill in}

[scrub]
; set to true to run scrub after sync
enabled = true
percentage = 22
older-than = 12

We are then going to use cron to call snapraid-runner once every week, specifically at 12:00 every Sunday. Open the root crontab with sudo crontab -e and add the following line:

...
0 12 * * 0 python /opt/snapraid-runner/snapraid-runner.py --conf /etc/snapraid-runner.conf
...

After saving the crontab file, snapraid-runner will automatically update your parity and email you a status report every week!

Final Thoughts #

Just by installing and configuring these two tools, we have managed to satisfy the first 4 requirements for our home server. We could stop right here and be good to go. However, there are a couple of things I strongly recommend doing before starting to host any services and exposing your server to the public:

In the next part, we are going to be setting up Docker and Portainer for container management, Watchtower for automatic container updates, and OpenVPN for remote server management.