au QuaStation - Long Term Reliability Considerations

When setting up a QuaStation, I basically broke the storage into three sections:

1. The kernel and related files (files loaded directly by the bootloader); I call these the "boot" files.  

2. System files used by Linux, mounted on /, so we call this the "root" filesystem.  

3. User data, i.e. the data we want to store or back up that has nothing to do with the system itself; in my case this is mounted on /home, so I just call it "user data" or "non-system data".  

It is already considered best practice by many people to separate end user data from system data, so it is fairly common to see separate partitions for / and /home even on desktop systems.  

Where the QuaStation is a bit special is that it isn't easy to load a kernel from an ext4 filesystem, and it isn't possible to load a kernel from a SATA device at all.  

This basically means that even if we wanted to merge the root and user filesystems, we would still need to handle the Linux kernel and other boot files separately.  

Theoretically, we could handle all of these files on a device other than the SATA device, but since the SATA device will almost certainly have the majority of the storage capacity, it would be silly to ignore that storage.  

The kernel can be booted from one of three devices:

a. Internal flash memory (eMMC) - This is how the stock configuration works

b. External MMC/SD Card - This is also possible.

c. USB Memory - This seems to be mainly for emergency recovery purposes, but works well enough.  

A possible reason to use the eMMC to store the kernel is that it doesn't require any additional hardware (or cost), and is resistant to physical tampering.  Disadvantages include the inability to easily swap out the kernel (particularly if the machine won't boot), and the fact that the U-boot boot loader is also stored on the eMMC, so a typo could brick the machine.  

Storing the kernel on an SD Card or USB drive makes it easier to take out the drive for examination or updating the kernel on another machine, so that is what I decided to do.

Booting from USB is slightly easier, and in general prices were slightly lower for USB devices, so I decided to stick with USB.  

Moving on to the root filesystem, there are several possible approaches:

1. Store on an external USB disk or SD card (which could be the same device that houses the kernel, or a separate one)

2. Store on the SATA device, on a separate partition from the user data

3. Store on the SATA device, on the same partition as the user data

Option 3 isn't great since it mixes the user data and operating system, which has numerous disadvantages.  The sole advantage is that it allows us to have the OS on the hard drive without reserving space on a separate partition that then sits unused.  For example, if we set up a 64GB root partition in order to leave room for future expansion, but we only actually use 32GB, then we are wasting the remaining 32GB.  Note that this limitation can be alleviated with BTRFS, since subvolumes, which share a single pool of free space, can be used instead of partitions.  
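A minimal sketch of that BTRFS approach, assuming the SATA drive appears as /dev/sda with a single partition (the device name and subvolume names are just examples):

    mkfs.btrfs /dev/sda1
    mount /dev/sda1 /mnt
    btrfs subvolume create /mnt/@          # would hold the root filesystem
    btrfs subvolume create /mnt/@home      # would hold the user data
    # both subvolumes draw on the same pool of free space, so no space
    # sits reserved-but-unused the way it would with fixed partitions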

Option 2 is great in some ways, but it decreases the amount of space left for storage.  A drive sold as 1TB actually holds only about 931 GiB to start with, once you convert from decimal to binary units, and then once you account for the metadata of the filesystem itself, you are left with even less space.  If you set up another partition for the OS, then you are left with even less usable space for user data.  For example, on one of the earliest machines I set up, I partitioned the drive with 16GB for the root filesystem and created a user partition for the rest of the data.  This left only about 916 GiB of space for actual user data.  
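The arithmetic, for anyone who wants to check it (integer shell arithmetic, so results are rounded down):

    $ echo $((10**12 / 2**30))       # a "1TB" drive expressed in GiB
    931
    $ echo $((16 * 10**9 / 2**30))   # the 16GB root partition in GiB
    14

931 GiB minus the roughly 15 GiB root partition, minus a bit of filesystem metadata, leaves about the 916 GiB I observed.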

Another major drawback is that hard drives eventually die.  If the operating system is on the hard drive, then it will be lost too.  My idea is that when the hard drive dies, I should be able to swap it out for a new one, and just copy the user data back from another machine.  Specifically, if the drive dies, I want to put in a new one, connect to the machine, format the drive, and sync the folders from other machines to restore the node.  
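In concrete terms, swapping the drive might look something like this (a sketch, assuming the new HDD enumerates as /dev/sda; verify with lsblk before running anything destructive):

    parted -s /dev/sda mklabel gpt mkpart primary ext4 0% 100%
    mkfs.ext4 /dev/sda1       # fresh filesystem for the user data
    mount /dev/sda1 /home     # then let the sync software pull the
                              # folders back from the other machines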

So I was left to conclude that Option 1 was the best option.  Have the OS root partition on an external device, so that failure of the HDD wouldn't negatively affect it.   We all know that HDDs fail more often than flash memory because HDDs are physical devices with moving parts - right?  Well, not so fast.  

Based on the considerations above, most of the machines I set up were set to boot the kernel from USB, and most of the recent ones were also set up to mount the root filesystem from USB as well, and then mount the /home folder from the SATA device.  Part of this was based on the fact that even the very cheapest USB drive cost something, and for only a very modest cost increase I could have a drive large enough to comfortably host the root filesystem.  If any issue occurred, theoretically I could just pop the USB drive out, mount it on my normal PC and diagnose it from there.  

I set up the first QuaStation machines in 2021, and some of them started failing mysteriously.  I discovered that one of the reasons was the unreliability of Avahi.  Ideally, the machines would get an IP address via DHCP, and Avahi would then advertise the hostname on the network, so for example, QuaStation1 would get an address like 192.168.1.56 and advertise itself on the network as quastation1.local.  

When quastation1.local would fail to resolve, I knew I had a problem.  Interestingly enough, ResilioSync would still often list the machine as active on the network, and sure enough, pings and SSH connections to that IP would usually work.  

I solved this using two methods:

1. I hard coded the IP address of each machine.  This removed the dependency on DHCP, and meant that I would always know the IP address of each machine, even if ResilioSync was down for some reason.

2. I created a cron job to restart the Avahi daemon once per day.  (Both fixes are sketched below.)
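A sketch of both fixes, assuming a Debian-style system with ifupdown and systemd; the interface name is an assumption, and the address reuses the example from above:

    # /etc/network/interfaces -- hard-coded address instead of DHCP (#1)
    auto eth0
    iface eth0 inet static
        address 192.168.1.56
        netmask 255.255.255.0
        gateway 192.168.1.1

    # root's crontab (edit with crontab -e) -- restart Avahi daily at 04:00 (#2)
    0 4 * * * systemctl restart avahi-daemon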

Strangely, though, occasionally there were machines that would respond to ping requests, but would fail SSH connections.  Power cycling these machines would sometimes fix the problem, but would result in them not even responding to pings anymore.  

Since I couldn't connect to the serial console, I had no idea what might be happening.  

Examining the USB drives on another machine might not reveal anything unusual, so I would be forced to remove the machines from service, disassemble them, and connect to the serial port to debug the issues.  Hardly ideal.

It turned out, in at least one case, that the boot drive (USB drive) was totally dead.  Watching the serial console showed that the kernel file was considered invalid data.  Closer inspection showed that the kernel was never actually loaded into memory to begin with.  Trying to mount the USB drive on my PC showed that while the device did register with the computer, attempts to read it resulted in it disconnecting from the USB bus.  Even trying to use cfdisk to get a list of partitions did not work.  Of course this was never going to boot!  
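For anyone checking a suspect drive, this is roughly the triage involved (assuming the drive enumerates as /dev/sdb; check with lsblk first):

    lsblk                                         # does the drive enumerate at all?
    dmesg -w                                      # watch for disconnect/reset messages...
    dd if=/dev/sdb of=/dev/null bs=1M count=10    # ...while forcing a small read
    cfdisk /dev/sdb                               # can the partition table even be read?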

The short-term fix, then, was simply to grab a new USB disk, partition it, and then copy the kernel and root filesystem from a working machine.  With this, I could insert it into the QuaStation and boot it up without issues.  Once it was booted up, I changed the networking and host name, regenerated the SSH keys, reset the ResilioSync configuration, etc.  The problem, of course, is that this is only a temporary fix that lasts until the new USB drive dies too.  
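The rebuild itself looked roughly like this (a sketch; the device names, mount points, hostname, and the FAT kernel partition reflect assumptions about my own layout rather than anything the hardware dictates):

    # new USB drive assumed to enumerate as /dev/sdb -- verify with lsblk first
    fdisk /dev/sdb                    # recreate the partition layout (kernel + root)
    mkfs.vfat /dev/sdb1               # kernel partition, readable by U-Boot
    mkfs.ext4 /dev/sdb2               # root partition
    mkdir -p /mnt/boot /mnt/root
    mount /dev/sdb1 /mnt/boot && mount /dev/sdb2 /mnt/root
    # copy the kernel files and root filesystem over from a working machine
    rsync -aHAX --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/home \
        root@working-quastation:/ /mnt/root/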

So why did the drive die?

Well, let's look at how the stock system works.  I haven't done a huge amount of digging, but I do know that the kernel is booted from the eMMC.  This shouldn't cause much wear and tear, because it is only read, never written, and only read once per boot.  

The situation with the root filesystem is less clear, but it seems to be run from the hard drive.  It would appear that the stock firmware mounts the eMMC read-only, rather than doing what a normal distribution might and mounting the root from the eMMC in full read/write mode and running from there.  Some persistent data is stored, probably on the hard disk.  

This is a reasonable design, as it means that the eMMC won't wear out, and when the hard drive dies, it can simply be replaced by a new one.  

A stock Linux server distribution will write to the root device, though, perhaps quite often.  Even if swap is not used, system logs and such are written frequently.  I see lots of messages in dmesg about rtc and ethernet, among other things.  Also, resilio-sync stores key data, etc. outside of the home directory.  

So, much of this activity could be removed with a bit of tinkering, by filtering what makes it into the system log and moving the resilio-sync state onto the HDD.  Even so, there is still the reality that a USB flash drive is not designed to be written to constantly, 24 hours a day, 365 days a year, for years at a time.  The typical 10,000 write cycle lifetime we hear for SSDs sounds pretty good, until you realize that for the drive to survive even one year of that, 10,000 cycles divided by 365 days comes out to just over 27 full write cycles per day.  That's less than 2 writes per hour.  Also, while I am not able to find any information about the write endurance of common USB drives or SD cards, I have to assume it is typically much lower than that of SATA SSDs with advanced wear leveling and other tricks.  
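The same back-of-the-envelope check in shell:

    $ echo $((10000 / 365))   # write cycles per day if the drive must last one year
    27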

All of this conspires to mean that I should be surprised the drives lasted as long as they did.  There are a few solutions I am considering:

1. Continue to mount the root folder from USB, but mount it read only.  Mount /var on the SATA device.  

2. Disable swap

3. Filter the system log (this is already done on some machines)

4. Convert all of the machines to load root from the HDD

5. Use an SD card instead of USB flash drives to host the root filesystem.  Theoretically, SD cards are still flash memory and would have the same issues; however, "high endurance" MicroSD cards are available, which would mitigate the issue.  

6. Make sure "noatime" is used.  

7. Configure ResilioSync to store state data on the HDD

I am currently considering a combination of #2, #3, #5, #6, and #7 above.  Lowering the number of writes dramatically while also using a device meant to handle more writes should improve reliability.  
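A sketch of what that combination might look like, where the device names, paths, and journald settings are assumptions for illustration rather than a tested recipe:

    # /etc/fstab -- root on the MicroSD (#5), noatime everywhere (#6),
    # and simply no swap entry at all (#2)
    /dev/mmcblk0p2  /      ext4  defaults,noatime  0  1
    /dev/sda1       /home  ext4  defaults,noatime  0  2

    # /etc/systemd/journald.conf -- keep the journal in RAM, cap its size,
    # and only store warnings and above (#3)
    [Journal]
    Storage=volatile
    RuntimeMaxUse=20M
    MaxLevelStore=warning

    # ResilioSync config fragment -- keep its state database on the HDD (#7)
    "storage_path" : "/home/.sync",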

Another potential option would be to use a linux distribution which is designed to be mounted read-only, like the "Live" demo distributions which can be run from CD-ROM, etc.  At the end of the day, though, state data is needed, and that has to be stored somewhere.  

While High Endurance cards are more expensive than normal MicroSD cards, given that the required capacity is quite low, it should not be a significant cost.  

Anyone looking to minimize cost could simply burn the kernel into the eMMC, and put the root on the internal HDD, so that no additional hardware is required.  

In my case, one consideration is that for the machines where /home is mounted on the HDD with just a single ext4 partition, there would be no easy and risk-free way to add a new partition to hold the root filesystem anyway.  

Also note that since the initial root device is specified in the kernel command line, which is stored in the U-Boot environment variables, changing it requires either disassembling the machine to reach the serial console or updating the eMMC from Linux.  
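From a running Linux system, this can be done with the fw_printenv and fw_setenv utilities from u-boot-tools, provided /etc/fw_env.config points at wherever the environment lives on the eMMC.  The offsets in that file, and the bootargs contents shown here, are device-specific assumptions:

    fw_printenv bootargs    # inspect the current kernel command line
    # e.g. point root at the second partition of the SD card instead
    fw_setenv bootargs "console=ttyS0,115200 root=/dev/mmcblk0p2 rootwait"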

Although I am selecting MicroSD here based mainly on the fact that I haven't heard of "High Endurance USB Flash Drives", I assume the USB drives marketed as SSDs rather than "flash memory" are in fact more sophisticated and would last longer.  They are also larger and more expensive than necessary for my purposes.  

An unintended advantage of using MicroSD instead of USB flash for the root filesystem is that, assuming we also load the kernel from MicroSD, the USB port will be freed up for external hard drives, additional network dongles, etc.  
