Here at DiUS, we often get brought on board to help productise “IoT” (that’s Internet of Things, if you have somehow managed to escape the acronym) devices, and depending on the domain it’s often a Linux-based device. Something we commonly see is that developers who enter the embedded Linux space from the server or desktop direction carry over patterns from there out of habit.
While server and desktop Linux distributions are well suited to their respective areas, they’re designed around assumptions that don’t hold well in the embedded space. The trade-offs made in environments that have an active user presence are not well matched to the needs of an unattended embedded device. For embedded devices, high resilience and automatic recovery are typically top priorities, as there is no user around to intervene and resolve issues. As such, the best practice patterns for embedded Linux systems more closely match those used in microcontroller environments than those used in server/desktop environments.
The root filesystem
In most cases, embedded Linux devices have to cope with being unpowered unceremoniously and at inconvenient times. This means that buffered filesystem writes might be lost and, if you are really unlucky, the filesystem(s) end up corrupted. The safest way to avoid this is, of course, not to write in the first place. The key tenet with embedded Linux systems is that the root filesystem is read-only. This not only side-steps the issue of filesystem corruption, but also enables a host of other features.
One such feature is integrity verification – the entire OS can be validated at boot and runtime, which guards against both accidental changes as well as malicious attacks. If the hardware supports a secure boot scheme (e.g. ARM Secure Boot) it’s also possible to establish a full chain-of-trust using cryptographic signatures, preventing any unauthorised code from running on the device. This can be invaluable for security sensitive devices and is only possible as long as the root filesystem is read-only, as any write to it would break that integrity.
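As a rough illustration of the simplest form of integrity verification, the sketch below hashes a whole root filesystem image and compares it against a reference digest. It is only a minimal sketch under stated assumptions: the function names are hypothetical, the reference digest is assumed to come from a signed and trusted source, and production systems typically rely on block-level kernel mechanisms rather than a userspace whole-image hash.

```python
import hashlib
import hmac


def image_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file or block device at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # Read in chunks so a large image never has to fit in RAM.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_intact(path, expected_hex):
    """Hypothetical helper: True if the image matches the reference digest.

    The reference digest is assumed to have been obtained from a trusted,
    signed source; this function does not verify any signature itself.
    """
    return hmac.compare_digest(image_digest(path), expected_hex.lower())
```

A check like this only makes sense because the root filesystem never changes after it is written; a single write to it would change the digest.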
A related feature is that some settings can thus become immutable within a firmware version. This can be useful for many things, including security settings such as disallowing password-based logins.
The other major feature that follows on from using a read-only root filesystem is that of atomic upgrades. On a server or desktop distribution you would perform incremental updates, package by package. On an embedded Linux device, you want to avoid that at all costs. With incremental upgrades the system can be left in untested (and unstable) states if there are interruptions during the upgrade process, and in the worst case, may be left inoperable and unrecoverable. If you have ever rendered your system unbootable after upgrading the libc package, you know what I’m talking about. More on upgrades later.
Naturally, with a read-only root filesystem you also greatly reduce the testing scope for the system – for any given firmware version you know precisely which components are included, and you also know that only settings which you explicitly have made configurable may change. The savings in time and effort in this area should not be underestimated, and it also means it is easier to more comprehensively test and thus raise the bar on quality.
Keeping configuration data safe
Once your root filesystem is read-only, you obviously need somewhere else to keep configuration data. For that, a dedicated small read-write partition is used for all configuration data. This partition only sees a limited amount of write activity, and thus has a low risk of encountering filesystem corruption. Coupled with the standard pattern of “write updated settings to new temporary file, and then atomically move the new file over the old one”, you achieve near-guaranteed safety of your configuration data. If possible, using a journaled filesystem further helps keep your bits safe, and if you don’t need to know the last access time of files, mount using the “noatime” option to further reduce writes to this filesystem. It should be stressed that this filesystem is only for configuration data. Resist the temptation to place more frequently updated files on this filesystem, or you will be reducing your fault resilience needlessly.
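The “write to a temporary file, then atomically move it over the old one” pattern can be sketched as follows. This is a minimal sketch, assuming a Linux target and settings stored as JSON; the function name and the JSON format are illustrative assumptions, not something the pattern prescribes.

```python
import json
import os
import tempfile


def write_config_atomically(path, settings):
    """Replace the file at `path` with `settings` (hypothetical helper).

    Readers see either the complete old file or the complete new file,
    never a half-written one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(settings, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new contents to storage first
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Leave the old file untouched and remove the partial temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself survives a power cut.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The two `fsync` calls are what make the ordering dependable: the data reaches storage before the rename, and the rename reaches storage before the function returns.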
Unless you are using hardware with encrypted storage facilities, this partition is typically also where any device specific credentials will be installed as part of the end-of-line process in the factory.
Storing transient data
In addition to your mutable configuration data, you will undoubtedly also be handling data of a more transient nature. This includes any and all samples and telemetry the device might be processing, and any data which is subject to store-and-forward handling. All transient data is kept in yet another dedicated partition. This partition sees a lot of write activity and as such is the most likely to encounter failures. Keeping this away from the actual OS files and configuration data is key, as even a catastrophic loss of data on this partition would not render the system inoperable.
In the case of this filesystem ever becoming unrecoverably corrupted, the system has the option of treating that data as expendable, and can as a last-ditch recovery method wipe the partition and lay down a fresh filesystem. That way the system has a path to self-recovery even under what might otherwise be a dead-end for it. This recovery approach would of course be disastrous if applied to a filesystem containing critical configuration data, which is another reason to ensure configuration data is partitioned off into its own partition.
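The last-ditch recovery path described above can be sketched as a small decision function. This is only a sketch under stated assumptions: `try_mount` and `wipe_and_format` are hypothetical callables supplied by the caller (in a real system they would wrap the platform's mount, repair and mkfs tooling), and the returned status strings are made up for illustration.

```python
def bring_up_data_partition(try_mount, wipe_and_format):
    """Mount the transient-data partition, wiping it if it cannot be used.

    try_mount:       hypothetical callable returning True on a successful
                     mount (assumed to include any repair attempt).
    wipe_and_format: hypothetical callable that lays down a fresh filesystem.
    """
    if try_mount():
        return "mounted"
    # Transient data is expendable, so wiping is an acceptable last resort.
    # This must never be applied to the configuration partition.
    wipe_and_format()
    if try_mount():
        return "recovered"
    # Even a fresh filesystem will not mount: likely a hardware fault.
    return "failed"
```

Passing the operations in as callables keeps the decision logic testable without touching real storage.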
Firmware upgrade basics
So far we have discussed three partitions/filesystems: the root filesystem, the configuration data filesystem, and the transient data filesystem. If we return to the discussion about firmware upgrades, it will become apparent that we need a fourth partition as well. The way firmware upgrades are handled is through an A/B partition approach, also known as “flip-flop” or “active-inactive”. At any given time, one of two root partitions is marked as the active one and is used for booting the device. When performing a firmware upgrade, the new firmware is downloaded to the inactive root partition. Once downloaded and verified, that partition is then marked as ready-to-test and the device rebooted.
Glossing over the details of the necessary boot-loader logic, the new firmware gets selected and is booted once. It is then up to this firmware to perform whatever sanity checks are needed before either declaring itself stable and marking itself as valid to boot again, or declaring itself unfit for duty and rolling back to the previously running firmware instead. Typically there is a time limit imposed on a new version to make the decision, so that the absence of an affirmative decision is treated as a decision in the negative and thus triggers a roll-back. That timeout may or may not be implemented using a hardware watchdog timer. For extra resiliency the boot-loader would set up a hardware watchdog before handing over control to the kernel, so that in the unlikely event of the new firmware failing to even boot, control is returned to the boot-loader via a watchdog-induced hardware reset.
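The boot-once-then-confirm logic can be modelled as a tiny state machine. The sketch below is an illustrative model only, not the logic of any particular boot-loader: the `BootState` fields and function names are hypothetical, and in a real device this state would live in persistent storage that both the boot-loader and the firmware can access.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BootState:
    active: str = "A"            # slot known to be good
    trial: Optional[str] = None  # slot marked ready-to-test, if any
    trial_booted: bool = False   # True once the trial slot has had its one boot


def other_slot(slot):
    return "B" if slot == "A" else "A"


def stage_upgrade(state):
    """Called once new firmware is written to the inactive slot and verified."""
    state.trial = other_slot(state.active)
    state.trial_booted = False


def select_slot(state):
    """Boot-loader decision, run on every boot or watchdog reset."""
    if state.trial is not None and not state.trial_booted:
        state.trial_booted = True  # record the attempt before handing over
        return state.trial
    # Either nothing is pending, or the trial slot already had its single
    # boot without confirming itself: fall back to the known-good slot.
    state.trial = None
    state.trial_booted = False
    return state.active


def confirm_trial(state):
    """Called by the new firmware after its sanity checks have passed."""
    if state.trial is not None:
        state.active = state.trial
        state.trial = None
        state.trial_booted = False
```

For example, after `stage_upgrade`, the first `select_slot` call picks the new slot; if the device resets before `confirm_trial` runs, the next `select_slot` call returns the old slot, which is the roll-back case.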
Note that since there are independent root partitions for each of the firmware versions, the old version is kept safely intact and can be reverted to at a moment’s notice. This is how you easily achieve atomic upgrades in an embedded Linux environment. It is precisely the same approach as is used in the microcontroller world. There are of course variations to the scheme, such as keeping a third root partition for the factory “golden image” in case the device needs the ability for a complete reset-to-factory-state, but the common practice is the simple A/B solution. The basic partition layout of an embedded Linux device thus looks like:
+----------+----------+----------+--------------------------------+
|  config  |  root-A  |  root-B  |              data              |
|   e.g.   |   e.g.   |   e.g.   |              e.g.              |
|   50MB   |  250MB   |  250MB   |             7400MB             |
+----------+----------+----------+--------------------------------+
The partition sizes above are vague generalisations, of course, but are indicative of real-world examples. The configuration partition rarely hosts more than a megabyte of data and the root filesystems are counted in hundreds (or maybe even just tens) of megabytes compared to a server or desktop distribution where it’d be counted in gigabytes. The partition for the transient data storage generally takes up whatever space is left.
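The “whatever space is left” arithmetic for the data partition is simple enough to sketch. The defaults below are just the example sizes from the diagram, and the 7950MB total used in the usage note is an assumption obtained by summing those example figures; the function name is hypothetical.

```python
def data_partition_mb(total_mb, config_mb=50, root_mb=250):
    """Size left for transient data after config and two root partitions."""
    reserved = config_mb + 2 * root_mb  # root-A and root-B are the same size
    if reserved > total_mb:
        raise ValueError("storage too small for config and both root slots")
    return total_mb - reserved
```

With the example figures, `data_partition_mb(7950)` gives 7400, matching the diagram.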
If you are wondering how a Linux system could possibly be so small, keep in mind that an embedded device is not a general purpose device, and as such much/most of the software included on a server or desktop is not needed. This brings us to the next important topic – that of building the firmware images in the first place.
Firmware image building
Best practice is to build the whole firmware from source and generate a root filesystem image. This offers several benefits:
- Total version control: no minor package differences to deal with depending on when a package was installed, which helps reduce testing complexity.
- Complete visibility of all code running on the device. In environments with strict audit requirements this can be a must.
- Ability to easily patch OS libraries and executables, e.g. to apply security fixes before an upstream vendor gets around to it.
- Only a minimal OS needs to be built, making over-the-air upgrades faster and cheaper, and also creating a much smaller attack surface for malware and such.
- No reliance on upstream package servers (older package versions are frequently rendered unavailable, in our experience).
- When kept under version control, a tagged version can be used to reproduce the build results, and a single version number identifies a complete firmware.
These benefits do come at a cost: the build time is significant. A typical clean build takes ~20min on a modern desktop. It’s a trade-off that is well worth it however, and during regular development incremental builds can be used where only the changed component/library is rebuilt. What we tend to recommend is to use mostly incremental builds locally, but to configure the build pipeline to always do clean builds to ensure there is no cruft building up accidentally. The build pipeline thus generates reproducible results every time. Provided you also cache the source packages used, you can go back and rebuild any previous release.
There are a few build systems available for generating embedded Linux OS images; the two most common ones are Buildroot and Yocto. At DiUS, we’re longtime users and proponents of Buildroot due to its comprehensive documentation, active developer community and the fact that it’s built on standard tools (“make” and “kconfig”), which means the learning curve is fairly gentle.
Summary
These are the broad strokes of creating quality embedded Linux systems, and it’s my hope that this overview will have been of use, especially to people from a server/desktop background who might not otherwise have been introduced to these concepts.
So, to recap, best practice patterns for embedded Linux devices include:
- Read-only root filesystem, which:
  - Avoids risk of filesystem corruption*
  - Enables integrity verification of the firmware
  - Allows immutable settings (e.g. security settings)
  - Allows atomic upgrades
  - Reduces testing scope needed
- All configuration data resides in a dedicated partition
  - Keeps configuration settings as safe as possible
- Transient data is kept in another dedicated partition
  - Keeping non-critical data separate enhances resiliency of overall system
- Perform atomic upgrades by using an A/B partition scheme
  - Greatly improved resiliency compared to incremental upgrade
  - Enables fallback to previous version if needed
- Build the full firmware from source
  - Total control over what gets included in a firmware image
  - Improved security by minimising included components
  - Ability to patch any part of the OS (or kernel)
*Technically, certain flash storage devices can in theory corrupt a read-only filesystem if there is also a read-write filesystem on the same media. In practice, I have never encountered it.