In my previous post I described the fundamental setup of an embedded Linux system. Building on that, this post covers how to integrate remote management via AWS IoT Core services.
AWS IoT Core provides several services that are really useful to an IoT device: the Rules Engine for easily routing data (and changing the routing without having to update the device!), Fleet Provisioning for deploying per-device certificates, and of course Device Shadows for managing device configuration regardless of whether the device is online at the time. These make it an enticing option when building an IoT device that will interact with a solution hosted on AWS.
On devices running embedded Linux, when starting out it is often tempting to take the “quick and easy” path, especially for proof-of-concepts where time is limited. This is all well and good, until we consider just how often proof-of-concepts end up deployed to production in a rush to “get something out the door”, at which point the tech debt goes through the roof.
One of the naive (but simple) approaches is to integrate the AWS IoT Device SDK into whichever main application is run on the device. This is quick, easy and convenient, until the inevitable point where another service on the device also needs to access AWS IoT. This then leads either to the main application turning into the proverbial kitchen sink of functionality, or to the other services duplicating functionality by connecting to AWS IoT separately. The latter also becomes more expensive due to the increased number of connections and messages (including keep-alive messages). Neither of these approaches constitutes good practice, and clearly neither follows the Unix way. When in Rome, do as the Romans do, as the proverb holds, and the same applies on Linux / Unix.
The approach we have used quite successfully across many projects is to have a single application which is responsible for the interface towards AWS IoT. This application in turn then extends those services into Linux userspace. Providing services to other running applications is done in the typical Unix way, in that everything is a file, and what isn’t a file is a process to be spawned. This approach then nets the best of all worlds as there is clear separation of responsibilities and it becomes easy for other applications to also leverage AWS IoT Core features.
Introducing chariotd
After having implemented such a service several times for different clients, it seemed a waste of effort to keep reimplementing nearly the same thing over and over. Therefore, we decided to release an open source version of such a service that can be freely reused. Introducing chariotd – the common handling for AWS core IoT daemon. Written in NodeJS, it is a lean implementation which nevertheless provides access to the key AWS IoT Core features:
- Device Shadow handling
- MQTT message publishing
- Fleet Provisioning
The choice of NodeJS as the runtime (and JavaScript as the language) was driven by convenience and past experience with the NodeJS version of the AWS IoT Device SDK. While JavaScript has plenty of warts, it’s still a good choice when interfacing with web services, and makes it easier to share development knowledge across the device and the cloud.
Device shadow handling
The way device shadow handling works in chariotd is that within the shadow document, each key under reported or desired represents a single service on the IoT device. This effectively provides separate name spaces for each service and avoids conflicts with names of individual settings.
Consider an example shadow document:
{
"reported": {
"ssh": {
"enabled": true,
"authorized_keys": [ "ssh-rsa AAAAB3NzaC1yc2..." ]
},
"console": {
"enabled": false
}
}
}
While there are two keys named enabled, they are clearly separated and without risk of conflict.
To link a shadow document section to something useful in Linux userspace, chariotd uses service definition files. These describe what file to write the shadow data to, what format to use, and how to notify the corresponding Linux service of changes to the data.
For the “console” service above, the service definition might look like:
module.exports = {
key: 'console',
outfile: '/config/console.rc',
outformat: 'SHELL',
informat: 'SHELL',
notifycmd: 'sv restart console',
}
Thus whenever a value changes under console in the device shadow, the updated settings will be written to /config/console.rc after having been formatted into a shell-compatible format, and then the “console” service is restarted.
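As a rough illustration of what "formatted into a shell-compatible format" could mean, the sketch below renders a flat shadow section as key='value' lines that a script can source. This is not chariotd's actual formatter: the function name toShellFormat and the quoting rules are assumptions made for this example, and nested values are simply skipped.

```javascript
// Hypothetical sketch, not chariotd's real implementation: render a flat
// shadow section as shell variable assignments suitable for sourcing.
function toShellFormat(section) {
  return Object.entries(section)
    // Only accept keys that are valid shell variable names.
    .filter(([key]) => /^[A-Za-z_][A-Za-z0-9_]*$/.test(key))
    // Skip nested objects/arrays; they have no obvious shell representation.
    .filter(([, value]) => value === null || typeof value !== 'object')
    .map(([key, value]) => {
      // Single-quote the value, escaping embedded single quotes as '\''.
      const text = String(value ?? '').replace(/'/g, `'\\''`);
      return `${key}='${text}'`;
    })
    .join('\n') + '\n';
}

console.log(toShellFormat({ enabled: false }));
```

Sourcing such a file from a run script makes each setting available as an ordinary shell variable.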
Another common action is to send SIGHUP to the service in question, rather than restarting it fully, assuming the service supports that functionality.
In addition to receiving change requests via the device shadow, services can themselves also post updates to the device shadow document. In most cases this is not necessary, but it can be useful at times, for example when the device shadow contains a representation of attached peripherals and those get added and removed locally on the device. When a service posts update requests, they are interpreted according to “informat”.
Shadow updates received via the “desired” branch are automatically merged in by chariotd and the result applied to the shadow document without the service having to perform said update.
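To make the merge behaviour concrete, here is a minimal sketch of how a "desired" delta could be folded into a service's current settings. It is an illustration rather than chariotd's code: mergeDesired is a made-up name, and treating null as "remove this key" follows the AWS IoT shadow convention rather than anything specific to chariotd.

```javascript
// Illustrative only: recursively merge a "desired" delta into the current
// settings of one service. Arrays and scalars replace the old value outright.
function mergeDesired(current, delta) {
  const result = { ...current };
  for (const [key, value] of Object.entries(delta)) {
    if (value === null) {
      delete result[key]; // null means "remove this setting"
    } else if (typeof value === 'object' && !Array.isArray(value)) {
      const base = result[key];
      const isPlainObject =
        base !== null && typeof base === 'object' && !Array.isArray(base);
      // Merge into the existing object, or start fresh if there is none.
      result[key] = mergeDesired(isPlainObject ? base : {}, value);
    } else {
      result[key] = value;
    }
  }
  return result;
}

const ssh = { enabled: false, authorized_keys: ['ssh-rsa AAAAB3NzaC1yc2...'] };
console.log(mergeDesired(ssh, { enabled: true }));
```

Only the keys present in the delta change; everything else in the service's section is carried over untouched.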
MQTT message publishing
Message publishing through chariotd is about as simple as it gets. Put a JSON file containing the topic and the payload in the right place, and up it goes. Technically it’s a two-step process – first the file content is written to a temporary file, and then the temporary file is moved into place. This has the effect of guaranteeing that the full content of the file is available when chariotd sees the file come into existence.
A brief example to show this:
# cat > /data/chariotd/messages/tmp/hello << EOF
{
"topic": "/mycompany/cooldevice/v1/example",
"payload": {
"hello": "AWS IoT"
}
}
EOF
# mv /data/chariotd/messages/{tmp,new}/hello
Messages aren’t removed from that directory until they’ve been actioned, so if overwriting is a concern consider using a timestamp as part of the name. Similarly, also consider using unique prefixes for each service publishing messages to avoid having them step on each other’s toes.
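The same write-then-rename pattern can be used from a NodeJS service. The sketch below is an assumption-laden example rather than an official client: publishMessage and its naming scheme (service prefix, timestamp, process id, sequence number) are invented here, and it relies on tmp/ and new/ living on the same filesystem so that the rename is atomic.

```javascript
const fs = require('fs');
const path = require('path');

let sequence = 0; // distinguishes messages created within the same millisecond

// Sketch of the write-then-rename pattern. baseDir is assumed to contain the
// tmp/ and new/ subdirectories, both on the same filesystem.
function publishMessage(baseDir, prefix, topic, payload) {
  // Service prefix + timestamp + pid + counter keeps file names unique.
  const name = `${prefix}-${Date.now()}-${process.pid}-${sequence++}`;
  const tmpPath = path.join(baseDir, 'tmp', name);
  const newPath = path.join(baseDir, 'new', name);

  fs.writeFileSync(tmpPath, JSON.stringify({ topic, payload }));
  fs.renameSync(tmpPath, newPath); // the file appears in new/ fully written
  return newPath;
}

// Example call (not executed here):
// publishMessage('/data/chariotd/messages', 'sensor',
//   '/mycompany/cooldevice/v1/example', { hello: 'AWS IoT' });
```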
One thing worth pointing out is that at the time of writing this article chariotd has no support for listening to MQTT topics and relaying that data outside (other than via the device shadow, that is). It is generally not a good pattern to have IoT devices, which by their very nature have intermittent connectivity, listen to and act on requests that aren’t delivered over a reliable mechanism. In the case of AWS IoT, having devices subscribe to non-shadow topics can also increase the necessary bandwidth for the devices, as well as overall cost.
That said, with AWS IoT now supporting persistent sessions on MQTT, there might be an argument to be made that it provides sufficient delivery reliability for certain use cases. Do be aware that the sessions are still subject to timeouts, so the reliability is nothing like what you get through the device shadows.
Fleet provisioning
Managing the provisioning of unique device certificates can be a challenge, but AWS IoT Fleet Provisioning can make it considerably easier. Using factory-loaded credentials, chariotd can go through the fleet provisioning process to obtain a unique device certificate. Not only that, the fleet provisioning process can be used for certificate rotation/renewal as well, with chariotd needing only to be requested to run through it again.
A full example is outside the scope of this article; please refer to the AWS fleet provisioning documentation together with the section on fleet provisioning in the chariotd project for full details on setting up fleet provisioning. AWS also has an article with a step-by-step example; in “Procedure 3” you can use chariotd instead of the Python code listed there.
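For orientation, the MQTT exchange behind fleet provisioning by claim centres on two request topics. The sketch below only builds the topic names and the RegisterThing request body as described in the AWS documentation; chariotd performs this exchange itself, and the function name, template name, parameters, and token shown here are placeholders.

```javascript
// Topic names follow the AWS IoT fleet provisioning MQTT API. Responses
// arrive on the same topics with "/accepted" or "/rejected" appended.
const CREATE_KEYS_TOPIC = '$aws/certificates/create/json';

// Build the RegisterThing request. The ownership token comes from the
// accepted response to a publish on CREATE_KEYS_TOPIC.
function registerThingRequest(templateName, ownershipToken, parameters) {
  return {
    topic: `$aws/provisioning-templates/${templateName}/provision/json`,
    payload: { certificateOwnershipToken: ownershipToken, parameters },
  };
}

// Placeholder values, purely for illustration.
console.log(registerThingRequest('ExampleTemplate', '<token>', { SerialNumber: '0001' }));
```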
Integrating with chariotd
Rather than just reiterating examples already available on the project page, let’s have a look at what is required to integrate chariotd and put console and SSH logins under remote control via the device shadow. This is a deep-dive straight into the implementation details. If this is not of interest right now, feel free to skip this section and continue reading at Provisioning Considerations instead.
The practices outlined in the Embedded Linux Fundamentals article are assumed to have been followed, so the root filesystem is assumed to be read-only, all configuration data is assumed to reside under /config, and transient data under /data. Further, the service monitor of choice is runit with service units defined under /etc/sv/.
In this instance we also make the assumption that a unique device certificate has been provisioned to the unit as part of the factory end-of-line process, so that a certificate store with a valid device certificate is found under /config/chariotd/certs.
Service script for chariotd
Launching chariotd is quite straightforward. In our case we base our MQTT client id off the CPU serial number, though any other permanent device-specific value could be used (e.g. MAC address). Further, we also ensure the watch directories have the necessary subdirectories created.
/etc/sv/chariotd/run:
#!/bin/bash
DEVICEID="$(grep Serial /proc/cpuinfo | awk '{print $3}')"
SHADOWDIR=/data/chariotd/shadow
MESSAGESDIR=/data/chariotd/messages
mkdir -p \
"${SHADOWDIR}"/{tmp,new,failed} \
"${MESSAGESDIR}"/{tmp,new,failed}
exec /usr/bin/chariotd \
--clientid="${DEVICEID}" \
--cacert=/config/chariotd/AmazonRootCA1.pem \
--certstore=/config/chariotd/certs \
--services="${DEVICEID}:/etc/shadow.d" \
--updates="${DEVICEID}:${SHADOWDIR}" \
--messages="${MESSAGESDIR}" \
2>&1
Enabling logging under runit is done easily by adding another run file:
/etc/sv/chariotd/log/run:
#!/bin/sh
LOGDIR=/var/log/chariotd
mkdir -p "${LOGDIR}" 2>/dev/null
printf 'n1\ns10000\n' > "${LOGDIR}/config" # limit 10k log, 1 old file kept
exec /usr/sbin/svlogd -tt "${LOGDIR}"
Example console service
Our “console” service handles serial logins. Normally the getty is started from /etc/inittab, but by moving it into a regular service we can easily control it via the device shadow.
Here we’re also showing the use of shellcheck comments for controlling linting of the shell script.
/etc/sv/console/run:
#!/bin/sh
CFG=/config/console.rc
# shellcheck disable=SC1090
[ -r "${CFG}" ] && . "${CFG}"
# shellcheck disable=SC2154
if [ "${enabled}" != "true" ]
then
sv down console
exit 0
fi
exec /sbin/getty -L -i ttyS0 0 vt100
We have already seen the service definition file, but once again for completeness’ sake:
/etc/shadow.d/console.js:
module.exports = {
key: 'console',
outfile: '/config/console.rc',
outformat: 'SHELL',
informat: 'SHELL',
notifycmd: 'sv restart console',
}
Those two files are all it takes to control the ability to log in on the serial (ttyS0) console via the AWS IoT device shadow!
Example ssh service
Our example SSH service uses dropbear SSH, a common choice on embedded Linux. We make sure that we only permit key-based logins to side-step the issue of brute-forced passwords.
The SSH configuration handling makes use of the command line JSON processor jq, which is the “Swiss army knife” for all things JSON in Linux.
Normally all configuration for dropbear is expected in /etc/dropbear, but since our root filesystem is read-only, /etc/dropbear is simply a symbolic link (symlink) to /config/ssh/.
The service script for the SSH daemon is rather simple:
/etc/sv/ssh/run:
#!/bin/sh
mkdir -p /config/ssh
enabled=$(jq -r .enabled /config/ssh/config.json 2>/dev/null)
if [ "${enabled}" != "true" ]
then
# Not enabled: stop the service and exit so dropbear is never started.
sv down ssh
exit 0
fi
exec /usr/sbin/dropbear -R -F -E -s -g -T 3 -K 60 2>&1
Meanwhile the log script is nearly identical to that of the chariotd service:
/etc/sv/ssh/log/run:
#!/bin/sh
LOGDIR=/var/log/ssh
mkdir -p "${LOGDIR}" 2>/dev/null
printf 'n1\ns10000\n' > "${LOGDIR}/config" # limit 10k log, 1 old file kept
exec /usr/sbin/svlogd -tt "${LOGDIR}"
The chariotd service definition this time uses JSON format for the config file:
/etc/shadow.d/ssh.js:
module.exports = {
key: 'ssh',
outfile: '/config/ssh/config.json',
outformat: 'JSON',
informat: 'JSON',
notifycmd: 'sshconfig-helper',
}
Because we need to have the authorised keys in a custom format, our “notifycmd” calls a small helper script which extracts said keys into the expected authorized_keys file, before restarting the SSH service:
/usr/sbin/sshconfig-helper:
#!/bin/bash
jq -r '.authorized_keys[]' \
< /config/ssh/config.json \
> /config/ssh/authorized_keys.tmp \
2>/dev/null \
&& mv /config/ssh/authorized_keys{.tmp,} \
|| rm -f /config/ssh/authorized_keys.tmp
sv restart ssh || exit 0
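The key extraction performed by the helper can also be sketched in JavaScript, which may be convenient given that NodeJS is already on the device for chariotd. This is an illustration only: authorizedKeysText is a hypothetical function, and unlike the jq version it quietly drops malformed entries and treats a missing key list as empty.

```javascript
// Sketch only: build authorized_keys content from the parsed SSH config.
// Entries that are not non-empty, single-line strings are dropped.
function authorizedKeysText(config) {
  const keys = Array.isArray(config.authorized_keys)
    ? config.authorized_keys
    : [];
  return keys
    .filter((key) => typeof key === 'string')
    .map((key) => key.trim())
    .filter((key) => key !== '' && !/[\r\n]/.test(key))
    .map((key) => key + '\n')
    .join('');
}

console.log(authorizedKeysText({
  enabled: true,
  authorized_keys: ['ssh-rsa AAAAB3NzaC1yc2...'],
}));
```

Writing the result should still go through a temporary file followed by a rename, as the shell helper does, so that dropbear never sees a partially written file.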
The above is all it takes to place the SSH daemon under remote control via the device shadow. The SSH service can be enabled on-demand, and authorised keys added or removed as needed. The default state of disallowing logins helps improve security. And when it’s this easy to make things more secure, there is little excuse not to do so!
Provisioning considerations
In the above scenario we worked on the assumption that each device already had been provisioned with a unique device certificate in the factory. Unless you are well on top of your factory processes, that can be a dangerous assumption to make. A common issue we see is that IoT devices have ended up deployed to production with a device certificate which is shared across all devices. This is a major security issue, and is often compounded by a lack of encrypted storage on the devices themselves. For IoT devices that have gone from proof-of-concept to production without much in-between, this can be a real challenge as there is often no mechanism built in to support certificate rotation/renewing that would make it possible to roll out new certificates. Avoid falling into the single-certificate trap by planning for it from the outset!
When generating certificates there are some deciding factors to take into account:
- Who will generate them
- Where will they be generated
- Whether the implied access requirements are acceptable
Consider the case where unique device certificates need to be generated as part of the factory end-of-line process. This implies that workers at the factory, one way or another, have sufficient credentials to provision new devices. A business risk assessment should be done to determine whether that is acceptable, and which mitigating strategies should be used.
In the case where devices leave the factory without being provisioned, the certificate generation and provisioning can be kept in-house, with an implied higher degree of control over the credentials used. The downside of course is that increased in-house work is required.
AWS IoT Fleet Provisioning by claim is another alternative, where a preloaded, shared bootstrap certificate is used to authorise the device itself to request a unique certificate. In this scenario it is also easy to restrict when devices may do so. For example, the claim certificate may be active only during the manufacturing window and deactivated at all other times, thus reducing the exposure of the credentials.
There is no single one-size-fits-all solution in this area, and it is a combined business and technical decision to be made to pick a suitable approach for the device in question.
If your IoT device supports encrypted data storage, it is highly advisable to store your device certificate(s) in that area. Depending on the specifics, it might be a case of only keeping the certificates in RAM for chariotd to use, with the necessary decryption/encryption added to the service script and fleet provisioning hook, respectively.
Summary
When wanting to use AWS IoT Core on an embedded Linux device, good design principles are:
- Avoid integrating the AWS IoT Device SDK directly into the main application
- Don’t duplicate AWS IoT connectivity across multiple applications
- Avoid using inbound MQTT messages for application control, due to lack of a fully reliable delivery option
- Ensure each device has its own unique certificate
- Support certificate rotation/renewing on the device so it can gracefully handle certificate expiration
- Where possible, keep the certificates in encrypted storage on the device
And of course, consider using chariotd to get a leg up on many of those points. Pull requests welcome!