Buzz has been building over the past couple of years about serverless websites and applications. But what does serverless even mean? What implications does it bring and how might you harness the benefits? Is it a good fit? What would you have to change to support and facilitate a migration and pivot?
Defining the beast
Serverless doesn’t mean no more computers or servers at all (that’s not how the internet works). The main point is to minimise reliance on web servers – you know, the computers that sit around, idling and waiting for requests from clients.
Some of these sit at the front end, rendering HTML and talking to other computers. Those other computers (all the way to the back-most ends) probably talk to still more computers. Some of them are database servers, file servers, authentication providers and so on.
All of these computers have specifically provisioned sets of software on them. They have network configurations, users, agents, and daemons. Then there's the rest of the infrastructure: routers, load balancers, network address translation layers, certificates, tunnels, funnels, and sinks. Someone has to decide how that all hangs together and build it. In many organisations (but not as many as you'd hope) this is all automated, but the layout still has to come from somewhere.
If we could just get rid of all that overhead, we could go right ahead and get on with the important, useful work. You cannot, however, just get rid of it; everything will stop working. So what can we replace it with?
A newer beast
If you come from a functional or mathematical background, this should be quite intuitive. What do your webservers even do? Given some inputs, they yield some outputs. They’re pretty deterministic (or your software is unpredictable, maybe even – dare I say – buggy).
What you’ve usually built is a moderately-coupled set of multi-responsibility impure functions. This flies somewhat in the face of some fundamentals of most programming paradigms. Encapsulation, single responsibility, loose coupling, data-flow directionality – you name it.
Imagine replacing all that with single-responsibility, standalone functions. Forget about computers, operating systems, and hardware.
You can now treat functions as the smallest granule of, for want of a better pun, functionality. AWS has Lambda, Google has Cloud Functions, Azure has Functions and IBM has OpenWhisk. They don’t all quite have the same offering.
But what does it mean?
So now you can write some code, stick it in the cloud and let someone else worry about scaling and concurrency. AWS Lambda functions are billed per 100ms of execution time, so there's not much to worry about there; just keep an eye on your bill.
You can make the functions almost as small or as big as you'd like, but there's a time limit per invocation (so probably don't try to put your 10-minute CI steps in there).
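To make this concrete, here's roughly what one of these functions looks like in Python. The event payload and its "name" field are illustrative assumptions; the only real contract is Lambda's standard (event, context) handler signature.

```python
import json

def handler(event, context):
    # Lambda passes the invocation payload as `event`; `context` carries
    # runtime metadata (request ID, remaining execution time, etc.).
    name = event.get("name", "world")  # hypothetical payload field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

That's the whole deployable unit: no web server, no operating system to patch, just a function waiting to be invoked.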
What can it do?
Let’s look at some examples, because this has all been pretty abstract.
Disclosure: I’ve only worked through (or theorised) these applications in AWS, so I’ll stick with their terminologies for consistency.
Content (rendering, static or dynamic, background)
Say you have a content-driven site. Maybe a blog platform, or news media site. The main portion of your web outputs are relatively static.
Instead of making a computer regenerate the exact same markup every time a particular piece of content is requested, you could render it out once, and leave it on S3. This saves you in EC2, VPC, Route53 and costs (S3 is super cheap for this use case). It also enables easy control and configuration of caching, more region availability and other CDN options.
And if you ever want to change that piece of content, it can just be rendered out again, on demand.
There's a little more work than that, though. You will need to work out how to route to the new content location. Actually, that's probably all. Now your presentation layer is separate from your authoring layer, and you're free to change things about your workflow that used to be constrained by this coupling.
It’s worth pointing out that you don’t need cloud functions to achieve this, but they fit in very well. Use a function on-demand (instead of a full-time web server) to render out the content.
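A sketch of that render-once pattern, using boto3 (AWS's Python SDK). The bucket name, the templating step, and the key layout are all hypothetical; the boto3 import is deferred into the upload function so the pure rendering step works without any AWS dependencies.

```python
def render_html(title, body):
    # Trivial stand-in for whatever real templating your platform uses.
    return f"<html><head><title>{title}</title></head><body>{body}</body></html>"

def publish(slug, title, body, bucket="my-content-bucket"):
    # Render the content once, then park the result on S3 so no web
    # server has to regenerate it per request.
    html = render_html(title, body)
    import boto3  # deferred so render_html() stays testable without AWS
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=f"{slug}/index.html",
        Body=html.encode("utf-8"),
        ContentType="text/html",
    )
    return html
```

Re-run publish() whenever the content changes and the stale copy is simply overwritten.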
Non-content (async short jobs)
Maybe you don’t deal primarily in content. Perhaps part of your service is more single-use, but repeatable and high-demand. Something like a proprietary image or sound bite classifier. You could use TensorFlow, but you’re already down another path, and it doesn’t give you the flexibility you need.
Jam it in Lambda. Whack some API Gateway endpoints around it. Now you have a highly-scalable service with no manually provisioned infrastructure.
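A minimal sketch of what that handler might look like behind an API Gateway Lambda proxy integration, which delivers the HTTP body as a string (base64-encoded for binary payloads). The classify() function here is a dummy stand-in for the proprietary classifier, not a real model.

```python
import base64
import json

def classify(data):
    # Hypothetical stand-in for your proprietary classifier.
    return ("cat", 0.97) if data else ("unknown", 0.0)

def handler(event, context):
    # API Gateway's proxy integration wraps the HTTP request in `event`.
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        payload = base64.b64decode(body)   # binary upload (image, audio)
    else:
        payload = body.encode("utf-8")     # plain-text request body
    label, score = classify(payload)
    return {
        "statusCode": 200,
        "body": json.dumps({"label": label, "score": score}),
    }
```

API Gateway handles TLS, throttling and routing; Lambda handles scaling. None of it is provisioned by hand.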
Integration with non-web-server services
Of course, your cloud functions don’t have to be completely isolated. Lambda functions get permissions through IAM so you can give or restrict internet and other AWS resource access.
Maybe you have content in S3 or ElasticSearch. You can pull it out, transform it, put it back, put it somewhere else, send it to another API Gateway endpoint . . . whatever you like.
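The pull-transform-put flow can even be event-driven: S3 can invoke a Lambda function directly when an object lands in a bucket. A sketch, assuming S3's standard event notification format; the transform itself is a hypothetical placeholder, and the boto3 import is deferred so the transform stays testable without AWS.

```python
def transform(text):
    # Hypothetical transformation: collapse whitespace and upper-case.
    return " ".join(text.split()).upper()

def handler(event, context):
    import boto3  # deferred so transform() stays testable without AWS
    s3 = boto3.client("s3")
    # S3 notification events list the bucket and key of each new object.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        s3.put_object(
            Bucket=bucket,
            Key=f"transformed/{key}",
            Body=transform(body).encode("utf-8"),
        )
```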
Another topic that has been rising in popularity recently: service bots that interact with users through natural language. Here, latency is even less of an issue. Bots that talk back too fast are creepy.
Don’t limit yourself to text, though. If you already have some AI bits trained up, they can sit waiting for activation by your bot. Send them text, images, audio, whatever their use case might be. Process it, send back the response and they can go back to sleep.
If these concepts sound a bit like microservices, then good. Scalability, distribution, bounded contexts, and single responsibility are all facilitated by cloud functions, and removal of always-connected moving parts.
Microservices are a back-end pattern for a reason, and cloud functions are perfectly suited for such stateless, short-lived jobs. Join little bits of work together with small, well-defined, well-tested interfaces.
What can’t it do?
AWS Lambda, at least, has latency constraints. The runtime for a Lambda function is held in a Linux container. To scale, more containers are spun up and down, and that has a non-zero time cost.
So maybe it's not ideal for rendering content in the request path of front-end web browsing. I've seen start-up latency of 10 seconds. That's not really acceptable.
There's also multi-tenant resource contention. If too many other tenants are running Lambda functions, your idle containers may be evicted, which also increases overall latency.
And there are the other limits (as linked above). But play around with it, see what constraints you hit. If it doesn’t quite work for your use case, at least you have a nicely encapsulated function that you can put in an SQS worker.