The cloud underpins so much of what we do at DiUS these days. Whether we're building a new digital product, an ML solution, an IoT experiment, or all of those things in one, we thrive on understanding how to exploit the latest offerings from each of the major cloud providers. So when AWS's flagship re:Invent event came around again, as always we paid full attention.
Each year, we are treated to a dizzying array of new services and upgrades. If you are like us at DiUS, there are many things from re:Invent you'll want to try right now but can't, because you're already busy with an existing stack on a client project.
So we thought we’d ask our DiUS Cloud SiG* organisers, Sagar Jani, Wes Chan and Ken Ong, to come up with a wish list of things to look into. Here’s the list of what they want to take for a test drive, and why.
*At DiUS we have a number of different ‘Special Interest Groups’, the Cloud being but one.
Well-Architected custom lenses for internal best practices
The AWS Well-Architected Tool now lets customers create their own custom lens, based on their internal best practices and the nature of their workloads, making it easier to maintain overall architectural health.
With a custom lens, you can not only create your own pillars, questions, best practices and rules to flag high- and medium-risk issues, but also define improvement plans to remediate the risks you identify.
For example, you could create your own IoT custom lens covering sensor communication for data generation and/or collection. In fact, our DiUS IoT practice is developing a custom IoT Well-Architected Review that outlines best practices for the design and development of an IoT device. It will evaluate a current IoT device solution, across both hardware and firmware, to surface any shortfalls in security, stability and consumer user experience.
In my view this feature is very valuable, and I can think of many additional use cases, especially for regulated workloads.
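As a sketch of what such a lens might look like, here is an illustrative fragment of a custom lens definition. The lens name, pillar, question, and choice identifiers are all hypothetical, and the exact field names should be checked against the custom lens JSON format in the AWS documentation:

```json
{
  "schemaVersion": "2021-11-01",
  "name": "DiUS IoT Lens",
  "description": "Internal best practices for IoT device workloads",
  "pillars": [
    {
      "id": "iotSecurity",
      "name": "Device Security",
      "questions": [
        {
          "id": "sensorComms",
          "title": "How do your sensors communicate collected data?",
          "choices": [
            { "id": "tlsMutualAuth", "title": "Devices use mutually authenticated TLS" },
            { "id": "rotateCreds", "title": "Device credentials are rotated regularly" }
          ],
          "riskRules": [
            { "condition": "tlsMutualAuth && rotateCreds", "risk": "NO_RISK" },
            { "condition": "tlsMutualAuth && !rotateCreds", "risk": "MEDIUM_RISK" },
            { "condition": "default", "risk": "HIGH_RISK" }
          ]
        }
      ]
    }
  ]
}
```

The risk rules are what drive the high/medium risk flags mentioned above: each condition is evaluated against the choices a reviewer selects.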
DynamoDB Standard-Infrequent Access table class
AWS has announced a new table class for DynamoDB, Standard-Infrequent Access (Standard-IA), which provides a cost-optimised option for storing infrequently accessed data while retaining single-digit millisecond read/write performance.
It is ideal for use cases that require long-term storage of infrequently accessed data where storage is the dominant table cost, such as customer order history or application logs.
This not only reduces storage costs by up to 60% but also uses the same APIs, which means no changes to application code. However, throughput costs for this new table class are priced 20% higher than for Standard tables.
Organisations no longer need to build a process to migrate their infrequently accessed data between DynamoDB and Amazon Simple Storage Service (Amazon S3).
I'm curious to see how this feature will benefit customers who need to store terabytes of infrequently accessed data for several years with high availability.
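As an illustration, for a CloudFormation-managed table (the table and attribute names here are hypothetical), opting in is a single property:

```yaml
# Illustrative fragment: an order-history table using the new
# Standard-IA table class; everything else is unchanged.
OrderHistoryTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: customer-order-history
    TableClass: STANDARD_INFREQUENT_ACCESS
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: orderId
        AttributeType: S
    KeySchema:
      - AttributeName: orderId
        KeyType: HASH
```

An existing table can also be switched by calling UpdateTable with the new table class, with no data migration required.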
CDK v2 and CDK Watch
CDK v2 has been made generally available (with the Go version still in preview). Previously, CDK consisted of multiple packages, which increased the effort of dependency management and versioning and ultimately added to the complexity of development projects. CDK v2 addresses this by publishing a single package, "aws-cdk-lib", bundled with the libraries for all supported AWS services.
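For a TypeScript project, the change is visible in package.json: the long list of per-service v1 packages collapses to a single dependency (version numbers shown are illustrative):

```json
{
  "dependencies": {
    "aws-cdk-lib": "^2.0.0",
    "constructs": "^10.0.0"
  }
}
```

Service namespaces are then imported from that one package, e.g. `import { aws_s3 as s3 } from 'aws-cdk-lib';`, rather than from individual `@aws-cdk/aws-*` packages.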
CDK v2 additionally introduces CDK Watch, a method of continuously checking and deploying modifications to AWS Lambda handler code, Amazon ECS tasks, and AWS Step Functions state machines. This allows organisations to ship code faster and reduces the friction between iterations.
We will be interested to see the steps involved in upgrading from AWS CDK v1 to v2 and the mechanisms behind CDK Watch in action.
Amazon Redshift Serverless
Amazon Redshift is a data warehousing solution for rapidly gaining insights from structured and semi-structured datasets. Previously, Amazon Redshift nodes were created in a cluster configuration, with each node starting from $0.33 per hour in Sydney and billed on a per-second basis for node uptime.
A Redshift cluster would typically be provisioned and monitored by a DevOps team, which involved manually selecting the number and type of nodes, along with scaling and workload management configuration.
AWS has simplified this data warehousing offering through the release of Amazon Redshift Serverless, a fully managed data warehouse solution that instantly provisions and automatically scales based on the analytics workload. It optimises costs because you pay only for the duration of your analytics workloads, billed per second, rather than for node uptime.
We will be interested to see Redshift Serverless's cold-start performance when analysing datasets held in the data warehouse and in an Amazon S3 data lake. It will also be interesting to see whether there is an upgrade path from pre-existing Amazon Redshift clusters to Amazon Redshift Serverless, and vice versa.
Amazon Kinesis Data Streams On-Demand
Amazon Kinesis Data Streams is a service for streaming data in real time at any scale, with integrations into other AWS services that let you gain insights within a matter of seconds.
Previously, Amazon Kinesis Data Streams were created and billed based on the number of shards provisioned, and the number of provisioned shards dictated the maximum throughput of the stream.
If an organisation does not increase the number of shards when experiencing an unexpected burst in data volume, the stream will raise a provisioned-throughput-exceeded exception, resulting in throttled and delayed data events. Similarly, organisations face increased costs when shards are over-provisioned, or not scaled back, during periods of low data volume.
AWS has released a new capacity mode for Kinesis Data Streams called ‘on-demand’. This new mode allows organisations to instantly scale up or down their Kinesis Data Stream based on their current workload.
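For a CloudFormation-managed stream (the stream name here is hypothetical), the new mode is a single property, and notably there is no shard count to manage:

```yaml
# Illustrative fragment: a Kinesis data stream in the new
# on-demand capacity mode.
EventStream:
  Type: AWS::Kinesis::Stream
  Properties:
    Name: event-stream
    StreamModeDetails:
      StreamMode: ON_DEMAND
```

Existing streams can be switched between provisioned and on-demand capacity modes via the UpdateStreamMode API.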
We will be interested to see how fast the new capacity mode can scale, and the differences between this new capacity mode and its related service, Amazon Kinesis Data Firehose.
Amazon SageMaker Canvas
Amazon SageMaker is an AWS service that provides the tools to build, train and deploy machine learning models. This offering has previously targeted Software Engineers and Data Scientists interested in using machine learning to solve a complex problem within their organisation.
AWS have announced SageMaker Canvas, an extension to the Amazon SageMaker suite of tools. SageMaker Canvas provides a point-and-click GUI that lets non-technical staff create machine learning models without writing any code.
Given our background in building custom machine learning models, we will be interested to investigate how easy it is for non-technical staff to leverage machine learning, the use cases the service lends itself to, and whether the generated models are suitable for prototyping or production workloads. In particular, whether the models created are easy to work with and debug.
AWS SQS enhancements for Dead Letter Queue Management
SQS has long supported dead-letter queues for capturing messages that consumers have failed to process. However, building a process and tooling to replay those messages back to the source queue has required some thought and effort, especially with production workloads and permission constraints.
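For context, the dead-letter queue side of this has long been wired up with a redrive policy. A minimal CloudFormation sketch (queue names are hypothetical):

```yaml
# Illustrative fragment: a source queue whose failed messages land
# in a dead-letter queue after five receive attempts.
OrdersQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: orders
    RedrivePolicy:
      deadLetterTargetArn: !GetAtt OrdersDeadLetterQueue.Arn
      maxReceiveCount: 5

OrdersDeadLetterQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: orders-dlq
```

What has been missing until now is a native way to send the captured messages back from `orders-dlq` to `orders`, which is what the new dead-letter queue redrive capability provides.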
This announcement gives customers another option for managing message reprocessing from SQS dead-letter queues. We certainly welcome a native, AWS-supported approach to managing messages residing in a dead-letter queue, rather than something custom.
There's no CloudFormation or CDK support at the moment, but we're definitely looking forward to when that lands.
AWS Transit Gateway intra-region connectivity
AWS Transit Gateway has become a preferred way of providing peering connectivity between AWS VPCs, and between accounts, in our clients' landing-zone deployments. With new deployments of AWS landing zones we can connect to a centralised Transit Gateway, giving a centralised networking topology, and in the majority of cases there is almost no need for multiple Transit Gateway connections and interoperability. This becomes trickier with existing infrastructure and configurations.
I'll be keen to investigate what it means for how we set up our AWS accounts and architectures, both for new greenfield landing zones and for existing ones upgrading to make use of the newer services and offerings.
We’re going to take a look at these in the new year—whether that’s via a tech spike or a more in-depth exploration through a proof of concept—and share the results in a blog series. And we know our ML and IoT practices have a similar long list of new offerings they want to dig into. So expect more on that front soon.