In Part 1 of this blog post, I discussed the importance of recommender systems for businesses wanting to stay relevant in such a competitive environment. Furthermore, I covered how Amazon Personalize fits in the recommender system landscape.

This post will explore how Amazon Personalize is helping to accelerate the machine learning lifecycle and where we think the challenges are for the important topic of automated deployment.

So, what’s left to be solved?

At DiUS, we always look for challenges in automation and testing when building solutions for our clients. Automated deployment and testing are key to any successful development process, and are particularly important for reproducible machine learning experiments. 

Typically, automation framework support lags behind for new and experimental AWS services. Looking at the service itself, Amazon Personalize supports the creation, management and deployment of models (solutions) from the UI; however, there is no CloudFormation support.

So, where do we see the biggest challenges for organisations in successfully using Amazon Personalize?

To answer this question, we will break the typical development workflow into three sections: Data Preparation, Model Training and Model Deployment.

1. Data Preparation

The requirements for the Amazon Personalize dataset(s) give us a good indication as to what data will be needed. However, in our experience, transactional user data (which may include purchase history), combined with the customer’s online profile (web click events), is typically not accessible in an ad-hoc manner, or worse, is distributed across multiple data sources.

When it comes to data preparation, we think it’s important to get started as simply as possible. This will allow you to get used to the process of data extraction and build an understanding—over multiple iterations—of how complexity impacts the recommendation performance.

The simplest model we can build with Amazon Personalize is purely based on user-item interaction, matched against a timestamp which allows the model to understand the order of events.

We recommend you start with a small dataset to get the extraction process automation right before you move to a larger dataset. This will allow you to iterate much quicker over the data preparation initially, and reduce complexity which may come with a larger volume of data. 

Amazon Personalize recommends a minimum of 10,000 data points, with at least two interactions per user. For example, 5,000 customers with two transactions each, or 1,000 users with 10 transactions each.
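Before investing in a full pipeline, it can be worth adding a quick sanity check that an exported interactions file actually meets this sizing guidance. A minimal sketch with pandas follows; the column name `USER_ID` is an assumption of this sketch and should match your own export.

```python
import pandas as pd

def meets_personalize_guidance(df: pd.DataFrame,
                               min_interactions: int = 10_000,
                               min_per_user: int = 2) -> bool:
    """Check an interactions frame against the sizing guidance above:
    at least `min_interactions` rows, and at least `min_per_user`
    interactions for every user."""
    enough_rows = len(df) >= min_interactions
    interactions_per_user = df.groupby("USER_ID").size()
    enough_per_user = (interactions_per_user >= min_per_user).all()
    return bool(enough_rows and enough_per_user)
```

For example, 5,000 users with two interactions each passes the check, while a 100-row sample does not.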

Timely access to the required dataset

Organisations should make sure datasets can be produced without an extensive amount of manual effort, as recommendations have an ongoing dependency on up-to-date interaction datasets. Ideally, you want to run your data exports automatically as part of the pipeline.

Feature extraction and dataset cleanup

Exporting your dataset should be performed, or at least supported, by a data engineer or data scientist who understands your dataset.

Feature extraction is a required process to find the relevant data points in your dataset. It could include computing average, minimum and maximum values across relevant datasets. However, before you can generate useful and correct features, data cleanup tasks are required.

Typical data cleanup tasks include filling in missing data (which often prompts the question: why is the data missing in the first place?), augmenting the data with additional information (e.g. weather data, communication channel, etc.), and removing unnecessary columns as well as sparse columns where there is simply not enough data available.

Feature extraction as well as data cleaning should be performed programmatically, as part of the data export process.

Define the Amazon Personalize schema

  • Defining the schema is straightforward and well documented.
  • Rename column headings and map data types to the Amazon Personalize schema.
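For the simple interactions-only model described earlier, the schema is a small Avro document with the three mandatory fields. A minimal sketch follows; the schema name and the `register_schema` helper are illustrative, and the actual `create_schema` call requires AWS credentials.

```python
import json

# Minimal interactions schema in the Avro format Personalize expects.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}

def register_schema(name: str) -> str:
    """Register the schema with Personalize and return its ARN
    (requires AWS credentials; not invoked at import time)."""
    import boto3
    personalize = boto3.client("personalize")
    response = personalize.create_schema(
        name=name, schema=json.dumps(INTERACTIONS_SCHEMA)
    )
    return response["schemaArn"]
```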

Export data

  • Export the prepared dataset as CSV files into the S3 bucket.
  • Consider using a structure which allows multiple versions of the dataset to be exported at the same time.
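One way to keep multiple dataset versions side by side is to build a version identifier into the S3 key. The layout below (`<dataset>/<version>/<file>`) and the helper names are assumptions of this sketch; the upload itself requires AWS credentials.

```python
from datetime import datetime, timezone
from typing import Optional

def dataset_key(dataset_name: str, version: Optional[str] = None,
                filename: str = "interactions.csv") -> str:
    """Build an S3 key that keeps each exported dataset version
    alongside previous ones, defaulting the version to a UTC timestamp."""
    if version is None:
        version = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{dataset_name}/{version}/{filename}"

def upload_dataset(df, bucket: str, dataset_name: str) -> str:
    """Write the prepared frame to CSV and upload it to S3
    (requires AWS credentials; not invoked at import time)."""
    import boto3
    key = dataset_key(dataset_name)
    local_path = f"/tmp/{dataset_name}.csv"
    df.to_csv(local_path, index=False)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"
```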

If you have the infrastructure in place which allows you to define a reproducible workflow to achieve the required steps, great! If you don’t, you might want to consider starting with a proof-of-concept workflow, potentially using Jupyter notebooks. Amazon provides a number of great examples to get you started. Whilst notebooks are somewhat interactive, they can be easily automated or exported as code and moved to Lambda or AWS Step Functions.

Dataset creation automation

The dataset creation involves a couple of steps which require sequential execution. To avoid initial complexity, we chose not to update existing Amazon Personalize datasets, but rather to create a new dataset every time we execute the process.

Furthermore, we assume each automation run will produce a new dataset group, schema and dataset. This will reduce the complexity of managing the state when using custom resources.

Our dataset-creation custom resource is a Lambda function which is triggered from our CloudFormation template. We can drive the custom resource with configuration parameters such as dataset-group name, dataset name, schema URI and dataset URI.

Each custom resource implements three functions: create, update and delete. Those functions correspond to the lifecycle of a CloudFormation stack. 
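The shape of such a handler can be sketched as below. This is a skeleton only: the function names are hypothetical, the Personalize calls are left as comments, and a production handler must also signal success or failure back to CloudFormation (e.g. via the cfn-response pattern), which is omitted here.

```python
def handler(event, context):
    """Dispatch a CloudFormation custom-resource event to the
    matching lifecycle function (Create / Update / Delete)."""
    actions = {
        "Create": create_dataset_group,
        "Update": update_dataset_group,
        "Delete": delete_dataset_group,
    }
    props = event.get("ResourceProperties", {})
    return actions[event["RequestType"]](props)

def create_dataset_group(props):
    # import boto3
    # personalize = boto3.client("personalize")
    # personalize.create_dataset_group(name=props["DatasetGroupName"])
    return {"Action": "Create", "Name": props.get("DatasetGroupName")}

def update_dataset_group(props):
    # Since each run creates a fresh dataset group, Update mirrors Create.
    return create_dataset_group(props)

def delete_dataset_group(props):
    # personalize.delete_dataset_group(datasetGroupArn=...)
    return {"Action": "Delete", "Name": props.get("DatasetGroupName")}
```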

2. Model Training and Evaluation

The model training of Amazon Personalize is relatively straightforward. For simplicity, we exclude model fine-tuning steps and focus only on the dataset and recommendation model configuration.

The automation is again another custom resource (see image above), or can be driven directly using the AWS CLI.

The key challenge here is that Amazon Personalize does not yet support a trigger/message when the dataset import or solution creation is finished. A workaround is implementing a busy-wait loop that checks the resource creation status until creation is complete. This is a good starting point; however, it may bind build-agent resources for multiple hours each time a model is training.
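Such a busy-wait loop might look like the sketch below. The status-fetching callable is injected so the loop stays testable; in practice it would wrap a describe call such as `personalize.describe_solution_version(...)`. The poll interval and timeout are illustrative.

```python
import time

def wait_until_active(get_status, poll_seconds=60, timeout_seconds=4 * 3600):
    """Poll a Personalize resource until it reports ACTIVE.

    `get_status` is any callable returning the current status string,
    e.g. lambda: personalize.describe_solution_version(
        solutionVersionArn=arn)["solutionVersion"]["status"]
    """
    waited = 0
    while waited < timeout_seconds:
        status = get_status()
        if status == "ACTIVE":
            return status
        if status.startswith("CREATE FAILED"):
            raise RuntimeError(f"resource creation failed: {status}")
        time.sleep(poll_seconds)
        waited += poll_seconds
    raise TimeoutError("resource did not become ACTIVE in time")
```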

A more complex approach could be using a CloudWatch trigger to check for job completion and then triggering different build stages on the CI/CD build server using web hooks.

Evaluation of the model performance can begin once a model baseline has been established. For example, you could check the performance against the current best model to support the selection process. Additional model evaluations should always be done on a separate dataset, called a ‘validation set’, which has been excluded from the training dataset. Typically this validation set is the most recent data in your time series dataset.

3. Model Deployment

Before we can generate recommendations, we have to deploy our model endpoint which can be done with a few lines of code using the AWS SDK.

For consistency, we wrapped the creation in yet another custom resource. Similar to the solution creation, this will typically take a few minutes to provision the resource, but has a more predictable runtime. To deploy a model version we simply need the ARN (Amazon Resource Name) of our model solution and a parameter defining how many requests per second we want to provision for.
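In boto3 terms, these two inputs map to the `solutionVersionArn` and `minProvisionedTPS` parameters of `create_campaign`. The helper names below are illustrative, and the deploy function requires AWS credentials.

```python
def campaign_request(name: str, solution_version_arn: str,
                     min_tps: int = 1) -> dict:
    """Assemble the create_campaign parameters: campaign name,
    the solution-version ARN, and the provisioned requests/second."""
    return {
        "name": name,
        "solutionVersionArn": solution_version_arn,
        "minProvisionedTPS": min_tps,
    }

def deploy_campaign(name: str, solution_version_arn: str,
                    min_tps: int = 1) -> str:
    """Create the campaign and return its ARN
    (requires AWS credentials; not invoked at import time)."""
    import boto3
    personalize = boto3.client("personalize")
    response = personalize.create_campaign(
        **campaign_request(name, solution_version_arn, min_tps))
    return response["campaignArn"]
```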

Generate recommendations and application integrations

Generating recommendations can be done either via a simple HTTP request or via application code using the Amazon Personalize Node.js/Python SDK. Each model type has a slightly different API depending on the solution, which is described in detail in the documentation.

Most recommendation types require a user identifier known to the model. In some cases, a list of choices has to be passed to Amazon Personalize in order to rank based on learned user preferences.
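Both cases can be sketched against the `personalize-runtime` client: `get_recommendations` for per-user recommendations and `get_personalized_ranking` for re-ranking a supplied item list. The small `item_ids` helper is an addition of this sketch, and the API-calling functions require AWS credentials.

```python
def item_ids(response: dict, key: str = "itemList") -> list:
    """Pull the item ids out of a runtime response payload."""
    return [item["itemId"] for item in response[key]]

def recommend_for_user(campaign_arn, user_id, num_results=10):
    """Real-time recommendations for one known user."""
    import boto3
    runtime = boto3.client("personalize-runtime")
    response = runtime.get_recommendations(
        campaignArn=campaign_arn, userId=user_id, numResults=num_results)
    return item_ids(response)

def rank_for_user(campaign_arn, user_id, input_list):
    """Re-rank a caller-supplied list of item ids by learned preference."""
    import boto3
    runtime = boto3.client("personalize-runtime")
    response = runtime.get_personalized_ranking(
        campaignArn=campaign_arn, userId=user_id, inputList=input_list)
    return item_ids(response, key="personalizedRanking")
```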

In general, you can generate recommendations in two ways: 

  1. Synchronous real-time as an API request (per user)
  2. Asynchronous batch predictions (list of users)

Both prediction types require user references as input. Which one you should use depends on how you prefer to generate predictions. Depending on your use case, it’s considered best practice to cache prediction results for quicker API access and cost reduction. The following questions might help guide the decision process.

  • How often do you need to update the predictions (daily, hourly)?
  • How often does new content/items arrive?
  • Do you need recommendations for all users, every time?

How to keep predictions up to date 

If we look at the technical implementation of the HRNN algorithm used by Amazon Personalize, we see that under the hood the most recent user interactions are input to the model predictions. That also means our knowledge about the user is frozen at the time of exporting the data used for training.

This leads to two important observations.

  1. We need to train the model on the most recent user data to get the best results!
  2. We need ongoing fresh interaction data to keep the system working efficiently, without retraining.

Amazon Personalize solves the second challenge by allowing you to record events. This provides an interface to continuously push new events into your dataset. However, this data is stored encrypted against a particular dataset and you will not be able to export those events later.
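Pushing a fresh interaction goes through the `personalize-events` client's `put_events` call, where the item id travels inside a JSON-encoded `properties` field. The `click` event type and helper names below are illustrative, and the actual call requires AWS credentials plus an event tracker's tracking id.

```python
import json
from datetime import datetime, timezone

def build_event(event_type: str, item_id: str) -> dict:
    """One entry for the put_events eventList; note that itemId is
    carried inside the JSON-encoded `properties` string."""
    return {
        "eventType": event_type,
        "properties": json.dumps({"itemId": item_id}),
        "sentAt": datetime.now(timezone.utc),
    }

def record_event(tracking_id, user_id, session_id, event_type, item_id):
    """Push one fresh interaction into the dataset
    (requires AWS credentials; not invoked at import time)."""
    import boto3
    boto3.client("personalize-events").put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[build_event(event_type, item_id)],
    )
```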

The first challenge is a bit trickier. To our knowledge there is no way to update data for new items or users. Consequently, if we had new users and/or new items being added to our catalog, model retraining with an updated dataset would be required. There is no one simple answer to how often model retraining is required. The answer is a trade-off between model performance degradation (due to the “cold start problem”) and the cost to retrain the model.

Beyond training an accurate model, we also want to be efficient with resources. To reduce training time and cost, consider experimenting with how many events per user should be provided, or use sliding windows (3 months, 6 months, 12 months) so that you don’t train the model on outdated data.
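A sliding window is a one-line filter on the interactions frame. The sketch below assumes `TIMESTAMP` holds Unix epoch seconds (as Personalize expects) and approximates a month as 30 days.

```python
import pandas as pd

def sliding_window(df: pd.DataFrame, months: int = 6, now=None):
    """Keep only interactions from roughly the last `months` months,
    measured back from `now` (default: newest timestamp in the data)."""
    if now is None:
        now = df["TIMESTAMP"].max()
    cutoff = now - months * 30 * 24 * 3600  # approximate month length
    return df[df["TIMESTAMP"] >= cutoff]
```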


Having used Amazon Personalize, we can clearly see that it is not aimed at mature ML/AI organisations, but rather at businesses who may not have an extensive background in or knowledge of recommender systems. However, Amazon Personalize does provide all the tools that enable complex fine-tuning for experts. The key challenges we identified for businesses will be around automation of deployments, data analysis and the careful fine-tuning of the model retraining cycle, which together require a diverse knowledge of cloud DevOps, data engineering and machine learning.

Make no mistake, Amazon Personalize is a definite value-add offering, but for it to reach its fullest potential, you will need to bring some expertise in data management and AWS automation.