How can you make recommendations for a user you know nothing about? This was the challenge posed by a hotel booking site with low conversions from searches into accommodation bookings. Enter DiUS to develop a proof-of-concept using machine learning in a custom recommendation engine to lift the conversion rate and drive more bookings, all without that essential user information that’s traditionally used to power sophisticated recommendations.
Intelligent recommender systems
Recommendation engines can have a major impact on page conversion and purchases on sites where anonymous or unidentified users are searching and browsing a wide range of content, products or services.
High-performance recommendation engines employ a technique known as collaborative filtering, which finds similarities between products based on similar interactions of users. “You liked Tiger King and Queen’s Gambit on Netflix; other users who liked Tiger King and Queen’s Gambit also liked The Haunting of Hill House.” With sufficient data, collaborative filtering models perform better than content-based models but they don’t work well with new products or new/unknown users.
In this case, the baseline conversion rate was low due to most of the users being unidentified at the time they ran their search and started an engagement. Most recommendation engines need to know something about the user. Typically, when a user is logged in you have an identity, some profile information and a history of the user’s interaction with your service, product or website.
No user data? Leverage historical interaction data
DiUS consultants Nigel Hooke and Shahin Namin, Machine Learning Engineers, solved this problem by building a proxy for users and their interactions. For those of you who like to take a look under the hood, the detailed approach is described below, but in short the team replaced user interaction data with property clusters.
The property clusters were built from query data (user searches), based on features such as the number of adults, whether or not there were infants, the number of stay days, a request for a pool and so on. This clustering approach alone produced an uplift of 41x over the existing model, which had originally been created using a manual decision tree.
Run experiments to find the best conversion uplift
With this new foundation in place, the team next implemented six recommendation architectures, which were evaluated against the benchmark using three measures of ranking quality: Precision@20, Normalised Discounted Cumulative Gain (NDCG) and NDCG@20.
The @20 suffix refers to whether a property ranks within the first 20 results, i.e. did the property appear on the first page of results?
All 18 combinations (6 architectures × 3 measures) outperformed the benchmark (see the detailed table of results below). The best result came from the Biased Matchbox model, which showed an uplift of 308x over the benchmark when evaluated using Precision@20. The uplifts for the Biased Matchbox model under the NDCG and NDCG@20 criteria were 130x and 187x, respectively.
While this project was a six-week proof of concept that is not yet in production, the work was successful in demonstrating that a recommendation engine could be developed without personal user data and without being able to identify who a user is. Using historical interaction data (clicks, bookings and property features), the prototype delivered a substantial uplift in performance for all of the recommendation methods applied.
The detailed technical approach
Traditional recommendation engines use content-based filtering, collaborative filtering or a hybrid of both.
Content-based filtering uses data and features such as genre, actors and directors for movies. To be effective, it requires a detailed catalogue of products along with a reasonable history and identity for each user. Results are generally acceptable but not overly impressive.
Collaborative filtering finds similarities between products based on similar interactions of users, as described above. It generally produces better results than content filtering, but it doesn’t work well with unknown users.
Hybrid filtering combines content-based and collaborative filtering, providing the best of both worlds, but it can become complicated very quickly.
Back to the problem of the unknown user.
The data we had access to was reasonably rich: 1.5 billion records of click data, one million records of bookings, and 70,000 records of property features.
Core to collaborative filtering is the User-Interaction Matrix, a table where each row represents a user cross-referenced against a column for each item. Each cell is populated with either 1 or 0, with a 1 representing a purchase, click or interaction, and 0 being no interaction. In our case, the client does not know who the user is, so we need to find a proxy.
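As a minimal sketch (with hypothetical user and property IDs), a binary interaction matrix can be built from a set of observed interactions:

```python
# Minimal sketch of a binary User-Interaction Matrix (hypothetical IDs).
# Rows are users, columns are items; 1 = click/booking, 0 = no interaction.
users = ["u1", "u2", "u3"]
items = ["propA", "propB", "propC", "propD"]

interactions = {("u1", "propA"), ("u1", "propC"), ("u2", "propA"), ("u3", "propD")}

matrix = [[1 if (u, i) in interactions else 0 for i in items] for u in users]

for u, row in zip(users, matrix):
    print(u, row)
```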
Using clustering techniques (K-means), we optimised property similarity into 100 clusters, then built a classifier that finds the most likely property cluster for each query. This Cluster-Property Matrix replaces the User-Interaction Matrix, which gets around the unknown user problem.
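To illustrate the clustering step, here is a minimal, pure-Python k-means sketch on hypothetical property feature vectors; it is a simplification of the actual pipeline, which clustered property similarity into 100 clusters:

```python
def kmeans(points, k, iters=10):
    """Plain k-means: assign each point to the nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]  # simple init: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical property feature vectors, e.g. (normalised price, capacity).
properties = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.9), (0.8, 1.1), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(properties, k=2)
print(labels)  # → [0, 1, 0, 0, 1, 1]
```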
The classifier was built using Catboost, drawing on a range of query features such as the number of adults, whether or not there were infants, the number of stay days, a request for a pool and so on. Previously, properties had been grouped into nine segments via a manual Decision Tree: kids (y/n), weekend (y/n), more than two nights (y/n) etc.
Using Precision@20 to measure performance, the Catboost classifier provided an uplift of 41x over the benchmark (conversion of 6.6% vs 0.16%), highlighting the benefit of this method over intuition and a best-guess approach.
(Note: Precision@20 measures whether the booked property was shown in the top 20 listings for that query, i.e. on the first page of search results.)
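A sketch of how Precision@20 can be computed over a set of test queries, using hypothetical property IDs:

```python
def precision_at_k(results, k=20):
    """Fraction of test queries whose booked property appears in the top-k recommendations.

    `results` is a list of (ranked_property_ids, booked_property_id) pairs.
    """
    hits = sum(1 for ranked, booked in results if booked in ranked[:k])
    return hits / len(results)

# Toy example with hypothetical property IDs.
test_results = [
    (["p3", "p7", "p1"], "p7"),   # booked property ranked 2nd -> hit
    (["p2", "p4", "p9"], "p8"),   # booked property not shown -> miss
]
print(precision_at_k(test_results, k=20))  # → 0.5
```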
With the Cluster-Property Matrix in place to overcome the problem of unknown users, we then applied a range of recommendation methods:
- Popularity
  - Sort the properties for each cluster from highest (most popular) to lowest, then display the top 20.
  - Simple and fast, but excludes new properties and properties that are rarely booked.
- Term Frequency-Inverse Document Frequency (TF-IDF)
  - A method borrowed from Natural Language Processing (NLP) that measures how relevant each feature is to a cluster.
  - In our case, Terms are query features (location, infant, pool, wifi etc.) and Documents are property clusters.
- Property similarity: K-Nearest Neighbours (KNN) recommendation
  - For each property, find similar queries by their features, weight the similarity (0-1), then sort the clicked properties by their weighting.
- Biased Matrix Factorisation
  - Approximate the interaction matrix as the product of two smaller low-rank matrices, with explicit bias terms for overall, per-cluster and per-property tendencies.
- Biased Matchbox
  - A hybrid approach that combines Biased Matrix Factorisation with the incorporation of user and item features.
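To make the Biased Matrix Factorisation idea concrete, here is a minimal sketch (not the production implementation) that fits a small toy interaction matrix with per-row and per-column bias terms plus low-rank factors, trained by stochastic gradient descent:

```python
import random

def factorise(R, k=2, steps=500, lr=0.05, reg=0.01, seed=0):
    """Biased matrix factorisation: R[u][i] ~ mu + bu[u] + bi[i] + dot(P[u], Q[i])."""
    rng = random.Random(seed)
    n_u, n_i = len(R), len(R[0])
    mu = sum(map(sum, R)) / (n_u * n_i)          # global mean
    bu, bi = [0.0] * n_u, [0.0] * n_i            # row (cluster) and column (property) biases
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_u)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_i)]
    for _ in range(steps):
        for u in range(n_u):
            for i in range(n_i):
                pred = mu + bu[u] + bi[i] + sum(P[u][f] * Q[i][f] for f in range(k))
                err = R[u][i] - pred
                bu[u] += lr * (err - reg * bu[u])
                bi[i] += lr * (err - reg * bi[i])
                for f in range(k):
                    P[u][f], Q[i][f] = (P[u][f] + lr * (err * Q[i][f] - reg * P[u][f]),
                                        Q[i][f] + lr * (err * P[u][f] - reg * Q[i][f]))
    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(P[u][f] * Q[i][f] for f in range(k))
    return predict

# Toy cluster-property interaction matrix (1 = click/booking, 0 = no interaction).
R = [[1, 0, 1, 0],
     [1, 0, 0, 1],
     [0, 1, 0, 1]]
predict = factorise(R)
print(round(predict(0, 0), 2))  # close to 1
print(round(predict(0, 1), 2))  # close to 0
```

The bias terms capture how popular a property (or cluster) is overall, while the low-rank factors capture the interdependencies between clusters and properties.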
Interested in how our Machine Learning experts could implement a similar project for you? Get in touch.
Evaluation and results
The models were evaluated using test bookings data from Melbourne, the Gold Coast and Brisbane, which was set aside from the original data set. Three measures were used to assess performance:
- Precision@20: was the booked property among the first 20 results?
- NDCG@20: NDCG constrained to the first 20 results
- NDCG: not constrained to the first 20 results

Normalised Discounted Cumulative Gain (NDCG) weights each result by where on the list the booked property appears:
- 1: the booked property is the first recommended property
- 0: the booked property is not among the recommendations (for NDCG@20, not among the first 20)
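For a single booked property with binary relevance, NDCG reduces to 1/log2(rank + 1), since the ideal ordering puts the booked property first. A minimal sketch:

```python
import math

def ndcg_at_k(ranked, booked, k=None):
    """NDCG for a single relevant (booked) item: 1/log2(rank + 1), 0 if outside top-k."""
    pool = ranked if k is None else ranked[:k]
    if booked not in pool:
        return 0.0
    rank = pool.index(booked) + 1     # 1-based position in the list
    return 1.0 / math.log2(rank + 1)  # ideal DCG is 1 (booked item ranked first)

print(ndcg_at_k(["p1", "p2", "p3"], "p1"))             # → 1.0
print(round(ndcg_at_k(["p1", "p2", "p3"], "p3"), 3))   # 1/log2(4) → 0.5
```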
Results: The Biased Matchbox model provided the best result, with a 308x uplift in conversions on the test data.
What the numbers mean
At 49%, the Biased Matchbox result means that with our recommender model, one in two searches would see the booked property appear in the first 20 results, i.e. on the first page. By comparison, the benchmark figure of 0.16% is roughly one in 600. That’s a huge lift in relevance which, all other things being equal, should drive a significant uplift in conversions.
While this project was a proof-of-concept that is not yet in production*, the work was successful in demonstrating that a recommendation engine could be developed without personal user data and without being able to identify who a user is. Using historical interaction data (clicks, bookings and property features), the prototype delivered a substantial uplift in performance for all of the recommendation methods applied; in particular, the two Biased Matrix Factorisation methods performed exceptionally well, with an increase of 300x over the benchmark conversion rate.
Clustering was used in this project to support the recommendation engine. Other notable applications of this machine learning technique include medical imaging, image segmentation, customer segmentation, social network analysis, anomaly detection and crime analysis.
How we can help you solve a similar problem
If you are looking to improve the conversion rate of your website or application and want to explore the use of recommendation and personalisation services, get in touch with one of our Client Engagement Principals to discuss your interest in this technology.
*This client cannot be named. Unfortunately their business was heavily affected by COVID-19 and as a result could not take the next step to get this PoC live. We hope to implement this solution in the future.