CASE STUDY

Programa: Utilising AI to maximise data efficiency

Combining machine learning and generative AI to tackle product catalogue deduplication

At a glance

Programa recognised it possessed valuable data that was not being effectively utilised and strategically sought to maximise its potential. Faced with the issue of duplicate entries in its product catalogue, Programa turned to DiUS for AI-driven solutions. DiUS’ sophisticated approach combined traditional machine learning with state-of-the-art generative AI to create a precise and efficient deduplication system. The initiative achieved a 90% match accuracy in identifying duplicate products, providing Programa with a solid foundation of clean catalogue data. This strategic effort supports improved user experience, search functionality, and SEO, positioning the company for continuous product innovation.

Our services

Meet Programa

Founded in 2020, Programa aims to modernise the manual, spreadsheet-dependent processes of the interior design and architecture industry. Its software enables designers and architects to focus on creativity and business by streamlining project management, process coordination, and product cataloguing.

The challenge

Establishing data foundations for enhanced user experience and product growth

Interior designers frequently use a wide range of products, such as chairs, side tables, lighting, and tapware, sourced from multiple brands. Within Programa, these products are either manually entered by designers, sourced from the web, or inputted directly from suppliers. Over time, this has led to a significant number of duplicate entries in Programa’s product catalogue, with thousands of new products added each day. This rapid growth led to a considerable number of duplicate entries, which negatively impacted user experience, product search efficiency, and recommendation accuracy. Moreover, the presence of duplicate products hindered Programa’s SEO performance and compromised the quality of analytics available to suppliers and interior designers.

Programa realised it had valuable data that was not being utilised and strategically sought to make better use of it. Recognising the importance of a clean, master product catalogue, Programa aimed to start by deduplicating its extensive product database to provide a foundational dataset to build on. The objectives were to boost SEO performance, to provide accurate analytics for both suppliers and interior designers, and to use the data to drive new product features.

To achieve these goals, Programa sought the expertise of DiUS to explore the application of AI and machine learning technologies. Programa selected DiUS for its demonstrated expertise in AI and machine learning, as well as its innovative and practical approach to problem-solving.

What we did

Innovative use of traditional machine learning and LLMs to solve catalogue duplication challenges

DiUS’s ability to quickly prototype and experiment with solutions provided Programa the confidence needed to advance the ML initiative. After a tightly timeboxed experiment proved feasibility and value, the solution was put into production.

To tackle the duplication challenge, DiUS implemented a strategy that combined traditional machine learning techniques with advanced generative AI technologies. The core of the solution involved deduplicating the product catalogue using a multi-modal model that integrated both text and image data to effectively identify duplicate products. A robust data processing pipeline was established to manage the daily influx of new products. This pipeline included several critical steps: data ingestion, data processing, embedding generation, and duplicate detection.

The cleaned and indexed product list resulting from this deduplication process was stored in Amazon OpenSearch, facilitating efficient search and retrieval of product information. To ensure scalable and automated operations, daily batch jobs were implemented. These jobs compared new product entries against the Master Product Catalogue, preventing the addition of duplicates.

 

For ongoing evaluation of the deduplication process, DiUS integrated Claude, a large language model (LLM) hosted on Amazon Bedrock. This integration enabled automatic assessment of the deduplication model’s accuracy, allowing Programa to drastically reduce the manual effort previously required for accuracy checks, improving operational efficiency.

We chose Amazon Bedrock for its ease of use, allowing for rapid experimentation across multiple LLM’s and replacement of models as newer more performant and cost-effective models such as Claude 3.5 emerge.

Results for Programa

Empowering growth and innovation with high-accuracy deduplication

The implementation of the deduplication solution for Programa brought about several significant benefits. The multi-modal model achieved high accuracy in identifying duplicates, with strong performance metrics such as precision, recall, and F1 score. Specifically, the deduplication model reached a 90% match accuracy in identifying duplicate products.

This improved accuracy supports an enhanced user experience, with better search functionality and more relevant product recommendations to users. The cleaner product catalogue also supports improved SEO performance and enables more precise analytics, providing valuable insights for both suppliers and designers.

Operational efficiency saw a marked improvement as well. The automated evaluation process using LLMs reduced manual verification efforts from several hours to just minutes, streamlining the deduplication verification process and providing Programa the confidence to deploy to production. 

Building on the accurate data provided by the foundational deduplication work, Programa has already developed a new auto-complete feature that is currently in testing. This feature aims to enhance productivity for designers by preemptively suggesting products as they are added to projects, reducing data entry time and ensuring greater accuracy. Moreover, the data is now being used for valuable analytics, aiding Programa’s sales and marketing teams. The foundational work has also opened up new opportunities, like helping suppliers better connect with their target audience by claiming their products on Programa’s platform.

Furthermore, in a testament to the successful collaborative approach between DiUS and Programa, the solution has been successfully transitioned to Programa’s team and is working well. Programa’s own data and AI specialists now manage the deduplication system, ensuring it continues to provide value and is adapted to future needs. The foundational work laid by DiUS has empowered Programa to build on this success and explore new opportunities for innovation and growth.

Words from our customer

“DiUS came in with a clear, time-boxed approach that focused on delivering value quickly using machine learning and generative AI. They didn’t over-engineer the solution; instead, they experimented and showed us what was possible within a few weeks. This approach gave us the confidence to achieve our goals without unnecessary risks. DiUS’s expertise and focused strategy enabled us to take ownership of the solution, allowing us to continue building and innovating on a strong data foundation. This collaboration has been instrumental in driving our growth and exploring new business opportunities.”

-Steve Bartlett, CTO, Programa

Want to know more about how DiUS can help you?

Offices

Melbourne
Level 3, 31 Queen St Melbourne, Victoria, 3000

Phone: 03 9008 5400

Sydney
The Commons

32 York St Sydney,

New South Wales, 2000

DiUS wishes to acknowledge the Traditional Custodians of the lands on which we work and gather at both our Melbourne and Sydney offices. We pay respect to Elders past, present and emerging and celebrate the diversity of Aboriginal peoples and their ongoing cultures and connections to the lands and waters of Australia.

Subscribe to updates from DiUS

Sign up to receive the latest news, insights and event invites from DiUS straight into your inbox.

© 2024 DiUS®. All rights reserved.

Privacy  |  Terms