CASE STUDY

Powersensor: Accelerating data-driven decision making through automation

With key product metrics at its fingertips, Powersensor is driving towards scalable and sustainable growth.

At a glance

DiUS helped Powersensor automate a process that was previously manual, time-consuming, and expensive, enabling the startup to prioritise feature improvements, improve its business operations, and significantly reduce its operational costs.

Meet Powersensor

Powersensor is the first of its kind in Australia. It offers a range of self-install energy monitoring solutions that provide whole-of-house, real-time energy data—down to the appliance level.

The Powersensor solution consists of one or two self-install IoT devices (sensors) and Wi-Fi-enabled plugs, which collect and analyse grid, appliance, and solar data using a unique cloud-based algorithm. The mobile app interfaces with the IoT devices to provide real-time and historic insights across one or many homes (or small businesses), anytime and anywhere.

Powersensor also allows customers within the energy supply chain, such as energy networks and retailers, to access fleet data from multiple sites to understand customer behaviour and manage broader challenges like grid demand and tariff implementation.

The challenge

Like many organisations, Powersensor has access to a wide range of data sources, including its CRM (Customer Relationship Management) system, its mobile app, and the IoT devices themselves. However, as a startup with limited budgets and resources, Powersensor needed a way to take the data already being collected at an individual level and analyse it across multiple accounts to inform its product roadmap. Until then, this analysis had been performed manually by its product team.

What we did

Starting small, DiUS worked with Powersensor’s product and engineering teams to understand the business problem, along with the data and data sources available. They then workshopped how to bring the data in, where to store it, and how best to visualise it.

Data platforms are often cost-prohibitive, and that is before counting the internal capabilities needed to extract business value from data sources that are frequently disjointed. The joint DiUS and Powersensor team therefore agreed on a minimum viable product that would be built incrementally in three stages, with each stage adding business value to Powersensor.

Stage one: automated report generation

The objective of this initial stage was to automate the report generation process and create the foundation of the data lake infrastructure.

Previously, the reports were manually generated by the Powersensor product team and involved downloading the account-level CSV files from the CRM and analysing the data in Google Sheets. These reports included the total number of Powersensor accounts being installed, the breakdown by product type, as well as the number of additional devices being added at an account level.

In just two weeks, the team built an automated solution that allowed account-level CSV files from the CRM to be uploaded to an Amazon S3 bucket. They used AWS Glue to process the data and store the output in S3 so that it could be queried using Amazon Athena, which also created the beginning of the data lake. Additionally, Amazon QuickSight was used to create an interactive dashboard to help visualise the data. The automation saves three to four hours a month in manual effort.
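To make the reporting step concrete, here is a minimal sketch in plain Python of the kind of summary the automated job produces: total accounts, a breakdown by product type, and the number of additional devices. It is not the team's AWS Glue job. The column names (account_id, product_type, additional_devices) are hypothetical, since the case study does not describe the CRM export format.

```python
import csv
import io
from collections import Counter


def summarise_accounts(csv_text):
    """Summarise an account-level CSV export.

    The column names used here are assumptions for illustration only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    by_product = Counter()
    total_accounts = 0
    additional_devices = 0
    for row in reader:
        total_accounts += 1
        by_product[row["product_type"]] += 1
        # Treat a blank cell as zero additional devices.
        additional_devices += int(row["additional_devices"] or 0)
    return {
        "total_accounts": total_accounts,
        "by_product": dict(by_product),
        "additional_devices": additional_devices,
    }
```

In the real pipeline the equivalent aggregation would run over the files in S3 rather than over an in-memory string.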

Stage two: streaming fleet metric data

The objective of stage two was to provide a more flexible way of querying the data, along with scalable and less costly infrastructure for running analytics queries.

Powersensor processes millions of messages every day, as devices in the field measure energy and send data every 30 seconds. This data includes raw energy readings and fleet metrics such as battery levels, Bluetooth connectivity, and Wi-Fi strength, to name a few.

However, Powersensor’s product and engineering teams had a hard time accessing this data, which made it difficult to make informed product roadmap decisions. Additionally, asking the engineers to manually pull the data meant time spent away from building valuable product features.

The design challenges

Cost vs complexity

Storing the data in the data lake required the data catalogue to be kept up to date. One option was to use a data warehouse solution that would update the catalogue automatically. However, while an out-of-the-box solution would reduce system complexity and development effort, it would also significantly increase running costs. The team therefore decided to go with a file-based solution, taking care of the data partitioning itself and updating the data catalogue via scheduled jobs in AWS Glue.

The presentation and usability

Another key decision was to make the data queryable within Amazon Athena and not to spend time on building dashboards within Amazon QuickSight. By having access to the data, the product and engineering teams would be able to write queries and analyse the results, allowing them to better understand how the fleet was performing and helping them make more informed decisions.

The historical data

In order to further reduce costs, a final decision was made to only capture new fleet metrics in the data lake. That meant not migrating older fleet data into the data lake and saving time on the development effort required to backfill the historical data.

The outcome

In just four weeks, the team built a solution that receives data into the raw zone of the data lake. Using Amazon Kinesis Data Firehose, the data is now partitioned by the date it was generated. This allows devices that aren’t always connected to send data when they do come online. A scheduled job then runs every night to prepare the data catalogue for the next day’s data, making it queryable in Amazon Athena.
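The two ideas in this outcome, partitioning by generation date and preparing the catalogue a day ahead, can be sketched as follows. This is an illustration under stated assumptions, not the team's implementation: the `dt=YYYY-MM-DD` prefix layout, the table and bucket names, and the use of an Athena `ALTER TABLE ... ADD PARTITION` statement are all hypothetical choices. The case study only says that a scheduled job updates the catalogue.

```python
from datetime import date, datetime, timedelta, timezone


def partition_prefix(generated_at, base="raw/fleet_metrics"):
    """Return the key prefix for a reading, based on when it was generated.

    A late-arriving reading still lands in the partition for the day it was
    measured, not the day it was received. Expects a timezone-aware datetime.
    """
    day = generated_at.astimezone(timezone.utc).date()
    return f"{base}/dt={day.isoformat()}/"


def add_partition_sql(table, bucket, day, base="raw/fleet_metrics"):
    """Build DDL that registers one day's partition in the data catalogue."""
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt = '{day.isoformat()}') "
        f"LOCATION 's3://{bucket}/{base}/dt={day.isoformat()}/'"
    )


def next_day_partition_sql(table, bucket, today):
    """What a nightly job might run so tomorrow's data is queryable on arrival."""
    return add_partition_sql(table, bucket, today + timedelta(days=1))
```

For example, `next_day_partition_sql("fleet_metrics", "example-bucket", date(2024, 3, 1))` builds the statement for the 2 March partition.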

Another key outcome was to maintain the data partitioning in S3 for query efficiency, ensuring that Amazon Athena only loads and processes the data needed for the time period of interest.
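As a rough illustration of why the partitioning matters, the sketch below builds a query that filters on the partition column, so only the days of interest need to be read. The table name, the `dt` partition column, and the `device_id` and `battery_level` columns are assumptions made for the example. The case study does not publish the schema.

```python
from datetime import date


def fleet_metrics_query(start, end, table="fleet_metrics"):
    """Build a SQL query limited to the partitions from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    return (
        "SELECT device_id, AVG(battery_level) AS avg_battery "
        f"FROM {table} "
        # Filtering on the partition column limits the scan to those days.
        f"WHERE dt BETWEEN '{start.isoformat()}' AND '{end.isoformat()}' "
        "GROUP BY device_id"
    )
```

A query without the `dt` filter would still work, but it would scan every partition in the table.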

Finally, having easy access to this data has allowed the Powersensor product team to better understand how the fleet’s performing. Analysing the data is helping the team to prioritise new features. Furthermore, they can now easily measure if these new features are having the desired impact on fleet performance.

Stage three: aggregated energy data

The final stage was to build upon the first two stages and load the aggregated energy data into the data lake. This energy data is the most valuable data that Powersensor collects, but it’s also the largest in terms of volume and size. Due to the high value of the energy data, ingesting both the incoming data and the historical data was a key requirement.

The design challenges

The interval-based data

The energy data is interval-based by nature, and each update received for an interval creates a duplicate record for that same interval. The team had to implement a two-step process to ensure that the latest version of the data was being retrieved.

Cost vs scalability

Based on the initial discussions with the Powersensor engineering team, it was agreed that a streaming data transformation job would be the most appropriate solution for deduplication. However, processing large volumes of data in real time can be expensive. As a result, the team designed a batch solution that would not only deliver the results needed, but also be more cost-effective.

The outcome

In just six weeks, the team built an ingestion and curation pipeline for the energy data. For the first step, new data was stored as it came into the data lake using the same pattern described in stage two, again partitioning the raw data by the date it was generated. This maintained the ability to query the energy data in the data lake.

In the second step, the team processed and deduplicated the records twice a day, then saved the deduplicated values into the curated zone, consolidating the files and optimising their format for the best performance. This means that both the raw unprocessed data and the deduplicated, curated data are stored on S3 and can be queried via Amazon Athena.
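The deduplication step can be sketched as a small batch function that keeps only the most recent version of each interval. This is a minimal illustration rather than the team's job: the field names (device_id, interval_start, received_at, energy_wh) are hypothetical, and the real pipeline works on files in S3 rather than on in-memory dictionaries.

```python
def latest_per_interval(records):
    """Keep the most recently received record for each device and interval.

    Each record is a dict with hypothetical keys: device_id, interval_start,
    and received_at (comparable values, for example ISO 8601 strings).
    """
    latest = {}
    for rec in records:
        key = (rec["device_id"], rec["interval_start"])
        current = latest.get(key)
        # On a tie, the record seen later in the batch wins.
        if current is None or rec["received_at"] >= current["received_at"]:
            latest[key] = rec
    return sorted(
        latest.values(),
        key=lambda r: (r["device_id"], r["interval_start"]),
    )
```

Running this as a scheduled batch, rather than as a streaming job, trades some freshness for lower cost, which matches the design decision described in this stage.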

Having the energy data within the data lake allows the Powersensor product team to analyse customer energy data, reduce the time to support customer enquiries, offer fleet energy data reporting as a feature for customers, and investigate opportunities for machine learning and other insights.

Other benefits of the data platform

In addition to the objectives of the three stages, building the data platform has helped achieve some other improvements. These included:

Reducing the cost of the operational database: Since the data is being constantly ingested into the data platform, the operational Amazon DynamoDB database doesn’t need to keep the entire history of the fleet view data, which has reduced the cost of the operational database by over $2,000 (USD) per month.
Providing further opportunity for cost reductions: Powersensor is now assessing the possibility of further reducing the amount of historical energy data stored in the operational database, similar to what was done with the fleet view data.

Results for Powersensor

By building a data platform, Powersensor’s product and engineering teams are now able to analyse the data across multiple accounts with a simple SQL query, while automating a process that was previously manual, time-consuming, and expensive.

The interactive dashboard provides data-derived insights such as kit installations and indicative sales numbers. The ability to query fleet data and energy data is helping to feed the evidence-based product roadmap. With key product metrics at its fingertips, Powersensor is not only able to prioritise feature and product improvements, but also able to drive towards scalable and sustainable growth and improve its business operations.

Additionally, the data lake architecture has enabled Powersensor to reduce its operational costs by removing the data from Amazon DynamoDB and storing it in an S3 bucket at a much lower cost.
