From foundation and large language models to prompt engineering and fine-tuning, generative AI has become the talk of the tech town. So, with the recent explosion of AI, are business priorities shifting? And is generative AI the way of the future?
Last week, we heard from Kirsty Miller, Head of Strategy and Growth, and Shahin Namin, Senior Machine Learning Engineer, at DiUS on the latest trends and insights into how our clients are experimenting with, applying and optimising AI/ML.
Watch the live stream replay of their talk here: Are AI/ML business priorities shifting?
Following their talk, Kirsty and Shahin opened the floor to questions, and while they tried to answer all of them during the session, there were many more to unpack. So, we thought we’d capture them here for anyone who either missed their talk or wanted to refer back to them.
For visual models, should the dataset be labelled (like images with descriptions)? And if not, what alternative approaches can be explored to address the challenges associated with dataset labelling of visual model training?
The expected dataset differs depending on the model. For example, with image generation diffusion models, the dataset usually consists of prompts paired with their associated images. For other types of models, such as image variation, the model expects an input image, a prompt, and the image modified according to that prompt. Some other models that perform image-to-image translation can be trained with unpaired sets of images. For example, if your model is going to learn to change an image of a scene in summer to the same scene in winter, and vice versa, you can train it as long as you have images of both summer and winter scenes; they don't need to correspond one-to-one. So the short answer is: it depends on the model and the application.
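To make the three dataset shapes above concrete, here is a minimal sketch in Python. The record types and file names are illustrative assumptions, not any particular framework's API:

```python
from dataclasses import dataclass

# Hypothetical record types illustrating the dataset shapes described above.

@dataclass
class TextToImageExample:
    """Paired data for image-generation diffusion models."""
    prompt: str
    image_path: str

@dataclass
class ImageEditExample:
    """Paired data for image-variation models: input, instruction, output."""
    input_image_path: str
    prompt: str
    edited_image_path: str

# Unpaired image-to-image translation (e.g. summer <-> winter):
# two independent collections, no one-to-one correspondence required.
summer_images = ["summer_001.jpg", "summer_002.jpg"]
winter_images = ["winter_101.jpg", "winter_102.jpg"]

example = TextToImageExample(prompt="a snowy mountain at dawn",
                             image_path="mountain_dawn.jpg")
```

The point is that the labelling effort differs sharply: the first two shapes need per-image annotations, while the unpaired case only needs images sorted into the two domains.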
Have you come across anyone who has started/considered using ML, but given up?
When it comes to ML, it's not a matter of performing one experiment, proving value, putting it into production and being done. Nor is it a matter of giving up after one failed experiment. Like many other technologies, especially emerging ones, success requires continual investment and experimentation.
We know of many organisations that have done a proof of concept and not taken the next steps to put it into production. This can be because of model accuracy / performance issues or lack of internal MLOps skills. When moving from proof of concept to production, there may be many other considerations such as:
- Does the model need to be rewritten in a production-appropriate language, and designed for robustness and scale?
- How will your model be integrated into an application or feature?
- What is the user experience associated with that application or feature?
- Does your solution require a human-in-the-loop?
- What automated infrastructure and data pipelines are required to support your model / solution?
- Do you have a framework to monitor, retrain and improve your model to ensure that it continues to perform in a real-world context?
Generally, we find these organisations don’t give up, rather they are stuck in ‘proof-of-concept purgatory’. If there is business value, organisations tend to persist and find a way forward.
However, if a proof of concept hasn’t delivered business value then an organisation may be less likely to persist. We advise organisations to start an ML journey by picking a problem that delivers business value, but is also solvable. Demonstrating progress and success early on will deliver learnings and enable the momentum to continue.
The following tips provide some guidance for picking the right problem to solve with ML:
- Use design thinking to select the most promising ML ideas. The sweet spot for innovation is at the intersection of desirability, feasibility and viability. Applying the same lenses to ML ideas can help assess ones with the most potential.
- You can start with a relatively small dataset. It’s much better to start with what you have and explore ways of expanding your dataset through data augmentation techniques and using external data sources.
- Look at data challenges as an opportunity. By running an ML experiment, you can gain insights into the quality of your data and its challenges to help inform or reinforce your data strategy for ML, which may include a refined data collection approach.
- To experiment well, think about more than just the tech. Beyond proving the technology, experiments can be designed to validate assumptions that relate to desirability. These can be run early on and in parallel to your tech activities.
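The data augmentation tip above can be sketched with a toy example. Images here are plain 2D lists and the transforms are hand-rolled; a real project would use a library such as Pillow or torchvision, so treat every name below as an illustrative assumption:

```python
import random

def hflip(img):
    """Mirror each row of a 2D 'image' horizontally."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate a 2D 'image' 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(dataset, n_extra_per_item=2, seed=0):
    """Expand a small labelled dataset by adding transformed copies.

    Each (image, label) pair gains n_extra_per_item randomly chosen
    augmented versions; the label is preserved.
    """
    rng = random.Random(seed)
    transforms = [hflip, rot90]
    out = list(dataset)
    for img, label in dataset:
        for _ in range(n_extra_per_item):
            out.append((rng.choice(transforms)(img), label))
    return out

tiny = [([[1, 2], [3, 4]], "cat")]
bigger = augment(tiny)  # 1 original + 2 augmented examples
```

Even this trivial version triples the dataset without collecting new data, which is the spirit of starting with what you have.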
Which industries do the AI/ML survey respondents come from?
The 393 respondents come from a broad range of industries:
- Computers and Electronics
- Financial Services
- Travel and Leisure
- Life Sciences
- Media and Advertising
- Primary Industries
- Professional Services
- Public Sector
- Software and Internet
Are there any meaningful insights when you break the AI/ML survey results down by industry?
There weren't any strong differences in the adoption of ML across industry sectors. Based on our commercial experience, we would have expected fintech, industrial and other digital-based businesses that collect a large amount of data to present more strongly in comparison to other industries.
The survey findings did however show that larger organisations have a higher ML adoption rate than smaller ones. We posited that larger organisations would be more likely to have a dedicated strategy and capability to drive ML success. It might also be that larger and more mature organisations have a history of data governance and collecting data, making ML a much easier next step in terms of leveraging data for insight and value.
Although our findings show that larger organisations are doing better, our commercial experience is that smaller organisations, including startups, are in fact succeeding with ML. For startups and smaller organisations, we've observed that unless ML is part of a core business proposition, it's less likely to be considered until they achieve product-market fit and are ready to scale.
Do standard models have limits on the number of tokens that they can process?
Yes. OpenAI's models, for instance, have a fixed context window: depending on the model used, requests can use up to 4,097 tokens shared between the prompt and the completion.
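A practical consequence is that a long prompt leaves fewer tokens for the completion. The sketch below uses a common rough rule of thumb (roughly four characters per token for English text) rather than a real tokenizer; for exact counts you would use a tokenizer library such as OpenAI's tiktoken, and the limit constant is just the figure quoted above:

```python
MODEL_TOKEN_LIMIT = 4097  # shared between prompt and completion

def approx_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English."""
    return max(1, len(text) // 4)

def max_completion_tokens(prompt: str, limit: int = MODEL_TOKEN_LIMIT) -> int:
    """Approximate tokens left for the completion after the prompt."""
    return max(0, limit - approx_tokens(prompt))

prompt = "Summarise the following article in three bullet points: ..."
remaining = max_completion_tokens(prompt)
```

Budgeting like this matters when you ask for long outputs: if the prompt alone approaches the limit, the completion will be truncated or the request rejected.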
How much is the cost of a predictive ML project?
We usually run a proof-of-concept (POC) for such projects and focus purely on model performance. Usually our engagements for POCs and model development are in the 4-6 week ballpark. Once we're happy with the model and its performance, we move on to productionisation. However, depending on the application, we might be able to train the right model in a shorter amount of time, or in more complex cases, it might require more effort.
DiUS’ AI/ML workshop helps companies get started with AI/ML or review and recommend strategies to improve an existing AI/ML-powered solution. With access to the right expertise, we can help you achieve your desired business outcomes.