The use of Large Language Models (LLMs) in chatbot applications has become increasingly common because they can offer a better user experience through conversational interactions. This blog provides a brief overview of traditional rule-based chatbot paradigms before exploring LLM-based chatbots and their core components. My objective is to offer a clear understanding of why LLMs are being integrated into chatbot technology and how the different components of an LLM-based chatbot work together to respond meaningfully to users.
Traditional chatbots
Traditional chatbots work well in structured environments where user interactions are more predictable and straightforward. They are distinguished by the way they assess the user’s input, referred to as “utterances”, which are typically questions or commands. Traditional chatbots can be divided into two main categories:
- Rule-based chatbots operate by matching utterances against a predefined set of rules. The chatbot scans the user’s input for specific keywords or phrases and responds based on how its rules are programmed. If the user’s utterance closely matches one of these predefined rules, the chatbot can respond appropriately. However, these systems often struggle with variations in phrasing or unexpected inputs that don’t fit their limited set of rules. To address this, they may restrict the user by presenting a fixed set of options.
- Intent-based chatbots operate by identifying the user’s intent from their utterances. The system categorises the input into predefined intents (the purpose or goal of the user’s input) and responds based on its understanding of this intent. This approach allows for somewhat more flexibility in user input than a strictly rule-based system, as the chatbot can recognise the intent even if the exact phrasing varies. However, it still relies on having a well-defined set of intents mapped to predefined responses.
For handling more complex interactions, both types might use decision trees, guiding the conversation through a series of user choices. They may be restricted in the number of queries they can address by the friction associated with navigating the possible options. Extending their capabilities to address a broader range of queries involves defining new rules or intents, a process that requires manual effort to capture each additional utterance.
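To make the rule-based approach concrete, below is a minimal Python sketch of keyword matching against predefined rules. The rules, responses and fallback message are purely illustrative.

```python
# A minimal, illustrative rule-based chatbot: scan the utterance for keywords
# and return the canned response attached to the first matching rule.
RULES = {
    ("open", "hours", "opening"): "We are open 9am to 5pm, Monday to Friday.",
    ("refund", "return"): "You can request a refund within 30 days of purchase.",
}
FALLBACK = "Sorry, I didn't understand. I can help with: opening hours, refunds."

def respond(utterance: str) -> str:
    text = utterance.lower()
    for keywords, response in RULES.items():
        if any(keyword in text for keyword in keywords):
            return response
    # Unexpected phrasing falls through to a fallback that restricts the user's options.
    return FALLBACK

print(respond("What time do you open on Monday?"))  # matches the first rule
print(respond("Tell me a joke"))                    # falls back to the listed options
```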
Components of an LLM-based chatbot
Large Language Model (LLM)
LLM-based chatbots use the inherent capabilities of an LLM to generate natural responses to text input. Their reasoning capabilities allow them to follow instructions, understand user queries, and process and synthesise information from different sources. They are more capable conversationalists than traditional chatbots and are better suited where there is a wide range of questions the user could ask, or where the input is more varied or complex.
The text input to an LLM is called a prompt. A prompt usually provides some instructions along with the user’s input. Optionally, it can also include additional structured information, such as a summary of the conversation so far (from the conversational memory) or external information (from a retrieval system).
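As a rough illustration, a prompt might be assembled along these lines. The template and parameter names are assumptions for the sketch; real applications structure this differently.

```python
from typing import Optional

def build_prompt(user_input: str,
                 conversation_summary: Optional[str] = None,
                 retrieved_context: Optional[str] = None) -> str:
    """Combine instructions, optional memory and retrieved context, and the user's input."""
    parts = ["You are a helpful assistant. Answer using the context provided."]
    if conversation_summary:
        parts.append(f"Conversation so far:\n{conversation_summary}")
    if retrieved_context:
        parts.append(f"Relevant information:\n{retrieved_context}")
    parts.append(f"User question: {user_input}")
    return "\n\n".join(parts)

print(build_prompt(
    "How many legs do they have?",
    conversation_summary="The user has been asking about dogs.",
    retrieved_context="Dogs are quadrupeds.",
))
```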
Their main advantages over traditional chatbots are:
- Human-like responses: LLMs can generate natural, conversational responses that are closely tailored to the specific question asked, creating an experience that feels less like interacting with a machine and more like conversing with a human. This makes them far more likely to pass the Turing Test than a traditional chatbot.
- Handling multifaceted questions: LLMs are adept at considering various aspects of complex, multifaceted questions, providing comprehensive and coherent responses.
- Efficient answers: These models can often get directly to the answer without needing multiple layers of questions.
- Extensive built-in knowledge: LLM chatbots have access to a broad range of information from their extensive training datasets. This allows them to fill in gaps in user queries with relevant information and context.
Conversational memory
LLM-based chatbots often utilise conversational memory to retain information from earlier in a conversation, which helps them to produce fluid and contextually aware interactions. The conversation history is updated as the user and LLM interact. In each request to the LLM, the conversation history is included so the LLM can consider how the user’s most recent question relates to the conversation.
In some cases, the conversational history may be presented to the LLM to reformulate the user’s question in light of the conversation. So if the conversation was about dogs and the user simply asks “How many legs do they have?”, the LLM will reformulate the question as “How many legs do dogs have?”, and this question will then be passed to the LLM to generate an answer. Segmenting the process in this manner enhances transparency, allowing for clearer identification of which stage might be responsible for any errors. Conversely, handling everything in a single request would obscure the LLM’s understanding of the conversation, making it challenging to identify errors in its interpretation of the conversation history.
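A sketch of this two-step flow is below: the LLM is first asked to rewrite the question in light of the conversation history, then asked to answer the standalone question. `call_llm` is a hypothetical stand-in for whichever LLM client you use.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with a call to your LLM provider.
    raise NotImplementedError

def answer_with_memory(history: list[str], question: str) -> str:
    history_text = "\n".join(history)
    # Step 1: reformulate the question so it makes sense on its own.
    standalone_question = call_llm(
        "Rewrite the final question so it can be understood without the "
        f"conversation history.\n\nHistory:\n{history_text}\n\nQuestion: {question}"
    )
    # Step 2: answer the standalone question, e.g. "How many legs do dogs have?"
    return call_llm(f"Answer this question: {standalone_question}")
```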
While traditional chatbots are capable of basic conversational memory, their ability is constrained by their reliance on pre-programmed rules and responses. For example, they may be programmed to remember specific pieces of information like names, preferences or previous choices. However, the capability of LLMs is significantly greater, and requires much less manual setup and programming. An LLM can use the conversational memory to build on the knowledge gathered over the course of the conversation, refining and enhancing its subsequent responses.
Retrieval
Retrieval is the process of finding relevant information for the LLM to consider in its response to the user. Retrieval works by analysing an input (such as the user’s query), retrieving relevant information from a knowledge base (such as a collection of documents), and then passing this information to the LLM.
Retrieval is frequently used in LLM chatbots, as it allows the pre-existing knowledge of the LLM (from the large amount of text data it was trained on) to be extended with information that you would like the chatbot to be able to discuss. This can help it to answer questions about information that it was not trained on, and can reduce the risk of it hallucinating (making up answers). It is easier and cheaper than the alternative method of fine-tuning, which involves additional training of the LLM using prepared questions and answers based on your information. It is also much easier to continuously update a knowledge base that the LLM refers to than the LLM itself. This is particularly the case given that new, more capable models are released regularly.
A key advantage of an LLM chatbot over a traditional chatbot is that retrieval can be used to dramatically extend the number of questions that the chatbot is capable of answering. You do not need to manually define the question (or intent) and corresponding response for every extra question your chatbot can answer.
A retrieval strategy will regularly make use of a vector database and embeddings (numerical representations of text) to quickly search over a large volume of unstructured text data to identify text relevant to the user’s query.
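As a rough sketch of this strategy, the snippet below embeds the documents and the query and ranks the documents by cosine similarity. `embed` is a hypothetical stand-in for an embedding model, and a production system would use a vector database rather than scoring a Python list.

```python
import math

def embed(text: str) -> list[float]:
    # Hypothetical placeholder: replace with a call to your embedding model.
    raise NotImplementedError

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    query_vector = embed(query)
    scored = [(cosine_similarity(query_vector, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # The highest-scoring documents are passed to the LLM as context.
    return [doc for _, doc in scored[:top_k]]
```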
Observation
Observation is a key, but often overlooked, aspect of an LLM-based chatbot. It is essential to evaluate the performance of an LLM chatbot, both during development and in operation. Observability tools help to identify errors and where they occur in the chatbot pipeline. It’s important to capture the whole process in order to identify the root cause of inaccurate or unhelpful responses. Additionally, it is important to gather user feedback, giving users a way to flag problematic responses.
With an LLM chatbot, there is significant ongoing monitoring required when compared to traditional chatbots, as the output is far less predictable and confined. Each use case will need to carefully assess this tradeoff against the benefits of having a more capable conversational chatbot.
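One lightweight way to capture the whole process is to record the inputs and outputs of each pipeline stage, together with any user feedback, for every turn. The record structure below is an assumption; dedicated observability tools offer much richer tracing.

```python
import json
import time
import uuid

def log_turn(question, reformulated_question, retrieved_documents, answer, feedback=None):
    """Append one trace record per chatbot turn so errors can be traced back to a stage."""
    record = {
        "trace_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "user_question": question,
        "reformulated_question": reformulated_question,
        "retrieved_documents": retrieved_documents,
        "llm_answer": answer,
        "user_feedback": feedback,  # e.g. a thumbs up/down collected in the UI
    }
    with open("chatbot_traces.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```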
Considerations when building a chatbot
Regardless of the chatbot you’re building, you will want to have a strong understanding of the problem you’re trying to solve. This involves knowing what users are asking, their objectives, and which chatbot paradigm is best suited to their needs. Recording user inquiries and objectives is essential to understanding the user problem and defining clear goals for the chatbot.
The investment of time and resources in developing an LLM chatbot differs significantly from that of a traditional chatbot. In a traditional setup, the focus is on defining specific rules for responses. In contrast, with an LLM chatbot, more effort is required to guide and control the model due to its unpredictability. It’s vital to ensure the LLM acknowledges its limitations, avoids providing incorrect information, stays on topic, and responds appropriately. This is especially important for customer-facing chatbots, where recording customer details might be necessary for follow-ups on problematic interactions.
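The kind of guardrails referred to above are often expressed as instructions in the prompt. The wording below is purely illustrative, not a recommended template, and “Acme Co” is a made-up company.

```python
# Illustrative guardrail instructions prepended to every prompt (assumed wording).
GUARDRAIL_INSTRUCTIONS = """\
You are a customer support assistant for Acme Co.
- Only answer questions about Acme Co products, using the provided context.
- If the answer is not in the context, say you don't know and offer to record
  the customer's contact details so the team can follow up.
- Politely decline requests that are off topic, and never guess or invent facts.
"""
```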
Additionally, managing the retrieval process will involve regularly updating the information corpus, ensuring the chatbot has access to relevant and well-formatted data, and making adjustments when issues are observed.
Finally, the evaluation process for an LLM chatbot is continuous, unlike the more discrete assessment of traditional chatbots. This ongoing evaluation is necessary due to the expansive and dynamic nature of the questions and responses in an LLM-based system, as opposed to the finite set in rule-based systems. The volume of requests may necessitate incorporating further LLMs to assist in automating evaluation.
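For example, a second LLM can be asked to score recorded answers against the context the chatbot was given, which is one way of automating part of the evaluation. `call_llm` is again a hypothetical stand-in for your LLM client, and the scoring prompt is an assumption.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with a call to your LLM provider.
    raise NotImplementedError

def judge_answer(question: str, context: str, answer: str) -> str:
    """Ask a judge LLM to rate how faithful an answer is to the retrieved context."""
    prompt = (
        "You are evaluating a chatbot answer.\n"
        f"Question: {question}\n"
        f"Context provided to the chatbot: {context}\n"
        f"Answer: {answer}\n"
        "Rate the answer from 1 to 5 for faithfulness to the context, "
        "and explain the rating in one sentence."
    )
    return call_llm(prompt)
```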
Why are LLMs increasingly being used in chatbot applications?
Users have grown accustomed to the fluidity and convenience of text and messaging applications. Intent-based chatbots can roughly approximate this experience while maintaining close control over the responses provided to users. However, the widespread adoption of LLM-based chat applications, such as ChatGPT, has set a new expectation for conversational interfaces. This trend has exerted additional pressure on businesses to adopt LLM chatbots.
While user expectations for chat applications have evolved, it’s important to recognise that LLM-based chatbots and rule-based chatbots serve distinct purposes. The decision to implement one over the other should be grounded in a clear understanding of the specific problem at hand. The unpredictability, running costs and observability requirements of LLM chatbots will make them unsuitable for certain applications.
This is particularly the case where there is significant risk associated with an incorrect or misleading response. If you would like to explore using an LLM to solve your problem, please reach out to me or the team at DiUS – we’d love to chat.