By Sonny Mai and Duy Tin Truong
Swoop Aero’s end-to-end drone logistics platform collects vast amounts of operational data related to flights, equipment, customer service, and software development. However, relying on SQL expertise to answer a growing number of ad-hoc data requests created bottlenecks and slowed decision-making.
That’s where DiUS came in. As part of our work with Swoop Aero, we were tasked with developing a solution that would allow non-technical users to retrieve data without relying on technical intervention. In this blog, we’ll share how we built a generative AI-powered text-to-SQL solution to help Swoop Aero break down these barriers. We’ll discuss the challenges we tackled, the tools we used—including Amazon Bedrock—and the lessons learned along the way. Whether you’re a tech enthusiast or exploring generative AI for business, we hope our experience provides valuable insights.
The challenge: Democratising data access
Swoop Aero’s teams needed to monitor stock levels, track flight statistics, and analyse customer service interactions, but these tasks required SQL knowledge and a deep understanding of the database schema. This meant that non-technical users had to rely on the technical team for data retrieval, creating bottlenecks and delaying access to critical information.
Fielding these ad-hoc queries also limited the technical team’s ability to focus on higher-priority work. Swoop Aero needed a more efficient solution—one that would allow users across the business to query data directly using natural language, without requiring SQL knowledge.
The solution: Bridging natural language and data access with text-to-SQL
To tackle this challenge, the DiUS team worked with Swoop Aero to implement a text-to-SQL system that automatically converts natural language inputs into SQL queries. This system serves as a bridge between human language and Swoop Aero’s complex databases, allowing users to ask questions in plain English—without needing to write SQL or understand the underlying database structure.
Here’s how the process works:
- Natural language understanding: The system interprets the user’s question, breaking it down into logical components.
- Semantic parsing: It translates the question into a SQL query, mapping natural language expressions to database structures and operations.
- Database execution: The system executes the SQL query against the appropriate database.
- Results returned: Finally, the requested data is presented to the user in a readable format.
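To make these steps concrete, here is a minimal sketch of the pipeline in Python. This is not Swoop Aero’s production code: the Bedrock model ID, connection string, and prompt are placeholders we’ve assumed for illustration, but the shape (interpret the question, generate SQL with an LLM, execute it against Postgres, return the rows) matches the flow above.

```python
import boto3
import psycopg2

# Placeholders only; not Swoop Aero's actual model or database details.
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"
DB_DSN = "postgresql://readonly_user:password@hostname:5432/operations"

bedrock = boto3.client("bedrock-runtime")

def question_to_sql(question: str, schema_context: str) -> str:
    """Semantic parsing step: ask the LLM to translate the question into SQL."""
    prompt = (
        "You are a PostgreSQL expert. Using only the tables described below, "
        "write a single SQL query that answers the question.\n\n"
        f"Schema:\n{schema_context}\n\nQuestion: {question}\n\n"
        "Return only the SQL, with no explanation."
    )
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.0, "maxTokens": 1024},
    )
    return response["output"]["message"]["content"][0]["text"].strip()

def execute_query(sql: str) -> list[tuple]:
    """Database execution step: run the generated SQL and return the rows."""
    with psycopg2.connect(DB_DSN) as conn, conn.cursor() as cur:
        cur.execute(sql)
        return cur.fetchall()

def answer(question: str, schema_context: str) -> list[tuple]:
    """End-to-end: natural language in, query results out."""
    sql = question_to_sql(question, schema_context)
    return execute_query(sql)
```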
The system architecture combined several AWS services to deliver a robust solution:
- Amazon Cognito provided permission-based user access to Swoop Aero’s core platform and data.
- Amazon Bedrock provided the foundational AI models to interpret and translate natural language queries into SQL.
- AWS Lambda powered the serverless execution of SQL queries, ensuring efficient, on-demand processing.
- Amazon API Gateway enabled secure and scalable interaction between the chatbot interface and the backend.
- Amazon DynamoDB acted as a scalable NoSQL database, storing execution statuses, intermediate results, and chatbot memory for the text-to-SQL pipeline.
- Amazon RDS Postgres stored Swoop Aero’s data and facilitated fast query execution.
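As a rough illustration of how these pieces fit together, here is a hedged sketch of a Lambda handler sitting behind API Gateway. The DynamoDB table name and schema snippet are assumptions made for the example, and `answer` is the helper from the pipeline sketch above, not Swoop Aero’s actual deployment code.

```python
import json
import uuid
import boto3

# Hypothetical resource name and schema placeholder, for illustration only.
status_table = boto3.resource("dynamodb").Table("text-to-sql-executions")
SCHEMA_CONTEXT = "Table flights:\n  id, pilot_id, country, departed_at, ..."

def handler(event, context):
    """Lambda entry point behind API Gateway: run one question through the pipeline."""
    body = json.loads(event.get("body") or "{}")
    question = body["question"]
    execution_id = str(uuid.uuid4())

    # Record the request so the chatbot UI can poll its status.
    status_table.put_item(Item={"execution_id": execution_id, "status": "RUNNING"})
    try:
        rows = answer(question, SCHEMA_CONTEXT)  # `answer` is from the pipeline sketch above
        status_table.update_item(
            Key={"execution_id": execution_id},
            UpdateExpression="SET #s = :s",
            ExpressionAttributeNames={"#s": "status"},
            ExpressionAttributeValues={":s": "SUCCEEDED"},
        )
        return {"statusCode": 200,
                "body": json.dumps({"id": execution_id, "rows": rows}, default=str)}
    except Exception as exc:
        status_table.update_item(
            Key={"execution_id": execution_id},
            UpdateExpression="SET #s = :s",
            ExpressionAttributeNames={"#s": "status"},
            ExpressionAttributeValues={":s": "FAILED"},
        )
        return {"statusCode": 500,
                "body": json.dumps({"id": execution_id, "error": str(exc)})}
```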
DiUS developed a generative AI-powered chatbot that allowed non-technical users at Swoop Aero to retrieve data insights by simply asking questions in natural language. This chatbot was integrated directly with Swoop Aero’s internal Amazon RDS Postgres databases, providing access to accurate, real-time information without the need for SQL expertise.

Phase 1: Building the foundation
In the first phase, we focused on automating SQL query generation specifically for Swoop Aero’s technical team. The goal was to build a core text-to-SQL pipeline that enabled SQL queries to be generated from natural language inputs through a chatbot interface. While this phase required some familiarity with the database structure, it allowed technical staff to retrieve data much more efficiently.
One of the main challenges in this phase was handling a database with hundreds of tables. Manually writing schema descriptions for every table was time-consuming and unsustainable, especially as the development team continued to evolve the database. To tackle this, we automated the extraction of table descriptions directly from the database system, standardising them in a format that could be used as context for large language models (LLMs).
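A simplified version of that extraction might look like the following: it reads table and column metadata (including any column comments) from Postgres’s system catalogues and formats it as plain-text context for the LLM. The schema name and connection details are assumptions for the sketch.

```python
import psycopg2

SCHEMA_SQL = """
SELECT c.table_name,
       c.column_name,
       c.data_type,
       col_description(format('%I.%I', c.table_schema, c.table_name)::regclass,
                       c.ordinal_position) AS column_comment
FROM information_schema.columns c
WHERE c.table_schema = 'public'
ORDER BY c.table_name, c.ordinal_position;
"""

def load_schema_context(dsn: str) -> str:
    """Build a standardised, text-only schema description for use as LLM context."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(SCHEMA_SQL)
        rows = cur.fetchall()

    tables: dict[str, list[str]] = {}
    for table, column, dtype, comment in rows:
        line = f"{column} {dtype}" + (f" -- {comment}" if comment else "")
        tables.setdefault(table, []).append(line)

    return "\n\n".join(
        f"Table {name}:\n  " + "\n  ".join(columns) for name, columns in tables.items()
    )
```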
Through early experimentation, we discovered that feeding all table descriptions along with a user query to the LLM could cause confusion, leading to SQL queries that referenced non-existent fields or tables. To solve this, we split the process into two steps. First, we had the LLM identify only the relevant tables for the question. Then, we used those selected tables as context for generating the SQL query. This two-step process significantly reduced the number of tables the model needed to consider—narrowing it down from hundreds to just a handful—greatly improving the model’s accuracy.
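In code, the two-step flow can be sketched roughly like this, reusing the `bedrock` client, `MODEL_ID`, and `question_to_sql` helper from the earlier sketch. The prompts and the JSON-array output format are illustrative choices, not the exact prompts used in the project.

```python
import json

def select_relevant_tables(question: str, table_summaries: dict[str, str]) -> list[str]:
    """Step 1: ask the LLM to shortlist the tables needed to answer the question."""
    catalogue = "\n".join(f"- {name}: {summary}" for name, summary in table_summaries.items())
    prompt = (
        "Given the question and the table catalogue, return a JSON array of the "
        "table names needed to answer it. Return nothing else.\n\n"
        f"Catalogue:\n{catalogue}\n\nQuestion: {question}"
    )
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.0},
    )
    names = json.loads(response["output"]["message"]["content"][0]["text"])
    # Keep only names that really exist, so hallucinated tables never reach step 2.
    return [name for name in names if name in table_summaries]

def generate_sql(question: str,
                 table_summaries: dict[str, str],
                 full_schema: dict[str, str]) -> str:
    """Step 2: generate SQL using only the shortlisted tables as context."""
    relevant = select_relevant_tables(question, table_summaries)
    context = "\n\n".join(full_schema[name] for name in relevant)
    return question_to_sql(question, context)  # from the earlier pipeline sketch
```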
To ensure the accuracy of the generated SQL queries, we started with a small test set and built a validation tool that allowed users to provide feedback by marking queries as correct or incorrect. This feedback loop allowed us to refine the pipeline and improve its performance over time.
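The feedback loop itself can be as simple as recording each verdict alongside the question and generated SQL so that incorrect generations can be reviewed and used to tune prompts; the sketch below assumes a hypothetical DynamoDB table for this purpose.

```python
import boto3
from datetime import datetime, timezone

# Hypothetical table name, for illustration only.
feedback_table = boto3.resource("dynamodb").Table("text-to-sql-feedback")

def record_feedback(question: str, generated_sql: str, is_correct: bool) -> None:
    """Store a user's verdict on a generated query for later review and prompt tuning."""
    feedback_table.put_item(Item={
        "question": question,
        "generated_sql": generated_sql,
        "is_correct": is_correct,
        "reviewed_at": datetime.now(timezone.utc).isoformat(),
    })
```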
While this solution reduced the workload for Swoop Aero’s technical team, certain limitations persisted. Each user request was treated independently, which made it difficult for users to adjust previous queries. There was also a semantic gap between the non-technical users’ questions and how the data was stored—for example, locations were stored as abbreviations, and the terminology used by users didn’t always align with the table names in the database. Another limitation was the system’s inability to automatically correct SQL errors based on feedback from the database itself.
To address these challenges, the team moved into the second phase, which focused on enhancing the chatbot’s memory, improving schema descriptions, refining prompts, and implementing a self-correcting mechanism to further optimise the solution.
Phase 2: Empowering non-technical users
In the second phase, the focus shifted toward making the system more accessible for non-technical users at Swoop Aero. Several enhancements were introduced to improve user experience:
- Memory and context awareness: A memory feature was added to retain the context of previous queries. For instance, if a user asked for flight statistics for 2023, they could follow up with, “What about 2022?” without having to restate the entire query.
- Self-correcting SQL: To further improve accuracy, we introduced self-correcting SQL capabilities. The system now detects and fixes errors, such as incorrect column names, reducing the need for manual intervention (see the sketch after this list).
- Schema and prompt storage: To simplify query-building and tuning of prompts for non-technical users, we stored schemas and prompts in AWS S3. This allowed users to experiment with and refine their queries without needing to interact with the code, further empowering teams to access data independently.
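Of these, the self-correcting behaviour is the most mechanical, so here is a hedged sketch of how such a retry loop can work: execute the generated SQL, and if Postgres rejects it, feed the error message back to the LLM and ask for a corrected query, up to a small number of attempts. It reuses the helpers from the earlier sketches and illustrates the idea rather than the exact implementation.

```python
import psycopg2

def run_with_self_correction(question: str, schema_context: str,
                             max_attempts: int = 3) -> list[tuple]:
    """Generate SQL; if the database rejects it, ask the LLM to repair it and retry."""
    sql = question_to_sql(question, schema_context)  # from the earlier sketch
    for attempt in range(max_attempts):
        try:
            return execute_query(sql)  # from the earlier sketch
        except psycopg2.Error as err:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            repair_prompt = (
                "The SQL below failed with a PostgreSQL error. Fix it using only the "
                "tables and columns in the schema, and return only the corrected SQL.\n\n"
                f"Schema:\n{schema_context}\n\nSQL:\n{sql}\n\nError:\n{err}"
            )
            response = bedrock.converse(
                modelId=MODEL_ID,
                messages=[{"role": "user", "content": [{"text": repair_prompt}]}],
                inferenceConfig={"temperature": 0.0},
            )
            sql = response["output"]["message"]["content"][0]["text"].strip()
```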
An important breakthrough in Phase 2 was the introduction of the column pruning agent. Large tables with hundreds of columns presented performance bottlenecks. To optimise performance, the column pruning agent dynamically removed irrelevant columns from the schema, speeding up query generation, reducing latency, and lowering operational costs.
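A column pruning step can be sketched as follows: given the user’s question and a wide table, ask the LLM which columns are actually needed, then rebuild a compact schema snippet from only those columns before generating SQL. Again, this reuses the `bedrock` client and `MODEL_ID` from earlier and is an illustration of the idea rather than the production agent.

```python
import json

def prune_columns(question: str, table_name: str, columns: list[str]) -> list[str]:
    """Ask the LLM which of a wide table's columns are relevant to the question."""
    prompt = (
        f"Table {table_name} has these columns:\n{json.dumps(columns)}\n\n"
        f"Question: {question}\n\n"
        "Return a JSON array containing only the columns needed to answer the question."
    )
    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.0},
    )
    kept = json.loads(response["output"]["message"]["content"][0]["text"])
    # Discard any column names the model invented or misspelled.
    return [col for col in kept if col in columns]

def pruned_schema(question: str, table_name: str, columns: list[str]) -> str:
    """Build a compact schema snippet containing only the relevant columns."""
    kept = prune_columns(question, table_name, columns)
    return f"Table {table_name}:\n  " + "\n  ".join(kept)
```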


Example use cases: Empowering Swoop Aero’s teams
The text-to-SQL chatbot is now fully integrated into various aspects of Swoop Aero’s operations, empowering business stakeholders to access critical data insights without technical assistance. In seconds, the system generates the SQL query and returns the results—saving time and freeing up technical resources.
For example, consider the following scenarios:
- The chatbot enables the business team to access key insights on demand. They can quickly retrieve flight data for a specific country or region to support strategic decisions.
- The Operations Manager can access insights that improve performance oversight, such as identifying which pilot has completed the most flights or pinpointing the most frequently flown locations.
- The Production Manager can efficiently track manufacturing workflows by listing all stop locations for a specific part or determining which team member has performed certain tasks the most times.
Key learnings throughout the development process
Throughout the development of the chatbot, we learned several key lessons about building an effective text-to-SQL solution:
- Data readiness is key: Properly labelled data and table annotations were essential for the AI to map natural language queries to the right data sources. Without this foundation, the system struggled to retrieve relevant data.
- Specialised agents boost performance: Breaking down tasks into specialised agents—such as intent agents and table agents—helped improve the accuracy of the generated SQL queries.
- Handling large datasets efficiently: The column pruning agent played a critical role in improving performance when processing large datasets. By trimming irrelevant columns, the system reduced token consumption and improved query execution speed.
- Minimising hallucinations: Occasionally, the model generated queries that referenced non-existent tables or columns. Reducing these hallucinations required ongoing refinement of prompts and validation techniques; lowering the LLM’s temperature also helped.
Managing Swoop Aero’s extensive dataset, spanning over 300 tables, presented unique challenges. However, with a combination of specialised agents and careful optimisation, the DiUS team was able to deliver a solution that worked efficiently at scale.
A more efficient data-driven future
Looking ahead, we’re excited about exploring additional capabilities to keep up with Swoop Aero’s growing data needs. We’ve already seen how effective features like self-correcting SQL and memory enhance usability, making solutions even more intuitive for users. There is a strong interest in incorporating dynamic memory that can handle more complex multi-query tasks, ensuring the system continues to support seamless data access as the organisation expands.
For teams looking to implement similar solutions, we recommend starting small and testing often. The key to long-term success lies in gathering consistent user feedback, which not only improves the generative AI solution’s accuracy over time but also makes the system smarter and more adaptable to evolving business needs.