CL, Apr 24, 2024
Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data
Aliaksei Vertsel, Mikhail Rumiantsau
In the field of business data analysis, the ability to extract actionable insights from vast and varied datasets is essential for informed decision-making and maintaining a competitive edge. Traditional rule-based systems, while reliable, often fall short when faced with the complexity and dynamism of modern business data. Conversely, Artificial Intelligence (AI) models, particularly Large Language Models (LLMs), offer significant potential in pattern recognition and predictive analytics but can lack the precision necessary for specific business applications. This paper explores the efficacy of hybrid approaches that integrate the robustness of rule-based systems with the adaptive power of LLMs in generating actionable business insights.
CL, Oct 26, 2024
Beyond Fine-Tuning: Effective Strategies for Mitigating Hallucinations in Large Language Models for Data Analytics
Mikhail Rumiantsau, Aliaksei Vertsel, Ilya Hrytsuk et al.
Large Language Models (LLMs) have become increasingly important in natural language processing, enabling advanced data analytics through natural language queries. However, these models often generate "hallucinations" (inaccurate or fabricated information) that can undermine their reliability in critical data-driven decision-making. Addressing the challenge of hallucinations is essential to improve the accuracy and trustworthiness of LLMs in processing natural language queries. This research focuses on mitigating hallucinations in LLMs, specifically within the context of data analytics. We introduce and evaluate four targeted strategies: Structured Output Generation, Strict Rules Enforcement, System Prompt Enhancements, and Semantic Layer Integration. Our findings show that these methods are more effective than traditional fine-tuning approaches in reducing hallucinations, offering a more reliable framework for deploying LLMs in natural language queries for data analytics. This research demonstrates the potential of these strategies to enhance the accuracy of LLM-driven data queries, ensuring more dependable results in data-driven environments.
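The abstract names its strategies without showing them, so the following is only a minimal sketch of how structured output, strict rules, and a semantic layer could fit together; it is not the authors' implementation. The `SEMANTIC_LAYER` contents, the `metric`/`group_by` field names, and the `validate_query_spec` helper are all hypothetical and chosen for illustration.

```python
import json

# Hypothetical semantic layer: the only metrics and dimensions a query may use.
SEMANTIC_LAYER = {
    "metrics": {"revenue", "order_count"},
    "dimensions": {"region", "month"},
}

ALLOWED_KEYS = {"metric", "group_by"}


def validate_query_spec(raw_output: str) -> dict:
    """Parse a model's JSON output and reject anything outside the semantic layer."""
    try:
        spec = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(spec, dict):
        raise ValueError("output must be a JSON object")

    # Strict rule: no fields beyond the agreed structure.
    unknown_keys = set(spec) - ALLOWED_KEYS
    if unknown_keys:
        raise ValueError(f"unexpected fields: {sorted(unknown_keys)}")

    # Strict rule: the metric must be defined in the semantic layer.
    metric = spec.get("metric")
    if not isinstance(metric, str) or metric not in SEMANTIC_LAYER["metrics"]:
        raise ValueError(f"unknown metric: {metric!r}")

    # Strict rule: every grouping dimension must be defined as well.
    group_by = spec.get("group_by", [])
    if not isinstance(group_by, list):
        raise ValueError("group_by must be a list")
    bad = [d for d in group_by
           if not isinstance(d, str) or d not in SEMANTIC_LAYER["dimensions"]]
    if bad:
        raise ValueError(f"unknown dimensions: {bad!r}")

    return {"metric": metric, "group_by": group_by}
```

Under these assumptions, a model response that refers to a metric or dimension absent from the semantic layer raises an error instead of being turned into a query, so a fabricated field name never reaches the database.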
CL, Jul 2, 2018
Pragmatic approach to structured data querying via natural language interface
Aliaksei Vertsel, Mikhail Rumiantsau
As the use of technology increases and data analysis becomes integral to many businesses, the ability to quickly access and interpret data has become more important than ever. Organizations and companies use information retrieval technologies to manage their information systems and processes. Although large amounts of data can be retrieved efficiently when organized in relational databases, a user still needs to master the DB language/schema to formulate queries completely. This puts a burden on organizations and companies to hire employees who are proficient in DB languages/schemas. To reduce some of the burden on already overstretched data teams, many organizations are looking for tools that allow non-developers to query their databases. Unfortunately, writing a valid SQL query that answers the question a user is trying to ask isn't always easy. Even seemingly simple questions, like "Which start-up companies received more than $200M in funding?", can actually be very hard to answer, let alone convert into a SQL query. How do you define start-up companies? By size, location, or the length of time they have been incorporated? This may be fine if a user is working with a database they're already familiar with, but what if users are not familiar with the database? What is needed is a centralized system that can effectively translate natural language queries into specific database queries for different customer database types. A number of factors can dramatically affect the system architecture and the set of algorithms used to translate NL queries into a structured query representation.
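To make the start-up example concrete, here is a minimal sketch of what the question becomes once the ambiguous term has been given an explicit definition. Everything in it is an assumption for illustration: the `companies` table, its columns, the demo rows, and the choice to define a start-up by founding year are hypothetical and do not come from the paper.

```python
import sqlite3

# Hypothetical definition: a "start-up" is a company founded in or after this year.
STARTUP_FOUNDED_SINCE = 2015


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up companies table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE companies (name TEXT, founded_year INTEGER, total_funding_usd INTEGER)"
    )
    conn.executemany(
        "INSERT INTO companies VALUES (?, ?, ?)",
        [
            ("Acme AI", 2019, 250_000_000),
            ("OldCorp", 1990, 900_000_000),
            ("TinySeed", 2021, 5_000_000),
        ],
    )
    return conn


def startups_with_funding_over(conn: sqlite3.Connection, min_funding_usd: int) -> list:
    """Answer 'which start-ups received more than X in funding?' under the definition above."""
    sql = (
        "SELECT name FROM companies "
        "WHERE founded_year >= ? AND total_funding_usd > ? "
        "ORDER BY name"
    )
    rows = conn.execute(sql, (STARTUP_FOUNDED_SINCE, min_funding_usd))
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(startups_with_funding_over(conn, 200_000_000))
```

A different definition of "start-up" (by headcount or location, say) would change the `WHERE` clause and possibly the answer, which is the ambiguity the abstract points to.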