CL · Nov 23, 2023
Minimizing Factual Inconsistency and Hallucination in Large Language Models
Muneeswaran I, Shreya Saxena, Siva Prasad et al.
Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that first generates rationales, verifies them and refines the incorrect ones, and then uses them as supporting references to generate the answer. The generated rationale makes the answer more transparent and, together with the references to the context, shows how the model arrived at the answer. In this paper, we demonstrate the framework's effectiveness in improving the quality of responses to drug-related inquiries in the life sciences industry. Our framework improves on traditional Retrieval Augmented Generation (RAG), enabling OpenAI GPT-3.5-turbo to be 14-25% more faithful and 16-22% more accurate on two datasets. Furthermore, fine-tuning on samples based on our framework improves the accuracy of smaller open-access LLMs by 33-42% and competes with RAG on commercial models.
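The generate, verify, refine, answer flow described in this abstract can be sketched as a small control loop. This is only an illustration of the stage ordering, not the authors' implementation: the function name, the four injected callables, the `max_rounds` limit, and the choice to drop rationales that never pass verification are all assumptions, and the stubs stand in for LLM calls.

```python
def answer_with_verified_rationales(question, context, generate, verify,
                                    refine, answer, max_rounds=2):
    """Generate rationales, verify and refine them, then answer from the verified ones."""
    verified = []
    for rationale in generate(question, context):
        rounds = 0
        # Refine a rationale until it passes verification or the budget runs out.
        while not verify(rationale, context) and rounds < max_rounds:
            rationale = refine(rationale, context)
            rounds += 1
        # Assumption: rationales that still fail verification are discarded.
        if verify(rationale, context):
            verified.append(rationale)
    return answer(question, verified), verified


# Hypothetical stubs standing in for LLM calls.
context = "drug X treats Y. drug X manages Z."
generate = lambda q, ctx: ["drug X treats Y", "drug X cures Z"]
verify = lambda r, ctx: r in ctx                      # supported by the context?
refine = lambda r, ctx: r.replace("cures", "manages")
answer = lambda q, rationales: " | ".join(rationales)

final_answer, supporting = answer_with_verified_rationales(
    "What does drug X do?", context, generate, verify, refine, answer)
```

Returning the verified rationales alongside the answer mirrors the transparency goal stated in the abstract: the caller can show which supporting statements the answer was built from.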
CHEM-PH · Aug 25, 2023
Synergistic Fusion of Graph and Transformer Features for Enhanced Molecular Property Prediction
M V Sai Prakash, Siddartha Reddy N, Ganesh Parab et al.
Molecular property prediction is a critical task in computational drug discovery. While recent advances in Graph Neural Networks (GNNs) and Transformers have been shown to be effective and promising, they face the following limitations: Transformer self-attention does not explicitly consider the underlying molecule structure, while GNN feature representation alone is not sufficient to capture granular and hidden interactions and characteristics that distinguish similar molecules. To address these limitations, we propose SYN-FUSION, a novel approach that synergistically combines pre-trained features from GNNs and Transformers. This approach provides a comprehensive molecular representation, capturing both the global molecule structure and the individual atom characteristics. Experimental results on MoleculeNet benchmarks demonstrate superior performance, surpassing previous models in 5 out of 7 classification datasets and 4 out of 6 regression datasets. We also compare SYN-FUSION with other Graph-Transformer models that were jointly trained on a combination of transformer and graph features, and find that our approach performs on par with them. Extensive analysis of the learned fusion model across aspects such as loss, latent space, and weight distribution further validates the effectiveness of SYN-FUSION. Finally, an ablation study demonstrates that the synergy achieved by SYN-FUSION surpasses the performance of its individual model components and their ensemble, offering a substantial improvement in predicting molecular properties.
BM · Jul 2, 2024
Leveraging Latent Evolutionary Optimization for Targeted Molecule Generation
Siddartha Reddy N, Sai Prakash MV, Varun V et al.
Lead optimization is a pivotal task in the drug design phase within the drug discovery lifecycle. The primary objective is to refine the lead compound to meet specific molecular properties for progression to the subsequent phase of development. In this work, we present an innovative approach, Latent Evolutionary Optimization for Molecule Generation (LEOMol), a generative modeling framework for the efficient generation of optimized molecules. LEOMol leverages Evolutionary Algorithms, such as Genetic Algorithm and Differential Evolution, to search the latent space of a Variational AutoEncoder (VAE). This search facilitates the identification of the target molecule distribution within the latent space. Our approach consistently demonstrates superior performance compared to previous state-of-the-art models across a range of constrained molecule generation tasks, outperforming existing models in all four sub-tasks related to property targeting. Additionally, we highlight the importance of including toxicity in the evaluation of generative models. Furthermore, an ablation study shows the improvements that our approach provides over gradient-based latent space optimization methods. These results underscore the effectiveness and superiority of LEOMol in addressing the inherent challenges in constrained molecule generation while emphasizing its potential to propel advancements in drug discovery.
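To make the idea of an evolutionary search over a latent space concrete, here is a minimal sketch of Differential Evolution (the common DE/rand/1/bin variant) maximizing a score over latent vectors. It is not LEOMol itself: there is no VAE here, `toy_property_score` is a hypothetical stand-in for "decode the latent vector and score the molecule", and the hyperparameters are arbitrary.

```python
import random


def differential_evolution(score, dim, pop_size=20, generations=50,
                           f=0.8, cr=0.9, bound=3.0, seed=0):
    """Maximize `score` over latent vectors in [-bound, bound]^dim (DE/rand/1/bin)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    fitness = [score(z) for z in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = max(-bound, min(bound, v))  # clip to the search box
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_score = score(trial)
            # Greedy selection: keep the trial only if it is at least as good.
            if trial_score >= fitness[i]:
                pop[i], fitness[i] = trial, trial_score
    best = max(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


# Hypothetical stand-in for "decode z with the VAE, then score the molecule".
TARGET = [1.0, -0.5, 2.0, 0.0]


def toy_property_score(z):
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, TARGET))


best_z, best_score = differential_evolution(toy_property_score, dim=4)
```

Because the search only needs score evaluations, it does not require the property function to be differentiable, which is the practical contrast with the gradient-based latent optimization methods mentioned in the abstract.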
QUANT-PH · Feb 10
Surrogate-Guided Quantum Discovery in Black-Box Landscapes with Latent-Quadratic Interaction Embedding Transformers
Saisubramaniam Gopalakrishnan, Dagnachew Birru
Discovering configurations that are both high-utility and structurally diverse under expensive black-box evaluation and strict query budgets remains a central challenge in data-driven discovery. Many classical optimizers concentrate on dominant modes, while quality-diversity methods require large evaluation budgets to populate high-dimensional archives. The Quantum Approximate Optimization Algorithm (QAOA) provides distributional sampling but requires an explicit problem Hamiltonian, which is unavailable in black-box settings. Practical quantum circuits favor quadratic Hamiltonians since higher-order interaction terms are costly to realize. Learned quadratic surrogates such as Factorization Machines (FM) have been used as proxies, but are limited to pairwise structure. We extend this surrogate-to-Hamiltonian approach by modelling higher-order variable dependencies via self-attention and projecting them into a valid Positive Semi-Definite quadratic form compatible with QAOA. This enables diversity-oriented quantum sampling from learned energy landscapes while capturing interaction structure beyond pairwise terms. We evaluate on risk discovery for enterprise document processing systems against diverse classical optimizers. Quantum-guided samplers achieve competitive utility while consistently improving structural diversity and exclusive discovery. FM surrogates provide stronger early coverage, whereas ours yields higher-fidelity surrogate landscapes and better extreme-case discovery. Our method recovers roughly twice as many structural tail-risk outliers as most classical baselines and identifies an exclusive, non-overlapping fraction of high-utility configurations not found by competing methods, highlighting an effective mechanism for learning higher-order interaction structure and projecting it into quadratic surrogate Hamiltonians for quantum-assisted black-box discovery.
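One standard way to obtain a Positive Semi-Definite quadratic form from an arbitrary learned matrix is the Gram construction Q = AᵀA, since xᵀQx = ‖Ax‖² ≥ 0 for every x. The sketch below illustrates only that property on binary configurations. The abstract does not state how the projection is done, so the Gram construction, the matrix `A`, and the function names are assumptions made for illustration.

```python
from itertools import product


def gram_psd(a):
    """Return Q = A^T A, which is symmetric and positive semi-definite."""
    rows, cols = len(a), len(a[0])
    return [[sum(a[k][i] * a[k][j] for k in range(rows))
             for j in range(cols)]
            for i in range(cols)]


def quadratic_energy(q, x):
    """Evaluate the quadratic form x^T Q x."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Hypothetical interaction embedding (e.g. derived from attention outputs).
A = [[0.9, -0.4, 0.1],
     [0.2, 0.7, -0.5]]
Q = gram_psd(A)

# Energy of every binary configuration under the quadratic surrogate.
energies = {x: quadratic_energy(Q, x) for x in product((0, 1), repeat=3)}
```

A quadratic form over binary variables like this one is the shape of objective that a QUBO-style problem Hamiltonian encodes, which is why a quadratic surrogate is convenient for QAOA.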
AI · Jan 29
Search-Based Risk Feature Discovery in Document Structure Spaces under a Constrained Budget
Saisubramaniam Gopalakrishnan, Harikrishnan P M, Dagnachew Birru
Enterprise-grade Intelligent Document Processing (IDP) systems support high-stakes workflows across finance, insurance, and healthcare. Early-phase system validation under limited budgets mandates uncovering diverse failure mechanisms, rather than identifying a single worst-case document. We formalize this challenge as a Search-Based Software Testing (SBST) problem, aiming to identify complex interactions between document variables, with the objective of maximizing the number of distinct failure types discovered within a fixed evaluation budget. Our methodology operates on a combinatorial space of document configurations, rendering instances of structural risk features to induce realistic failure conditions. We benchmark a diverse portfolio of search strategies spanning evolutionary, swarm-based, quality-diversity, learning-based, and quantum methods under identical budget constraints. Through configuration-level exclusivity, win-rate, and cross-temporal overlap analyses, we show that different solvers consistently uncover failure modes that remain undiscovered by specific alternatives at comparable budgets. Crucially, cross-temporal analysis reveals persistent solver-specific discoveries across all evaluated budgets, with no single strategy exhibiting absolute dominance. While the union of all solvers eventually recovers the observed failure space, reliance on any individual method systematically delays the discovery of important risks. These results demonstrate intrinsic solver complementarity and motivate portfolio-based SBST strategies for robust industrial IDP validation.
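The exclusivity analysis mentioned in this abstract can be pictured with plain set operations: for each solver, which failure types did it find that no other solver found? The sketch below is one simple reading of that idea, not the paper's metric definition; the solver names and failure identifiers are made up.

```python
def exclusive_discoveries(found):
    """Map each solver to the failure types that no other solver discovered."""
    result = {}
    for name, items in found.items():
        others = set().union(*(v for k, v in found.items() if k != name))
        result[name] = set(items) - others
    return result


# Hypothetical discoveries per solver at one evaluation budget.
found = {
    "evolutionary": {"F1", "F2", "F3"},
    "swarm": {"F2", "F4"},
    "quality_diversity": {"F3", "F5", "F6"},
}

exclusive = exclusive_discoveries(found)
portfolio = set().union(*found.values())  # what the union of solvers recovers
```

In this toy data every solver has a non-empty exclusive set, so dropping any one of them would lose failure types from the portfolio, which is the complementarity argument in miniature.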
CL · Sep 26, 2025
Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data
Zishan Ahmad, Saisubramaniam Gopalakrishnan
Large Language Models (LLMs), despite their remarkable capabilities, rely on singular, predominant reasoning paradigms, hindering their performance on intricate problems that demand diverse cognitive strategies. To address this, we introduce Composite Reasoning (CR), a novel reasoning approach empowering LLMs to dynamically explore and combine multiple reasoning styles, such as deductive, inductive, and abductive, for more nuanced problem-solving. Evaluated on scientific and medical question-answering benchmarks, our approach outperforms existing baselines like Chain-of-Thought (CoT) and also surpasses the accuracy of DeepSeek-R1 style reasoning (SR) capabilities, while demonstrating superior sample efficiency and adequate token usage. Notably, CR adaptively emphasizes domain-appropriate reasoning styles. It prioritizes abductive and deductive reasoning for medical question answering, but shifts to causal, deductive, and inductive methods for scientific reasoning. Our findings highlight that by cultivating internal reasoning style diversity, LLMs acquire more robust, adaptive, and efficient problem-solving abilities.
LG · Jul 23, 2025
Leveraging Knowledge Graphs and LLM Reasoning to Identify Operational Bottlenecks for Warehouse Planning Assistance
Rishi Parekh, Saisubramaniam Gopalakrishnan, Zishan Ahmad et al.
Analyzing large, complex output datasets from Discrete Event Simulations (DES) of warehouse operations to identify bottlenecks and inefficiencies is a critical yet challenging task, often demanding significant manual effort or specialized analytical tools. Our framework integrates Knowledge Graphs (KGs) and Large Language Model (LLM)-based agents to analyze this DES output data. It transforms raw DES data into a semantically rich KG, capturing relationships between simulation events and entities. An LLM-based agent uses iterative reasoning, generating interdependent sub-questions. For each sub-question, it creates Cypher queries for KG interaction, extracts information, and self-reflects to correct errors. This adaptive, iterative, and self-correcting process identifies operational issues in a way that mimics human analysis. Our approach to bottleneck identification in warehouse DES, tested with equipment breakdowns and process irregularities, outperforms baseline methods. For operational questions, it achieves near-perfect pass rates in pinpointing inefficiencies. For complex investigative questions, we demonstrate its superior diagnostic ability to uncover subtle, interconnected issues. This work bridges simulation modeling and AI (KG+LLM), offering a more intuitive method for actionable insights, reducing time-to-insight, and enabling automated warehouse inefficiency evaluation and diagnosis.
LG · Dec 12, 2020
Knowledge Capture and Replay for Continual Learning
Saisubramaniam Gopalakrishnan, Pranshu Ranjan Singh, Haytham Fayek et al.
Deep neural networks have shown promise in several domains, and the learned data (task) specific information is implicitly stored in the network parameters. Extraction and utilization of encoded knowledge representations are vital when data is no longer available in the future, especially in a continual learning scenario. In this work, we introduce flashcards, which are visual representations that capture the encoded knowledge of a network as a recursive function of predefined random image patterns. In a continual learning scenario, flashcards help to prevent catastrophic forgetting and to consolidate knowledge of all the previous tasks. Flashcards need to be constructed only before learning the subsequent task, and hence are independent of the number of tasks trained before. We demonstrate the efficacy of flashcards in capturing learned knowledge representation (as an alternative to the original dataset) and empirically validate them on a variety of continual learning tasks: reconstruction, denoising, task-incremental learning, and new-instance learning classification, using several heterogeneous benchmark datasets. Experimental evidence indicates that flashcards as a replay strategy (i) are task agnostic, (ii) perform better than generative replay, and (iii) are on par with episodic replay without additional memory overhead.
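The phrase "a recursive function of predefined random image patterns" can be read as repeatedly feeding a random pattern through the trained network and keeping the result. The sketch below shows only that recursion. It is an assumption-laden toy: `toy_autoencoder` is a hypothetical stand-in for a trained network, the patterns are flat lists rather than images, and the number of recursions is arbitrary.

```python
import random


def make_flashcards(network, patterns, recursions):
    """Pass each random pattern through `network` repeatedly; keep the final outputs."""
    cards = []
    for pattern in patterns:
        x = list(pattern)
        for _ in range(recursions):
            x = network(x)  # output of one pass becomes input of the next
        cards.append(x)
    return cards


def toy_autoencoder(x):
    """Hypothetical stand-in for a trained network: pulls values toward their mean."""
    mean = sum(x) / len(x)
    return [0.5 * (v + mean) for v in x]


rng = random.Random(0)
patterns = [[rng.random() for _ in range(8)] for _ in range(4)]
flashcards = make_flashcards(toy_autoencoder, patterns, recursions=10)
```

In a replay setting, outputs like these would be mixed into training on the next task in place of the original data; how they are paired with targets is not specified in the abstract and is left out here.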
LG · Apr 16, 2020
Classify and Generate: Using Classification Latent Space Representations for Image Generations
Saisubramaniam Gopalakrishnan, Pranshu Ranjan Singh, Yasin Yazici et al.
Utilization of classification latent space information for downstream reconstruction and generation is an intriguing and relatively unexplored area. In general, discriminative representations are rich in class-specific features but are too sparse for reconstruction, whereas in autoencoders the representations are dense but have limited, indistinguishable class-specific features, making them less suitable for classification. In this work, we propose a discriminative modeling framework that employs manipulated supervised latent representations to reconstruct and generate new samples belonging to a given class. Unlike generative modeling approaches such as GANs and VAEs that aim to model the data manifold distribution, Representation based Generations (ReGene) directly represent the given data manifold in the classification space. Such supervised representations, under certain constraints, allow for reconstructions and controlled generations using an appropriate decoder without enforcing any prior distribution. Theoretically, given a class, we show that these representations, when manipulated using convex combinations, retain the same class label. Furthermore, they also lead to the novel generation of visually realistic images. Extensive experiments on datasets of varying resolutions demonstrate that ReGene has higher classification accuracy than existing conditional generative models while being competitive in terms of FID.
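The claim that convex combinations keep the class label is easy to check in the simplest setting, a linear classifier head: each argmax region of affine logits is an intersection of half-spaces and therefore convex, so any convex combination of same-class latent vectors stays in that class. The sketch below demonstrates this special case only. The weights, points, and function names are hypothetical, and the paper's own constraints may differ.

```python
import random


def predict(w, b, z):
    """Class index with the largest affine logit w_c . z + b_c."""
    logits = [sum(wi * zi for wi, zi in zip(row, z)) + bi
              for row, bi in zip(w, b)]
    return max(range(len(logits)), key=logits.__getitem__)


def convex_combination(points, weights):
    """Weighted average of points; weights must be non-negative and sum to 1."""
    assert all(wt >= 0 for wt in weights) and abs(sum(weights) - 1.0) < 1e-9
    dim = len(points[0])
    return [sum(wt * p[d] for wt, p in zip(weights, points)) for d in range(dim)]


# Hypothetical linear head with three classes over a 2-D latent space.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B = [0.0, 0.0, 0.0]

# Three latent vectors that this head assigns to class 0.
class0_points = [[2.0, 0.5], [3.0, -1.0], [1.5, 1.0]]

new_z = convex_combination(class0_points, [0.2, 0.5, 0.3])
new_label = predict(W, B, new_z)
```

The new latent vector is not any of the original points, yet it carries the same label, which is what makes it a candidate for decoding into a new sample of that class.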