A Dialectic Pipeline for Improving LLM Robustness
This work addresses the need for more robust and generalizable LLMs for users who cannot afford high computational costs, though the contribution appears incremental because it builds on existing prompting techniques.
The paper tackles the problem of reducing hallucinations and improving output quality in LLMs without requiring extensive computational resources or domain-specific constraints, by proposing a dialectic pipeline that uses self-dialogue for reflection and correction. The method outperforms standard model answers and Chain-of-Thought prompting by significant margins across various datasets and model families.
Assessing how Language Models can reduce their hallucinations and improve the quality of their outputs is crucial to ensuring their large-scale use. However, methods such as fine-tuning on domain-specific data or training a separate \textit{ad hoc} verifier demand computational resources that are not feasible for many user applications, and they constrain the models to specific fields of knowledge. In this thesis, we propose a dialectic pipeline that preserves LLMs' generalization abilities while improving the quality of their answers via self-dialogue, enabling them to reflect upon and correct tentative answers that are wrong. We experimented with different pipeline settings, testing the proposed method on different datasets and on different families of models. All pipeline stages are enriched with the relevant context (in an oracle-RAG setting), and we study the impact of summarizing or filtering that context. We find that the proposed dialectic pipeline outperforms standard model answers by significant margins and consistently achieves higher performance than Chain-of-Thought prompting alone.
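To make the idea of self-dialogue concrete, the following is a minimal Python sketch of how such a pipeline might be wired together. It is not the implementation from the thesis: the three-stage split (tentative answer, critique, revision), the prompt wording, and the names `LLM`, `build_prompt`, and `dialectic_pipeline` are all assumptions made for illustration. The only details taken from the abstract are that the model converses with itself to correct a tentative answer and that every stage receives the relevant context.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt string to a completion.
LLM = Callable[[str], str]


def build_prompt(instruction: str, question: str, context: str, history: List[str]) -> str:
    """Assemble one stage's prompt; the context is included in every stage."""
    parts = [f"Context:\n{context}", f"Question: {question}"]
    parts.extend(history)      # earlier turns of the self-dialogue, if any
    parts.append(instruction)  # what this stage is asked to do
    return "\n\n".join(parts)


def dialectic_pipeline(llm: LLM, question: str, context: str) -> Dict[str, str]:
    """Run an assumed three-stage self-dialogue: draft, critique, revision."""
    # Stage 1 (assumed): produce a tentative answer.
    draft = llm(build_prompt("Give a tentative answer.", question, context, []))

    # Stage 2 (assumed): the same model reflects on its own tentative answer.
    history = [f"Tentative answer: {draft}"]
    critique = llm(build_prompt(
        "Point out any errors in the tentative answer.", question, context, history))

    # Stage 3 (assumed): the model revises its answer in light of the critique.
    history.append(f"Critique: {critique}")
    final = llm(build_prompt(
        "Taking the critique into account, give the final answer.",
        question, context, history))

    return {"draft": draft, "critique": critique, "final": final}


if __name__ == "__main__":
    # Stand-in model that returns canned replies, so the sketch runs offline.
    canned = iter(["Marlowe", "The context names Shakespeare.", "Shakespeare"])
    result = dialectic_pipeline(
        lambda prompt: next(canned),
        question="Who wrote Hamlet?",
        context="Hamlet is a tragedy written by William Shakespeare.",
    )
    print(result["final"])
```

Passing the model in as a plain callable keeps the sketch independent of any particular model family or API. Summarizing or filtering the context, as studied in the thesis, would correspond to transforming the `context` argument before it is passed in.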