LG | Nov 7, 2024

Enhancing classroom teaching with LLMs and RAG

arXiv:2411.04341v1 | 7 citations | h-index: 15 | SIGITE
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of outdated information in LLMs for educational applications, but its contribution is incremental: it primarily evaluates the effectiveness of a data source rather than achieving high performance.

The study investigated using Retrieval-Augmented Generation (RAG) pipelines with course materials to aid K-12 students, but initial tests with Reddit as a data source for cybersecurity information showed average answer correctness below 50%, indicating it is not a good source for such questions.

Large Language Models have become a valuable source of information for our daily inquiries. However, once a model is trained, its training data quickly becomes out of date, making RAG a useful tool for supplying more recent or more pertinent data. In this work, we investigate how RAG pipelines, with course materials serving as the data source, might help students in K-12 education. The initial research uses Reddit as a data source for up-to-date cybersecurity information. Chunk size is evaluated to determine the optimal amount of context needed to generate accurate answers. After running the experiment with different chunk sizes, answer correctness was evaluated using RAGAs; average answer correctness did not exceed 50 percent for any chunk size. This suggests that Reddit is not a good source to mine for answers to questions about cybersecurity threats. The methodology was successful in evaluating the data source, which suggests it could also be used to evaluate educational resources for effectiveness.
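The chunk-size experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper scores answers with RAGAs and an LLM generator, while this example substitutes a simple token-overlap F1 as a stand-in metric and a lexical retriever in place of the generator. All function names (`chunk_words`, `token_f1`, `retrieve`, `sweep_chunk_sizes`) are hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def token_f1(candidate, reference):
    """Token-overlap F1. ASSUMPTION: a crude stand-in for RAGAs answer correctness."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def retrieve(question, chunks):
    """Return the chunk that shares the most tokens with the question."""
    return max(chunks, key=lambda chunk: token_f1(chunk, question))


def sweep_chunk_sizes(corpus, qa_pairs, chunk_sizes):
    """Average score per chunk size, treating the retrieved chunk as the answer.

    ASSUMPTION: a real RAG pipeline would pass the retrieved chunk to an LLM
    and score the generated answer; here the chunk itself is scored.
    """
    results = {}
    for size in chunk_sizes:
        chunks = chunk_words(corpus, size)
        scores = [token_f1(retrieve(question, chunks), truth)
                  for question, truth in qa_pairs]
        results[size] = mean(scores)
    return results
```

Calling `sweep_chunk_sizes(corpus, qa_pairs, [128, 256, 512])` on a non-empty corpus and question set returns a dictionary mapping each chunk size to its mean score, which is the shape of comparison the abstract reports (one average per chunk size). The specific sizes shown here are illustrative and are not taken from the paper.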

