A GPU-Accelerated RAG-Based Telegram Assistant for Supporting Parallel Processing Students
This provides affordable, private AI tutoring for HPC education students, though it is incremental in applying existing methods to a specific domain.
The project tackled the need for continuous academic assistance for parallel processing students by developing a GPU-accelerated RAG-based Telegram bot, which delivered real-time personalized responses and demonstrated practical deployment on consumer hardware with improved inference latency.
This project addresses a critical pedagogical need: offering students continuous, on-demand academic assistance beyond conventional reception hours. I present a domain-specific Retrieval-Augmented Generation (RAG) system powered by a quantized Mistral-7B Instruct model and deployed as a Telegram bot. The assistant enhances learning by delivering real-time, personalized responses aligned with the "Introduction to Parallel Processing" course materials. GPU acceleration significantly improves inference latency, enabling practical deployment on consumer hardware. This approach demonstrates how consumer GPUs can enable affordable, private, and effective AI tutoring for HPC education.