Fusionista2.0: Efficiency Retrieval System for Large-Scale Datasets
This work addresses the need for fast, user-friendly video search systems in competitive benchmarks such as the Video Browser Showdown. It is an incremental improvement built on technical upgrades to existing components.
The paper tackles the problem of efficient video retrieval under strict time constraints by presenting Fusionista2.0, a streamlined system that reduces retrieval time by up to 75% while increasing accuracy and user satisfaction.
The Video Browser Showdown (VBS) challenges systems to deliver accurate results under strict time constraints. To meet this demand, we present Fusionista2.0, a streamlined video retrieval system optimized for speed and usability. All core modules were re-engineered for efficiency: preprocessing now relies on ffmpeg for fast keyframe extraction, optical character recognition uses Vintern-1B-v3.5 for robust multilingual text recognition, and automatic speech recognition employs faster-whisper for real-time transcription. For question answering, lightweight vision-language models provide quick responses without the heavy cost of large models. Beyond these technical upgrades, Fusionista2.0 introduces a redesigned user interface with improved responsiveness, accessibility, and workflow efficiency, enabling even non-expert users to retrieve relevant content rapidly. Evaluations demonstrate that retrieval time was reduced by up to 75% while accuracy and user satisfaction both increased, confirming Fusionista2.0 as a competitive and user-friendly system for large-scale video search.
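To make the ffmpeg-based preprocessing step concrete, the following is a minimal sketch of keyframe extraction driven from Python. It is not the Fusionista2.0 implementation: the abstract does not give the exact ffmpeg options, so the flag choices, the function names `build_keyframe_command` and `extract_keyframes`, and the output naming pattern `kf_%06d.jpg` are all assumptions made for illustration. The sketch asks the decoder to skip everything except keyframes, which is one common way to keep extraction fast.

```python
import subprocess
from pathlib import Path


def build_keyframe_command(video_path: str, out_dir: str) -> list[str]:
    """Build an ffmpeg argument list that writes one JPEG per keyframe.

    Hypothetical helper: the flags below are an assumed configuration,
    not the one used by the system described in the paper.
    """
    pattern = str(Path(out_dir) / "kf_%06d.jpg")
    return [
        "ffmpeg",
        "-hide_banner",
        "-skip_frame", "nokey",  # input option: decode keyframes only
        "-i", video_path,
        "-vsync", "vfr",         # do not duplicate frames; newer builds prefer -fps_mode vfr
        "-q:v", "2",             # high JPEG quality (lower value = better quality)
        pattern,
    ]


def extract_keyframes(video_path: str, out_dir: str) -> list[Path]:
    """Run ffmpeg and return the extracted keyframe paths in sorted order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_keyframe_command(video_path, out_dir), check=True)
    return sorted(Path(out_dir).glob("kf_*.jpg"))


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # without ffmpeg or a video file being present.
    print(" ".join(build_keyframe_command("clip.mp4", "keyframes")))
```

Separating command construction from execution keeps the flag set easy to inspect and adjust. Calling `extract_keyframes("clip.mp4", "keyframes")` requires ffmpeg on the `PATH` and an existing input file.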