ARCLROASOct 12, 2025

Bhasha-Rupantarika: Algorithm-Hardware Co-design approach for Multilingual Neural Machine Translation

arXiv:2510.10676v1h-index: 23Has Code
Originality Incremental advance
AI Analysis

This provides a deployable solution for real-time multilingual translation on IoT devices, though it is incremental as it applies known quantization techniques to a specific hardware-software context.

This paper tackles efficient multilingual neural machine translation for resource-limited settings by developing Bhasha-Rupantarika, an algorithm-hardware co-design system that uses ultra-low precision quantization (down to FP4), achieving a 4.1x reduction in model size, 4.2x speedup in inference, and 66 tokens/s throughput (4.8x improvement).

This paper introduces Bhasha-Rupantarika, a light and efficient multilingual translation system tailored through algorithm-hardware codesign for resource-limited settings. The method investigates model deployment at sub-octet precision levels (FP8, INT8, INT4, and FP4), with experimental results indicating a 4.1x reduction in model size (FP4) and a 4.2x speedup in inference speed, which correlates with an increased throughput of 66 tokens/s (improvement by 4.8x). This underscores the importance of ultra-low precision quantization for real-time deployment in IoT devices using FPGA accelerators, achieving performance on par with expectations. Our evaluation covers bidirectional translation between Indian and international languages, showcasing its adaptability in low-resource linguistic contexts. The FPGA deployment demonstrated a 1.96x reduction in LUTs and a 1.65x decrease in FFs, resulting in a 2.2x enhancement in throughput compared to OPU and a 4.6x enhancement compared to HPTA. Overall, the evaluation provides a viable solution based on quantisation-aware translation along with hardware efficiency suitable for deployable multilingual AI systems. The entire codes [https://github.com/mukullokhande99/Bhasha-Rupantarika/] and dataset for reproducibility are publicly available, facilitating rapid integration and further development by researchers.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes