Muhammad Hazim Al Farouq

h-index 8

3 papers

5 citations

Novelty 18%

AI Score 23

Ranked #181,395 of 201,326 authors (top 90%)

#30,203 in CL (top 93%)

3 Papers

CL · Feb 17, 2025

SpeechT: Findings of the First Mentorship in Speech Translation

Yasmin Moslem, Juan Julián Cea Morán, Mariano Gonzalez-Gomez et al.

This work presents the details and findings of the first mentorship in speech translation (SpeechT), which took place in December 2024 and January 2025. To fulfil the mentorship requirements, the participants engaged in key activities, including data preparation, modelling, and advanced research. The participants explored data augmentation techniques and compared end-to-end and cascaded speech translation systems. The projects covered various languages other than English, including Arabic, Bengali, Galician, Indonesian, Japanese, and Spanish.

CL · Oct 26, 2025

Iterative Layer Pruning for Efficient Translation Inference

Yasmin Moslem, Muhammad Hazim Al Farouq, John D. Kelleher

Large language models (LLMs) have transformed many areas of natural language processing, including machine translation. However, efficient deployment of LLMs remains challenging due to their intensive computational requirements. In this paper, we address this challenge and present our submissions to the Model Compression track at the Conference on Machine Translation (WMT 2025). In our experiments, we investigate iterative layer pruning guided by layer importance analysis. We evaluate this method using the Aya-Expanse-8B model for translation from Czech to German, and from English to Egyptian Arabic. Our approach achieves substantial reductions in model size and inference time, while maintaining the translation quality of the baseline models.

CL · May 5, 2025

Bemba Speech Translation: Exploring a Low-Resource African Language

Muhammad Hazim Al Farouq, Aman Kassahun Wassie, Yasmin Moslem

This paper describes our system submission to the low-resource languages track of the International Conference on Spoken Language Translation (IWSLT 2025), specifically for Bemba-to-English speech translation. We built cascaded speech translation systems based on Whisper and NLLB-200, and employed data augmentation techniques, such as back-translation. We investigate the effect of using synthetic data and discuss our experimental setup.