SDOct 21, 2022
Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text GenerationThien Nguyen, Nathalie Tran, Liuhui Deng et al.
Code-switching describes the practice of using more than one language in the same sentence. In this study, we investigate how to optimize a neural transducer based bilingual automatic speech recognition (ASR) model for code-switching speech. Focusing on the scenario where the ASR model is trained without supervised code-switching data, we found that semi-supervised training and synthetic code-switched data can improve the bilingual ASR system on code-switching speech. We analyze how each of the neural transducer's encoders contributes towards code-switching performance by measuring encoder-specific recall values, and evaluate our English/Mandarin system on the ASCEND data set. Our final system achieves 25% mixed error rate (MER) on the ASCEND English/Mandarin code-switching test set -- reducing the MER by 2.1% absolute compared to the previous literature -- while maintaining good accuracy on the monolingual test sets.
70.5CLMar 11
Multilingual Reasoning Gym: Multilingual Scaling of Procedural Reasoning EnvironmentsKonstantin Dobler, Simon Lehnerer, Federico Scozzafava et al.
We present the Multilingual Reasoning Gym, an extension of Reasoning Gym (Stojanovski et al., 2025), that procedurally generates verifiable reasoning problems across 14 languages. We translate templates for 94 tasks with native-speaker validation in 10 languages and targeted code or template adaptations to ensure linguistic naturalness. The Multilingual Reasoning Gym preserves the core benefits of the procedural generation approach used in the original Reasoning Gym, such as virtually unlimited problem instance generation and adjustable difficulty, and remains directly usable for Reinforcement Learning from Verifiable Rewards and evaluation settings. Problems in the Multilingual Reasoning Gym are parallel across languages, enabling crosslingually parallel data generation at massive scale due to the procedural nature of the environments. We release our implementation to support research into multilingual reasoning models.
67.1CLMar 11
mAceReason-Math: A Dataset of High-Quality Multilingual Math Problems Ready For RLVRKonstantin Dobler, Simon Lehnerer, Federico Scozzafava et al.
Reinforcement Learning with Verifiable Rewards (RLVR) has been successfully applied to significantly boost the capabilities of pretrained large language models, especially in the math and logic problem domains. However, current research and available training datasets remain English-centric. While mul- tilingual training data and benchmarks have been created in the past, they were not created with RLVR and current model capability in mind, and their level of difficulty is often too low to provide appropriate training signals for current models. To address this gap, we provide mAceReason-Math, a dataset of high-quality translations of challenging math problems sourced from a corpus specifically curated for RLVR (AceReason-Math). We further take specific care to clean and improve our translations, resulting in a coverage of 14 languages with more than 10,000 samples per language. We release the dataset to facilitate multilingual RLVR research and benchmarking in the research community.
IVSep 7, 2025
FASL-Seg: Anatomy and Tool Segmentation of Surgical ScenesMuraam Abdel-Ghani, Mahmoud Ali, Mohamed Ali et al.
The growing popularity of robotic minimally invasive surgeries has made deep learning-based surgical training a key area of research. A thorough understanding of the surgical scene components is crucial, which semantic segmentation models can help achieve. However, most existing work focuses on surgical tools and overlooks anatomical objects. Additionally, current state-of-the-art (SOTA) models struggle to balance capturing high-level contextual features and low-level edge features. We propose a Feature-Adaptive Spatial Localization model (FASL-Seg), designed to capture features at multiple levels of detail through two distinct processing streams, namely a Low-Level Feature Projection (LLFP) and a High-Level Feature Projection (HLFP) stream, for varying feature resolutions - enabling precise segmentation of anatomy and surgical instruments. We evaluated FASL-Seg on surgical segmentation benchmark datasets EndoVis18 and EndoVis17 on three use cases. The FASL-Seg model achieves a mean Intersection over Union (mIoU) of 72.71% on parts and anatomy segmentation in EndoVis18, improving on SOTA by 5%. It further achieves a mIoU of 85.61% and 72.78% in EndoVis18 and EndoVis17 tool type segmentation, respectively, outperforming SOTA overall performance, with comparable per-class SOTA results in both datasets and consistent performance in various classes for anatomy and instruments, demonstrating the effectiveness of distinct processing streams for varying feature resolutions.
CROct 8, 2016
Towards Policy Enforcement Point as a Service (PEPS)Arash Shaghaghi, Mohamed Ali, Kaafar et al.
In this paper, we coin the term Policy Enforcement as a Service (PEPS), which enables the provision of innovative inter-layer and inter-domain Access Control. We leverage the architecture of Software-Defined-Network (SDN) to introduce a common network-level enforcement point, which is made available to a range of access control systems. With our PEPS model, it is possible to have a `defense in depth' protection model and drop unsuccessful access requests before engaging the data provider (e.g. a database system). Moreover, the current implementation of access control within the `trusted' perimeter of an organization is no longer a restriction so that the potential for novel, distributed and cooperative security services can be realized. We conduct an analysis of the security requirements and technical challenges for implementing Policy Enforcement as a Service. To illustrate the benefits of our proposal in practice, we include a report on our prototype PEPS-enabled location-based access control.