Khaled Saleh

CV
h-index29
14papers
215citations
Novelty45%
AI Score46

14 Papers

CLFeb 2Code
MedAraBench: Large-Scale Arabic Medical Question Answering Dataset and Benchmark

Mouath Abu-Daoud, Leen Kharouf, Omar El Hajj et al.

Arabic remains one of the most underrepresented languages in natural language processing research, particularly in medical applications, due to the limited availability of open-source data and benchmarks. The lack of resources hinders efforts to evaluate and advance the multilingual capabilities of Large Language Models (LLMs). In this paper, we introduce MedAraBench, a large-scale dataset consisting of Arabic multiple-choice question-answer pairs across various medical specialties. We constructed the dataset by manually digitizing a large repository of academic materials created by medical professionals in the Arabic-speaking region. We then conducted extensive preprocessing and split the dataset into training and test sets to support future research efforts in the area. To assess the quality of the data, we adopted two frameworks, namely expert human evaluation and LLM-as-a-judge. Our dataset is diverse and of high quality, spanning 19 specialties and five difficulty levels. For benchmarking purposes, we assessed the performance of eight state-of-the-art open-source and proprietary models, such as GPT-5, Gemini 2.0 Flash, and Claude 4-Sonnet. Our findings highlight the need for further domain-specific enhancements. We release the dataset and evaluation scripts to broaden the diversity of medical data benchmarks, expand the scope of evaluation suites for LLMs, and enhance the multilingual capabilities of models for deployment in clinical settings.

CLFeb 5Code
MedErrBench: A Fine-Grained Multilingual Benchmark for Medical Error Detection and Correction with Clinical Expert Annotations

Congbo Ma, Yichun Zhang, Yousef Al-Jazzazi et al.

Inaccuracies in existing or generated clinical text may lead to serious adverse consequences, especially if it is a misdiagnosis or incorrect treatment suggestion. With Large Language Models (LLMs) increasingly being used across diverse healthcare applications, comprehensive evaluation through dedicated benchmarks is crucial. However, such datasets remain scarce, especially across diverse languages and contexts. In this paper, we introduce MedErrBench, the first multilingual benchmark for error detection, localization, and correction, developed under the guidance of experienced clinicians. Based on an expanded taxonomy of ten common error types, MedErrBench covers English, Arabic and Chinese, with natural clinical cases annotated and reviewed by domain experts. We assessed the performance of a range of general-purpose, language-specific, and medical-domain language models across all three tasks. Our results reveal notable performance gaps, particularly in non-English settings, highlighting the need for clinically grounded, language-aware systems. By making MedErrBench and our evaluation protocols publicly-available, we aim to advance multilingual clinical NLP to promote safer and more equitable AI-based healthcare globally. The dataset is available in the supplementary material. An anonymized version of the dataset is available at: https://github.com/congboma/MedErrBench.

NAMar 30, 2018
Low Mach number limit of some staggered schemes for compressible barotropic flows

R. Herbin, J. -C Latché, Khaled Saleh

In this paper, we study the behaviour at low Mach number of numerical schemes based on staggered discretizations for the barotropic Navier-Stokes equations. Three time discretizations are considered: the implicit-in-time scheme and two non-iterative pressure correction schemes. The last two schemes differ by the discretization of the convection term: linearly implicit for the first one, so the resulting scheme is unconditionnally stable, and explicit for the second one, so the scheme is stable under a CFL condition involving the material velocity only. We rigorously prove that these three variants are asymptotic preserving in the following sense: for a given mesh and a given time step, a sequence of solutions obtained with a sequence of vanishing Mach numbers tend to a solution of a standard scheme for incompressible flows. This convergence result is obtained by mimicking the proof already known in the continuous case.

NAOct 31, 2016
A Positive and Entropy-Satisfying Finite Volume Scheme for the Baer-Nunziato Model

Frédéric Coquel, Jean-Marc Hérard, Khaled Saleh

We present a relaxation scheme for approximating the entropy dissipating weak solutions of the Baer-Nunziato two-phase flow model. This relaxation scheme is straightforwardly obtained as an extension of the relaxation scheme designed in [16] for the isentropic Baer-Nunziato model and consequently inherits its main properties. To our knowledge, this is the only existing scheme for which the approximated phase fractions, phase densities and phase internal energies are proven to remain positive without any restrictive condition other than a classical fully computable CFL condition. For ideal gas and stiffened gas equations of state, real values of the phasic speeds of sound are also proven to be maintained by the numerical scheme. It is also the only scheme for which a discrete entropy inequality is proven, under a CFL condition derived from the natural sub-characteristic condition associated with the relaxation approximation. This last property, which ensures the non-linear stability of the numerical method, is satisfied for any admissible equation of state. We provide a numerical study for the convergence of the approximate solutions towards some exact Riemann solutions. The numerical simulations show that the relaxation scheme compares well with two of the most popular existing schemes available for the Baer-Nunziato model, namely Schwendeman-Wahle-Kapila's Godunov-type scheme [39] and Toro-Tokareva's HLLC scheme [42]. The relaxation scheme also shows a higher precision and a lower computational cost (for comparable accuracy) than a standard numerical scheme used in the nuclear industry, namely Rusanov's scheme. Finally, we assess the good behavior of the scheme when approximating vanishing phase solutions.

LGSep 19, 2022
Traffic incident duration prediction via a deep learning framework for text description encoding

Artur Grigorev, Adriana-Simona Mihaita, Khaled Saleh et al.

Predicting the traffic incident duration is a hard problem to solve due to the stochastic nature of incident occurrence in space and time, a lack of information at the beginning of a reported traffic disruption, and lack of advanced methods in transport engineering to derive insights from past accidents. This paper proposes a new fusion framework for predicting the incident duration from limited information by using an integration of machine learning with traffic flow/speed and incident description as features, encoded via several Deep Learning methods (ANN autoencoder and character-level LSTM-ANN sentiment classifier). The paper constructs a cross-disciplinary modelling approach in transport and data science. The approach improves the incident duration prediction accuracy over the top-performing ML models applied to baseline incident reports. Results show that our proposed method can improve the accuracy by $60\%$ when compared to standard linear or support vector regression models, and a further $7\%$ improvement with respect to the hybrid deep learning auto-encoded GBDT model which seems to outperform all other models. The application area is the city of San Francisco, rich in both traffic incident logs (Countrywide Traffic Accident Data set) and past historical traffic congestion information (5-minute precision measurements from Caltrans Performance Measurement System).

CVSep 20, 2022
Traffic Accident Risk Forecasting using Contextual Vision Transformers

Khaled Saleh, Artur Grigorev, Adriana-Simona Mihaita

Recently, the problem of traffic accident risk forecasting has been getting the attention of the intelligent transportation systems community due to its significant impact on traffic clearance. This problem is commonly tackled in the literature by using data-driven approaches that model the spatial and temporal incident impact, since they were shown to be crucial for the traffic accident risk forecasting problem. To achieve this, most approaches build different architectures to capture the spatio-temporal correlations features, making them inefficient for large traffic accident datasets. Thus, in this work, we are proposing a novel unified framework, namely a contextual vision transformer, that can be trained in an end-to-end approach which can effectively reason about the spatial and temporal aspects of the problem while providing accurate traffic accident risk predictions. We evaluate and compare the performance of our proposed methodology against baseline approaches from the literature across two large-scale traffic accident datasets from two different geographical locations. The results have shown a significant improvement with roughly 2\% in RMSE score in comparison to previous state-of-art works (SoTA) in the literature. Moreover, our proposed approach has outperformed the SoTA technique over the two datasets while only requiring 23x fewer computational requirements.

LGMar 20, 2024
Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification

Artur Grigorev, Khaled Saleh, Yuming Ou et al.

This research showcases the innovative integration of Large Language Models into machine learning workflows for traffic incident management, focusing on the classification of incident severity using accident reports. By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of severity classification across several machine learning algorithms. Our contributions are threefold. First, we present an extensive comparison of various machine learning models paired with multiple large language models for feature extraction, aiming to identify the optimal combinations for accurate incident severity classification. Second, we contrast traditional feature engineering pipelines with those enhanced by language models, showcasing the superiority of language-based feature engineering in processing unstructured text. Third, our study illustrates how merging baseline features from accident reports with language-based features can improve the severity classification accuracy. This comprehensive approach not only advances the field of incident management but also highlights the cross-domain application potential of our methodology, particularly in contexts requiring the prediction of event outcomes from unstructured textual data or features translated into textual representation. Specifically, our novel methodology was applied to three distinct datasets originating from the United States, the United Kingdom, and Queensland, Australia. This cross-continental application underlines the robustness of our approach, suggesting its potential for widespread adoption in improving incident management processes globally.

SYJun 3, 2025
Automated Traffic Incident Response Plans using Generative Artificial Intelligence: Part 1 -- Building the Incident Response Benchmark

Artur Grigorev, Khaled Saleh, Jiwon Kim et al.

Traffic incidents remain a critical public safety concern worldwide, with Australia recording 1,300 road fatalities in 2024, which is the highest toll in 12 years. Similarly, the United States reports approximately 6 million crashes annually, raising significant challenges in terms of a fast reponse time and operational management. Traditional response protocols rely on human decision-making, which introduces potential inconsistencies and delays during critical moments when every minute impacts both safety outcomes and network performance. To address this issue, we propose a novel Incident Response Benchmark that uses generative artificial intelligence to automatically generate response plans for incoming traffic incidents. Our approach aims to significantly reduce incident resolution times by suggesting context-appropriate actions such as variable message sign deployment, lane closures, and emergency resource allocation adapted to specific incident characteristics. First, the proposed methodology uses real-world incident reports from the Performance Measurement System (PeMS) as training and evaluation data. We extract historically implemented actions from these reports and compare them against AI-generated response plans that suggest specific actions, such as lane closures, variable message sign announcements, and/or dispatching appropriate emergency resources. Second, model evaluations reveal that advanced generative AI models like GPT-4o and Grok 2 achieve superior alignment with expert solutions, demonstrated by minimized Hamming distances (averaging 2.96-2.98) and low weighted differences (approximately 0.27-0.28). Conversely, while Gemini 1.5 Pro records the lowest count of missed actions, its extremely high number of unnecessary actions (1547 compared to 225 for GPT-4o) indicates an over-triggering strategy that reduces the overall plan efficiency.

CVDec 3, 2020
Pedestrian Trajectory Prediction using Context-Augmented Transformer Networks

Khaled Saleh

Forecasting the trajectory of pedestrians in shared urban traffic environments is still considered one of the challenging problems facing the development of autonomous vehicles (AVs). In the literature, this problem is often tackled using recurrent neural networks (RNNs). Despite the powerful capabilities of RNNs in capturing the temporal dependency in the pedestrians' motion trajectories, they were argued to be challenged when dealing with longer sequential data. Thus, in this work, we are introducing a framework based on the transformer networks that were shown recently to be more efficient and outperformed RNNs in many sequential-based tasks. We relied on a fusion of the past positional information, agent interactions information and scene physical semantics information as an input to our framework in order to provide a robust trajectory prediction of pedestrians. We have evaluated our framework on two real-life datasets of pedestrians in shared urban traffic environments and it has outperformed the compared baseline approaches in both short-term and long-term prediction horizons.

CVJun 8, 2020
Fast Synthetic LiDAR Rendering via Spherical UV Unwrapping of Equirectangular Z-Buffer Images

Mohammed Hossny, Khaled Saleh, Mohammed Attia et al.

LiDAR data is becoming increasingly essential with the rise of autonomous vehicles. Its ability to provide 360deg horizontal field of view of point cloud, equips self-driving vehicles with enhanced situational awareness capabilities. While synthetic LiDAR data generation pipelines provide a good solution to advance the machine learning research on LiDAR, they do suffer from a major shortcoming, which is rendering time. Physically accurate LiDAR simulators (e.g. Blensor) are computationally expensive with an average rendering time of 14-60 seconds per frame for urban scenes. This is often compensated for via using 3D models with simplified polygon topology (low poly assets) as is the case of CARLA (Dosovitskiy et al., 2017). However, this comes at the price of having coarse grained unrealistic LiDAR point clouds. In this paper, we present a novel method to simulate LiDAR point cloud with faster rendering time of 1 sec per frame. The proposed method relies on spherical UV unwrapping of Equirectangular Z-Buffer images. We chose Blensor (Gschwandtner et al., 2011) as the baseline method to compare the point clouds generated using the proposed method. The reported error for complex urban landscapes is 4.28cm for a scanning range between 2-120 meters with Velodyne HDL64-E2 parameters. The proposed method reported a total time per frame to 3.2 +/- 0.31 seconds per frame. In contrast, the BlenSor baseline method reported 16.2 +/- 1.82 seconds.

LGJun 4, 2020
Refined Continuous Control of DDPG Actors via Parametrised Activation

Mohammed Hossny, Julie Iskander, Mohammed Attia et al.

In this paper, we propose enhancing actor-critic reinforcement learning agents by parameterising the final actor layer which produces the actions in order to accommodate the behaviour discrepancy of different actuators, under different load conditions during interaction with the environment. We propose branching the action producing layer in the actor to learn the tuning parameter controlling the activation layer (e.g. Tanh and Sigmoid). The learned parameters are then used to create tailored activation functions for each actuator. We ran experiments on three OpenAI Gym environments, i.e. Pendulum-v0, LunarLanderContinuous-v2 and BipedalWalker-v2. Results have shown an average of 23.15% and 33.80% increase in total episode reward of the LunarLanderContinuous-v2 and BipedalWalker-v2 environments, respectively. There was no significant improvement in Pendulum-v0 environment but the proposed method produces a more stable actuation signal compared to the state-of-the-art method. The proposed method allows the reinforcement learning actor to produce more robust actions that accommodate the discrepancy in the actuators' response functions. This is particularly useful for real life scenarios where actuators exhibit different response functions depending on the load and the interaction with the environment. This also simplifies the transfer learning problem by fine tuning the parameterised activation layers instead of retraining the entire policy every time an actuator is replaced. Finally, the proposed method would allow better accommodation to biological actuators (e.g. muscles) in biomechanical systems.

CVMay 22, 2019
Domain Adaptation for Vehicle Detection from Bird's Eye View LiDAR Point Cloud Data

Khaled Saleh, Ahmed Abobakr, Mohammed Attia et al.

Point cloud data from 3D LiDAR sensors are one of the most crucial sensor modalities for versatile safety-critical applications such as self-driving vehicles. Since the annotations of point cloud data is an expensive and time-consuming process, therefore recently the utilisation of simulated environments and 3D LiDAR sensors for this task started to get some popularity. With simulated sensors and environments, the process for obtaining an annotated synthetic point cloud data became much easier. However, the generated synthetic point cloud data are still missing the artefacts usually exist in point cloud data from real 3D LiDAR sensors. As a result, the performance of the trained models on this data for perception tasks when tested on real point cloud data is degraded due to the domain shift between simulated and real environments. Thus, in this work, we are proposing a domain adaptation framework for bridging this gap between synthetic and real point cloud data. Our proposed framework is based on the deep cycle-consistent generative adversarial networks (CycleGAN) architecture. We have evaluated the performance of our proposed framework on the task of vehicle detection from a bird's eye view (BEV) point cloud images coming from real 3D LiDAR sensors. The framework has shown competitive results with an improvement of more than 7% in average precision score over other baseline approaches when tested on real BEV point cloud images.

CVApr 22, 2019
Real-time Intent Prediction of Pedestrians for Autonomous Ground Vehicles via Spatio-Temporal DenseNet

Khaled Saleh, Mohammed Hossny, Saeid Nahavandi

Understanding the behaviors and intentions of humans are one of the main challenges autonomous ground vehicles still faced with. More specifically, when it comes to complex environments such as urban traffic scenes, inferring the intentions and actions of vulnerable road users such as pedestrians become even harder. In this paper, we address the problem of intent action prediction of pedestrians in urban traffic environments using only image sequences from a monocular RGB camera. We propose a real-time framework that can accurately detect, track and predict the intended actions of pedestrians based on a tracking-by-detection technique in conjunction with a novel spatio-temporal DenseNet model. We trained and evaluated our framework based on real data collected from urban traffic environments. Our framework has shown resilient and competitive results in comparison to other baseline approaches. Overall, we achieved an average precision score of 84.76% with a real-time performance at 20 FPS.

NAJul 5, 2017
A Convergent Staggered Scheme for the Variable Density Incompressible Navier-Stokes Equations

Jean-Claude Latché, Khaled Saleh

In this paper, we analyze a scheme for the time-dependent variable density Navier-Stokes equations. The algorithm is implicit in time, and the space approximation is based on a low-order staggered non-conforming finite element, the so-called Rannacher-Turek element. The convection term in the momentum balance equation is discretized by a finite volume technique, in such a way that a solution obeys a discrete kinetic energy balance, and the mass balance is approximated by an upwind finite volume method. We first show that the scheme preserves the stability properties of the continuous problem (L $\infty$-estimate for the density, L $\infty$ (L 2)-and L 2 (H 1)-estimates for the velocity), which yields, by a topological degree technique, the existence of a solution. Then, invoking compactness arguments and passing to the limit in the scheme, we prove that any sequence of solutions (obtained with a sequence of discretizations the space and time step of which tend to zero) converges up to the extraction of a subsequence to a weak solution of the continuous problem.