27.1LGMay 6
Balancing Stability and Plasticity in Sequentially Trained Early-Exiting Neural NetworksAlaa Zniber, Ouassim Karrakchou, Mounir Ghogho
Early-exiting neural networks enable adaptive inference by allowing inputs to exit at intermediate classifiers, reducing computation for easy samples while maintaining high accuracy. In practice, exits can be trained sequentially by incrementally adding them to a shared backbone; however, this sequential training can cause newly introduced exits to interfere with previously learned ones, degrading the performance of earlier classifiers. We address this problem by retaining the knowledge embedded in existing exits while allowing new ones to specialize. We propose two alternative approaches that operate at different levels of the model. The first constrains learning by protecting parameters that are important for previously trained exits, while the second preserves the output distributions of earlier exits as the network adapts. These alternatives directly reflect the stability-plasticity trade-off studied in continual learning. Accordingly, we leverage \textit{Elastic Weight Consolidation} to constrain critical weights and \textit{Learning without Forgetting} to preserve output distributions. Experiments on standard benchmarks show that our approaches consistently improve early-exit performance, achieving higher accuracy over existing sequential training methods and significant performance speedups at low computational budgets.
SDAug 31, 2023
Dynamic nsNet2: Efficient Deep Noise Suppression with Early ExitingRiccardo Miccini, Alaa Zniber, Clément Laroche et al.
Although deep learning has made strides in the field of deep noise suppression, leveraging deep architectures on resource-constrained devices still proved challenging. Therefore, we present an early-exiting model based on nsNet2 that provides several levels of accuracy and resource savings by halting computations at different stages. Moreover, we adapt the original architecture by splitting the information flow to take into account the injected dynamism. We show the trade-offs between performance and computational complexity based on established metrics.
COMP-PHApr 7, 2023
EPINN-NSE: Enhanced Physics-Informed Neural Networks for Solving Navier-Stokes EquationsAyoub Farkane, Mounir Ghogho, Mustapha Oudani et al.
Fluid mechanics is a fundamental field in engineering and science. Solving the Navier-Stokes equation (NSE) is critical for understanding the behavior of fluids. However, the NSE is a complex partial differential equation that is difficult to solve, and classical numerical methods can be computationally expensive. In this paper, we present an innovative approach for solving the NSE using Physics Informed Neural Networks (PINN) and several novel techniques that improve their performance. The first model is based on an assumption that involves approximating the velocity component by employing the derivative of a stream function. This assumption serves to simplify the system and guarantees that the velocity adheres to the divergence-free equation. We also developed a second more flexible model that approximates the solution without any assumptions. The proposed models can effectively solve two-dimensional NSE. Moreover, we successfully applied the second model to solve the three-dimensional NSE. The results show that the models can efficiently and accurately solve the NSE in three dimensions. These approaches offer several advantages, including high trainability, flexibility, and efficiency.
CLSep 28, 2022
Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language ModelsYouness Moukafih, Mounir Ghogho, Kamel Smaili
Recently, Supervised Contrastive Learning (SCL) has been shown to achieve excellent performance in most classification tasks. In SCL, a neural network is trained to optimize two objectives: pull an anchor and positive samples together in the embedding space, and push the anchor apart from the negatives. However, these two different objectives may conflict, requiring trade-offs between them during optimization. In this work, we formulate the SCL problem as a Multi-Objective Optimization problem for the fine-tuning phase of RoBERTa language model. Two methods are utilized to solve the optimization problem: (i) the linear scalarization (LS) method, which minimizes a weighted linear combination of pertask losses; and (ii) the Exact Pareto Optimal (EPO) method which finds the intersection of the Pareto front with a given preference vector. We evaluate our approach on several GLUE benchmark tasks, without using data augmentations, memory banks, or generating adversarial examples. The empirical results show that the proposed learning strategy significantly outperforms a strong competitive contrastive learning baseline
LGAug 31, 2023
Multi-Objective Decision Transformers for Offline Reinforcement LearningAbdelghani Ghanem, Philippe Ciblat, Mounir Ghogho
Offline Reinforcement Learning (RL) is structured to derive policies from static trajectory data without requiring real-time environment interactions. Recent studies have shown the feasibility of framing offline RL as a sequence modeling task, where the sole aim is to predict actions based on prior context using the transformer architecture. However, the limitation of this single task learning approach is its potential to undermine the transformer model's attention mechanism, which should ideally allocate varying attention weights across different tokens in the input context for optimal prediction. To address this, we reformulate offline RL as a multi-objective optimization problem, where the prediction is extended to states and returns. We also highlight a potential flaw in the trajectory representation used for sequence modeling, which could generate inaccuracies when modeling the state and return distributions. This is due to the non-smoothness of the action distribution within the trajectory dictated by the behavioral policy. To mitigate this issue, we introduce action space regions to the trajectory representation. Our experiments on D4RL benchmark locomotion tasks reveal that our propositions allow for more effective utilization of the attention mechanism in the transformer model, resulting in performance that either matches or outperforms current state-of-the art methods.
LGAug 5, 2024
Strategic Federated Learning: Application to Smart Meter Data ClusteringHassan Mohamad, Chao Zhang, Samson Lasaulce et al.
Federated learning (FL) involves several clients that share with a fusion center (FC), the model each client has trained with its own data. Conventional FL, which can be interpreted as an estimation or distortion-based approach, ignores the final use of model information (MI) by the FC and the other clients. In this paper, we introduce a novel FL framework in which the FC uses an aggregate version of the MI to make decisions that affect the client's utility functions. Clients cannot choose the decisions and can only use the MI reported to the FC to maximize their utility. Depending on the alignment between the client and FC utilities, the client may have an individual interest in adding strategic noise to the model. This general framework is stated and specialized to the case of clustering, in which noisy cluster representative information is reported. This is applied to the problem of power consumption scheduling. In this context, utility non-alignment occurs, for instance, when the client wants to consume when the price of electricity is low, whereas the FC wants the consumption to occur when the total power is the lowest. This is illustrated with aggregated real data from Ausgrid \cite{ausgrid}. Our numerical analysis clearly shows that the client can increase his utility by adding noise to the model reported to the FC. Corresponding results and source codes can be downloaded from \cite{source-code}.
CLSep 2, 2024
A multilingual training strategy for low resource Text to SpeechAsma Amalas, Mounir Ghogho, Mohamed Chetouani et al.
Recent speech technologies have led to produce high quality synthesised speech due to recent advances in neural Text to Speech (TTS). However, such TTS models depend on extensive amounts of data that can be costly to produce and is hardly scalable to all existing languages, especially that seldom attention is given to low resource languages. With techniques such as knowledge transfer, the burden of creating datasets can be alleviated. In this paper, we therefore investigate two aspects; firstly, whether data from social media can be used for a small TTS dataset construction, and secondly whether cross lingual transfer learning (TL) for a low resource language can work with this type of data. In this aspect, we specifically assess to what extent multilingual modeling can be leveraged as an alternative to training on monolingual corporas. To do so, we explore how data from foreign languages may be selected and pooled to train a TTS model for a target low resource language. Our findings show that multilingual pre-training is better than monolingual pre-training at increasing the intelligibility and naturalness of the generated speech.
LGJul 29, 2022
KG-NSF: Knowledge Graph Completion with a Negative-Sample-Free ApproachAdil Bahaj, Safae Lhazmir, Mounir Ghogho
Knowledge Graph (KG) completion is an important task that greatly benefits knowledge discovery in many fields (e.g. biomedical research). In recent years, learning KG embeddings to perform this task has received considerable attention. Despite the success of KG embedding methods, they predominantly use negative sampling, resulting in increased computational complexity as well as biased predictions due to the closed world assumption. To overcome these limitations, we propose \textbf{KG-NSF}, a negative sampling-free framework for learning KG embeddings based on the cross-correlation matrices of embedding vectors. It is shown that the proposed method achieves comparable link prediction performance to negative sampling-based methods while converging much faster.
LGSep 5, 2023
Dynamic Early Exiting Predictive Coding Neural NetworksAlaa Zniber, Ouassim Karrakchou, Mounir Ghogho
Internet of Things (IoT) sensors are nowadays heavily utilized in various real-world applications ranging from wearables to smart buildings passing by agrotechnology and health monitoring. With the huge amounts of data generated by these tiny devices, Deep Learning (DL) models have been extensively used to enhance them with intelligent processing. However, with the urge for smaller and more accurate devices, DL models became too heavy to deploy. It is thus necessary to incorporate the hardware's limited resources in the design process. Therefore, inspired by the human brain known for its efficiency and low power consumption, we propose a shallow bidirectional network based on predictive coding theory and dynamic early exiting for halting further computations when a performance threshold is surpassed. We achieve comparable accuracy to VGG-16 in image classification on CIFAR-10 with fewer parameters and less computational complexity.
SIOct 3, 2022
When Infodemic Meets Epidemic: a Systematic Literature ReviewChaimae Asaad, Imane Khaouja, Mounir Ghogho et al.
Epidemics and outbreaks present arduous challenges requiring both individual and communal efforts. Social media offer significant amounts of data that can be leveraged for bio-surveillance. They also provide a platform to quickly and efficiently reach a sizeable percentage of the population, hence their potential impact on various aspects of epidemic mitigation. The general objective of this systematic literature review is to provide a methodical overview of the integration of social media in different epidemic-related contexts. Three research questions were conceptualized for this review, resulting in over 10000 publications collected in the first PRISMA stage, 129 of which were selected for inclusion. A thematic method-oriented synthesis was undertaken and identified 5 main themes related to social media enabled epidemic surveillance, misinformation management, and mental health. Findings uncover a need for more robust applications of the lessons learned from epidemic post-mortem documentation. A vast gap exists between retrospective analysis of epidemic management and result integration in prospective studies. Harnessing the full potential of social media in epidemic related tasks requires streamlining the results of epidemic forecasting, public opinion understanding and misinformation propagation, all while keeping abreast of potential mental health implications. Pro-active prevention has thus become vital for epidemic curtailment and containment.
CRSep 15, 2025
Collaborative P4-SDN DDoS Detection and Mitigation with Early-Exit Neural NetworksOuassim Karrakchou, Alaa Zniber, Anass Sebbar et al.
Distributed Denial of Service (DDoS) attacks pose a persistent threat to network security, requiring timely and scalable mitigation strategies. In this paper, we propose a novel collaborative architecture that integrates a P4-programmable data plane with an SDN control plane to enable real-time DDoS detection and response. At the core of our approach is a split early-exit neural network that performs partial inference in the data plane using a quantized Convolutional Neural Network (CNN), while deferring uncertain cases to a Gated Recurrent Unit (GRU) module in the control plane. This design enables high-speed classification at line rate with the ability to escalate more complex flows for deeper analysis. Experimental evaluation using real-world DDoS datasets demonstrates that our approach achieves high detection accuracy with significantly reduced inference latency and control plane overhead. These results highlight the potential of tightly coupled ML-P4-SDN systems for efficient, adaptive, and low-latency DDoS defense.
AIFeb 15, 2025Code
USER-VLM 360: Personalized Vision Language Models with User-aware Tuning for Social Human-Robot InteractionsHamed Rahimi, Adil Bahaj, Mouad Abrini et al.
The integration of vision-language models into robotic systems constitutes a significant advancement in enabling machines to interact with their surroundings in a more intuitive manner. While VLMs offer rich multimodal reasoning, existing approaches lack user-specific adaptability, often relying on generic interaction paradigms that fail to account for individual behavioral, contextual, or socio-emotional nuances. When customization is attempted, ethical concerns arise from unmitigated biases in user data, risking exclusion or unfair treatment. To address these dual challenges, we propose User-VLM 360°, a holistic framework integrating multimodal user modeling with bias-aware optimization. Our approach features: (1) user-aware tuning that adapts interactions in real time using visual-linguistic signals; (2) bias mitigation via preference optimization; and (3) curated 360° socio-emotive interaction datasets annotated with demographic, emotion, and relational metadata. Evaluations across eight benchmarks demonstrate state-of-the-art results: +35.3% F1 in personalized VQA, +47.5% F1 in facial features understanding, 15% bias reduction, and 30X speedup over baselines. Ablation studies confirm component efficacy, and deployment on the Pepper robot validates real-time adaptability across diverse users. We open-source parameter-efficient 3B/10B models and an ethical verification framework for responsible adaptation.
AISep 24, 2024
AsthmaBot: Multi-modal, Multi-Lingual Retrieval Augmented Generation For Asthma Patient SupportAdil Bahaj, Mounir Ghogho
Asthma rates have risen globally, driven by environmental and lifestyle factors. Access to immediate medical care is limited, particularly in developing countries, necessitating automated support systems. Large Language Models like ChatGPT (Chat Generative Pre-trained Transformer) and Gemini have advanced natural language processing in general and question answering in particular, however, they are prone to producing factually incorrect responses (i.e. hallucinations). Retrieval-augmented generation systems, integrating curated documents, can improve large language models' performance and reduce the incidence of hallucination. We introduce AsthmaBot, a multi-lingual, multi-modal retrieval-augmented generation system for asthma support. Evaluation of an asthma-related frequently asked questions dataset shows AsthmaBot's efficacy. AsthmaBot has an added interactive and intuitive interface that integrates different data modalities (text, images, videos) to make it accessible to the larger public. AsthmaBot is available online via \url{asthmabot.datanets.org}.
CCDec 4, 2025
Hardware-aware Neural Architecture Search of Early Exiting Networks on Edge AcceleratorsAlaa Zniber, Arne Symons, Ouassim Karrakchou et al.
Advancements in high-performance computing and cloud technologies have enabled the development of increasingly sophisticated Deep Learning (DL) models. However, the growing demand for embedded intelligence at the edge imposes stringent computational and energy constraints, challenging the deployment of these large-scale models. Early Exiting Neural Networks (EENN) have emerged as a promising solution, allowing dynamic termination of inference based on input complexity to enhance efficiency. Despite their potential, EENN performance is highly influenced by the heterogeneity of edge accelerators and the constraints imposed by quantization, affecting accuracy, energy efficiency, and latency. Yet, research on the automatic optimization of EENN design for edge hardware remains limited. To bridge this gap, we propose a hardware-aware Neural Architecture Search (NAS) framework that systematically integrates the effects of quantization and hardware resource allocation to optimize the placement of early exit points within a network backbone. Experimental results on the CIFAR-10 dataset demonstrate that our NAS framework can discover architectures that achieve over a 50\% reduction in computational costs compared to conventional static networks, making them more suitable for deployment in resource-constrained edge environments.
CLOct 4, 2025Code
Cross-Lingual Multi-Granularity Framework for Interpretable Parkinson's Disease Diagnosis from SpeechIlias Tougui, Mehdi Zakroum, Mounir Ghogho
Parkinson's Disease (PD) affects over 10 million people worldwide, with speech impairments in up to 89% of patients. Current speech-based detection systems analyze entire utterances, potentially overlooking the diagnostic value of specific phonetic elements. We developed a granularity-aware approach for multilingual PD detection using an automated pipeline that extracts time-aligned phonemes, syllables, and words from recordings. Using Italian, Spanish, and English datasets, we implemented a bidirectional LSTM with multi-head attention to compare diagnostic performance across the different granularity levels. Phoneme-level analysis achieved superior performance with AUROC of 93.78% +- 2.34% and accuracy of 92.17% +- 2.43%. This demonstrates enhanced diagnostic capability for cross-linguistic PD detection. Importantly, attention analysis revealed that the most informative speech features align with those used in established clinical protocols: sustained vowels (/a/, /e/, /o/, /i/) at phoneme level, diadochokinetic syllables (/ta/, /pa/, /la/, /ka/) at syllable level, and /pataka/ sequences at word level. Source code will be available at https://github.com/jetliqs/clearpd.
LGJul 21, 2024
Revisiting Neighborhood Aggregation in Graph Neural Networks for Node Classification using Statistical Signal ProcessingMounir Ghogho
We delve into the issue of node classification within graphs, specifically reevaluating the concept of neighborhood aggregation, which is a fundamental component in graph neural networks (GNNs). Our analysis reveals conceptual flaws within certain benchmark GNN models when operating under the assumption of edge-independent node labels, a condition commonly observed in benchmark graphs employed for node classification. Approaching neighborhood aggregation from a statistical signal processing perspective, our investigation provides novel insights which may be used to design more efficient GNN models.
63.4LGMay 7
Entropy-Regularized Adjoint Matching for Offline RLAbdelghani Ghanem, Mounir Ghogho
Integrating expressive generative policies, such as flow-matching models, into offline reinforcement learning (RL) allows agents to capture complex, multi-modal behaviors. While Q-learning with Adjoint Matching (QAM) stabilizes policy optimization via the continuous adjoint method, it remains inherently bound to the fixed behavior distribution. This dependence induces a \textit{popularity bias} that can suppress high-reward actions in low-density regions, and creates a \textit{support binding} that restricts off-manifold exploration. Existing workarounds, such as appending \textit{residual} Gaussian policies, often re-introduce the expressivity bottlenecks associated with unimodal distributions. In this work, we propose \textit{Maximum Entropy Adjoint Matching} (ME-AM), a unified framework that addresses these limitations within the continuous flow formulation. ME-AM incorporates two mechanisms: (1) a Mirror Descent entropy maximization objective that mitigates the popularity bias to facilitate the extraction of optimal policies from offline datasets, and (2) a \textit{Mixture Behavior Prior} that mathematically broadens the geometric support to encompass out-of-distribution high-reward regions. By exploring this extended geometry, ME-AM identifies robust actions while preserving the absolute continuity of the generative vector field. Empirically, ME-AM demonstrates competitive or superior performance compared to prior state-of-the-art (SOTA) methods across a diverse suite of sparse-reward continuous control environments.
CVMar 6, 2024
Redefining cystoscopy with ai: bladder cancer diagnosis using an efficient hybrid cnn-transformer modelMeryem Amaouche, Ouassim Karrakchou, Mounir Ghogho et al.
Bladder cancer ranks within the top 10 most diagnosed cancers worldwide and is among the most expensive cancers to treat due to the high recurrence rates which require lifetime follow-ups. The primary tool for diagnosis is cystoscopy, which heavily relies on doctors' expertise and interpretation. Therefore, annually, numerous cases are either undiagnosed or misdiagnosed and treated as urinary infections. To address this, we suggest a deep learning approach for bladder cancer detection and segmentation which combines CNNs with a lightweight positional-encoding-free transformer and dual attention gates that fuse self and spatial attention for feature enhancement. The architecture suggested in this paper is efficient making it suitable for medical scenarios that require real time inference. Experiments have proven that this model addresses the critical need for a balance between computational efficiency and diagnostic accuracy in cystoscopic imaging as despite its small size it rivals large models in performance.
CLAug 22, 2025
MizanQA: Benchmarking Large Language Models on Moroccan Legal Question AnsweringAdil Bahaj, Mounir Ghogho
The rapid advancement of large language models (LLMs) has significantly propelled progress in natural language processing (NLP). However, their effectiveness in specialized, low-resource domains-such as Arabic legal contexts-remains limited. This paper introduces MizanQA (pronounced Mizan, meaning "scale" in Arabic, a universal symbol of justice), a benchmark designed to evaluate LLMs on Moroccan legal question answering (QA) tasks, characterised by rich linguistic and legal complexity. The dataset draws on Modern Standard Arabic, Islamic Maliki jurisprudence, Moroccan customary law, and French legal influences. Comprising over 1,700 multiple-choice questions, including multi-answer formats, MizanQA captures the nuances of authentic legal reasoning. Benchmarking experiments with multilingual and Arabic-focused LLMs reveal substantial performance gaps, highlighting the need for tailored evaluation metrics and culturally grounded, domain-specific LLM development.
LGJan 3, 2024
Applications of machine learning and IoT for Outdoor Air Pollution Monitoring and Prediction: A Systematic Literature ReviewIhsane Gryech, Chaimae Assad, Mounir Ghogho et al.
According to the World Health Organization (WHO), air pollution kills seven million people every year. Outdoor air pollution is a major environmental health problem affecting low, middle, and high-income countries. In the past few years, the research community has explored IoT-enabled machine learning applications for outdoor air pollution prediction. The general objective of this paper is to systematically review applications of machine learning and Internet of Things (IoT) for outdoor air pollution prediction and the combination of monitoring sensors and input features used. Two research questions were formulated for this review. 1086 publications were collected in the initial PRISMA stage. After the screening and eligibility phases, 37 papers were selected for inclusion. A cost-based analysis was conducted on the findings to highlight high-cost monitoring, low-cost IoT and hybrid enabled prediction. Three methods of prediction were identified: time series, feature-based and spatio-temporal. This review's findings identify major limitations in applications found in the literature, namely lack of coverage, lack of diversity of data and lack of inclusion of context-specific features. This review proposes directions for future research and underlines practical implications in healthcare, urban planning, global synergy and smart cities.
LGSep 22, 2025
Confidence-gated training for efficient early-exit neural networksSaad Mokssit, Ouassim Karrakchou, Alejandro Mousist et al.
Early-exit neural networks reduce inference cost by enabling confident predictions at intermediate layers. However, joint training often leads to gradient interference, with deeper classifiers dominating optimization. We propose Confidence-Gated Training (CGT), a paradigm that conditionally propagates gradients from deeper exits only when preceding exits fail. This encourages shallow classifiers to act as primary decision points while reserving deeper layers for harder inputs. By aligning training with the inference-time policy, CGT mitigates overthinking, improves early-exit accuracy, and preserves efficiency. Experiments on the Indian Pines and Fashion-MNIST benchmarks show that CGT lowers average inference cost while improving overall accuracy, offering a practical solution for deploying deep models in resource-constrained environments.
CYAug 22, 2025
PediatricsMQA: a Multi-modal Pediatrics Question Answering BenchmarkAdil Bahaj, Oumaima Fadi, Mohamed Chetouani et al.
Large language models (LLMs) and vision-augmented LLMs (VLMs) have significantly advanced medical informatics, diagnostics, and decision support. However, these models exhibit systematic biases, particularly age bias, compromising their reliability and equity. This is evident in their poorer performance on pediatric-focused text and visual question-answering tasks. This bias reflects a broader imbalance in medical research, where pediatric studies receive less funding and representation despite the significant disease burden in children. To address these issues, a new comprehensive multi-modal pediatric question-answering benchmark, PediatricsMQA, has been introduced. It consists of 3,417 text-based multiple-choice questions (MCQs) covering 131 pediatric topics across seven developmental stages (prenatal to adolescent) and 2,067 vision-based MCQs using 634 pediatric images from 67 imaging modalities and 256 anatomical regions. The dataset was developed using a hybrid manual-automatic pipeline, incorporating peer-reviewed pediatric literature, validated question banks, existing benchmarks, and existing QA resources. Evaluating state-of-the-art open models, we find dramatic performance drops in younger cohorts, highlighting the need for age-aware methods to ensure equitable AI support in pediatric care.
SDJul 26, 2025
Improving Deep Learning-based Respiratory Sound Analysis with Frequency Selection and Attention MechanismNouhaila Fraihi, Ouassim Karrakchou, Mounir Ghogho
Accurate classification of respiratory sounds requires deep learning models that effectively capture fine-grained acoustic features and long-range temporal dependencies. Convolutional Neural Networks (CNNs) are well-suited for extracting local time-frequency patterns but are limited in modeling global context. In contrast, transformer-based models can capture long-range dependencies, albeit with higher computational demands. To address these limitations, we propose a compact CNN-Temporal Self-Attention (CNN-TSA) network that integrates lightweight self-attention into an efficient CNN backbone. Central to our approach is a Frequency Band Selection (FBS) module that suppresses noisy and non-informative frequency regions, substantially improving accuracy and reducing FLOPs by up to 50%. We also introduce age-specific models to enhance robustness across diverse patient groups. Evaluated on the SPRSound-2022/2023 and ICBHI-2017 lung sound datasets, CNN-TSA with FBS sets new benchmarks on SPRSound and achieves state-of-the-art performance on ICBHI, all with a significantly smaller computational footprint. Furthermore, integrating FBS into an existing transformer baseline yields a new record on ICBHI, confirming FBS as an effective drop-in enhancement. These results demonstrate that our framework enables reliable, real-time respiratory sound analysis suitable for deployment in resource-constrained settings.
DSJul 9, 2025
Designing Robust Software Sensors for Nonlinear Systems via Neural Networks and Adaptive Sliding Mode ControlAyoub Farkane, Mohamed Boutayeb, Mustapha Oudani et al.
Accurate knowledge of the state variables in a dynamical system is critical for effective control, diagnosis, and supervision, especially when direct measurements of all states are infeasible. This paper presents a novel approach to designing software sensors for nonlinear dynamical systems expressed in their most general form. Unlike traditional model-based observers that rely on explicit transformations or linearization, the proposed framework integrates neural networks with adaptive Sliding Mode Control (SMC) to design a robust state observer under a less restrictive set of conditions. The learning process is driven by available sensor measurements, which are used to correct the observer's state estimate. The training methodology leverages the system's governing equations as a physics-based constraint, enabling observer synthesis without access to ground-truth state trajectories. By employing a time-varying gain matrix dynamically adjusted by the neural network, the observer adapts in real-time to system changes, ensuring robustness against noise, external disturbances, and variations in system dynamics. Furthermore, we provide sufficient conditions to guarantee estimation error convergence, establishing a theoretical foundation for the observer's reliability. The methodology's effectiveness is validated through simulations on challenging examples, including systems with non-differentiable dynamics and varying observability conditions. These examples, which are often problematic for conventional techniques, serve to demonstrate the robustness and broad applicability of our approach. The results show rapid convergence and high accuracy, underscoring the method's potential for addressing complex state estimation challenges in real-world applications.
LGJul 9, 2025
PINN-Obs: Physics-Informed Neural Network-Based Observer for Nonlinear Dynamical SystemsAyoub Farkane, Mohamed Boutayeb, Mustapha Oudani et al.
State estimation for nonlinear dynamical systems is a critical challenge in control and engineering applications, particularly when only partial and noisy measurements are available. This paper introduces a novel Adaptive Physics-Informed Neural Network-based Observer (PINN-Obs) for accurate state estimation in nonlinear systems. Unlike traditional model-based observers, which require explicit system transformations or linearization, the proposed framework directly integrates system dynamics and sensor data into a physics-informed learning process. The observer adaptively learns an optimal gain matrix, ensuring convergence of the estimated states to the true system states. A rigorous theoretical analysis establishes formal convergence guarantees, demonstrating that the proposed approach achieves uniform error minimization under mild observability conditions. The effectiveness of PINN-Obs is validated through extensive numerical simulations on diverse nonlinear systems, including an induction motor model, a satellite motion system, and benchmark academic examples. Comparative experimental studies against existing observer designs highlight its superior accuracy, robustness, and adaptability.
CLApr 16, 2025
Gauging Overprecision in LLMs: An Empirical StudyAdil Bahaj, Hamed Rahimi, Mohamed Chetouani et al.
Recently, overconfidence in large language models (LLMs) has garnered considerable attention due to its fundamental importance in quantifying the trustworthiness of LLM generation. However, existing approaches prompt the \textit{black box LLMs} to produce their confidence (\textit{verbalized confidence}), which can be subject to many biases and hallucinations. Inspired by a different aspect of overconfidence in cognitive science called \textit{overprecision}, we designed a framework for its study in black box LLMs. This framework contains three main phases: 1) generation, 2) refinement and 3) evaluation. In the generation phase we prompt the LLM to generate answers to numerical questions in the form of intervals with a certain level of confidence. This confidence level is imposed in the prompt and not required for the LLM to generate as in previous approaches. We use various prompting techniques and use the same prompt multiple times to gauge the effects of randomness in the generation process. In the refinement phase, answers from the previous phase are refined to generate better answers. The LLM answers are evaluated and studied in the evaluation phase to understand its internal workings. This study allowed us to gain various insights into LLM overprecision: 1) LLMs are highly uncalibrated for numerical tasks 2) there is no correlation between the length of the interval and the imposed confidence level, which can be symptomatic of a a) lack of understanding of the concept of confidence or b) inability to adjust self-confidence by following instructions, {3) LLM numerical precision differs depending on the task, scale of answer and prompting technique 4) Refinement of answers doesn't improve precision in most cases. We believe this study offers new perspectives on LLM overconfidence and serves as a strong baseline for overprecision in LLMs.
CRNov 19, 2024
STRisk: A Socio-Technical Approach to Assess Hacking Breaches RiskHicham Hammouchi, Narjisse Nejjari, Ghita Mezzour et al.
Data breaches have begun to take on new dimensions and their prediction is becoming of great importance to organizations. Prior work has addressed this issue mainly from a technical perspective and neglected other interfering aspects such as the social media dimension. To fill this gap, we propose STRisk which is a predictive system where we expand the scope of the prediction task by bringing into play the social media dimension. We study over 3800 US organizations including both victim and non-victim organizations. For each organization, we design a profile composed of a variety of externally measured technical indicators and social factors. In addition, to account for unreported incidents, we consider the non-victim sample to be noisy and propose a noise correction approach to correct mislabeled organizations. We then build several machine learning models to predict whether an organization is exposed to experience a hacking breach. By exploiting both technical and social features, we achieve a Area Under Curve (AUC) score exceeding 98%, which is 12% higher than the AUC achieved using only technical features. Furthermore, our feature importance analysis reveals that open ports and expired certificates are the best technical predictors, while spreadability and agreeability are the best social predictors.
SPFeb 1, 2022
Experimental Investigation of Variational Mode Decomposition and Deep Learning for Short-Term Multi-horizon Residential Electric Load ForecastingMohamed Aymane Ahajjam, Daniel Bonilla Licea, Mounir Ghogho et al.
With the booming growth of advanced digital technologies, it has become possible for users as well as distributors of energy to obtain detailed and timely information about the electricity consumption of households. These technologies can also be used to forecast the household's electricity consumption (a.k.a. the load). In this paper, we investigate the use of Variational Mode Decomposition and deep learning techniques to improve the accuracy of the load forecasting problem. Although this problem has been studied in the literature, selecting an appropriate decomposition level and a deep learning technique providing better forecasting performance have garnered comparatively less attention. This study bridges this gap by studying the effect of six decomposition levels and five distinct deep learning networks. The raw load profiles are first decomposed into intrinsic mode functions using the Variational Mode Decomposition in order to mitigate their non-stationary aspect. Then, day, hour, and past electricity consumption data are fed as a three-dimensional input sequence to a four-level Wavelet Decomposition Network model. Finally, the forecast sequences related to the different intrinsic mode functions are combined to form the aggregate forecast sequence. The proposed method was assessed using load profiles of five Moroccan households from the Moroccan buildings' electricity consumption dataset (MORED) and was benchmarked against state-of-the-art time-series models and a baseline persistence model.
CRSep 3, 2021
Leveraging Open Threat Exchange (OTX) to Understand Spatio-Temporal Trends of Cyber Threats: Covid-19 Case StudyOthmane Cherqi, Hicham Hammouchi, Mounir Ghogho et al.
Understanding the properties exhibited by Spatial-temporal evolution of cyber attacks improve cyber threat intelligence. In addition, better understanding on threats patterns is a key feature for cyber threats prevention, detection, and management and for enhancing defenses. In this work, we study different aspects of emerging threats in the wild shared by 160,000 global participants form all industries. First, we perform an exploratory data analysis of the collected cyber threats. We investigate the most targeted countries, most common malwares and the distribution of attacks frequency by localisation. Second, we extract attacks' spreading patterns at country level. We model these behaviors using transition graphs decorated with probabilities of switching from a country to another. Finally, we analyse the extent to which cyber threats have been affected by the COVID-19 outbreak and sanitary measures imposed by governments to prevent the virus from spreading.
LGJul 15, 2020
GraphCL: Contrastive Self-Supervised Learning of Graph RepresentationsHakim Hafidi, Mounir Ghogho, Philippe Ciblat et al.
We propose Graph Contrastive Learning (GraphCL), a general framework for learning node representations in a self supervised manner. GraphCL learns node embeddings by maximizing the similarity between the representations of two randomly perturbed versions of the intrinsic features and link structure of the same node's local subgraph. We use graph neural networks to produce two representations of the same node and leverage a contrastive learning loss to maximize agreement between them. In both transductive and inductive learning setups, we demonstrate that our approach significantly outperforms the state-of-the-art in unsupervised learning on a number of node classification benchmarks.
CVDec 23, 2018
Deep Learning for Inferring the Surface Solar Irradiance from Sky ImageryMehdi Zakroum, Mounir Ghogho, Mustapha Faqir et al.
We present a novel approach to perform ground-based estimation and prediction of the surface solar irradiance with the view to predicting photovoltaic energy production. We propose the use of mini-batch k-means clustering to extract features, referred to as per cluster number of pixels (PCNP), from sky images taken by a low-cost fish eye camera. These features are first used to classify the sky as clear or cloudy using a single hidden layer neural network; the classification accuracy achieves 99.7%. If the sky is classified as cloudy, we propose to use a deep neural network having as input features the PCNP to predict intra-hour variability of the solar irradiance. Toward this objective, in this paper, we focus on estimating the deep neural network model relating the PCNP features and the solar irradiance, which is an important step before performing the prediction task. The proposed deep learning-based estimation approach is shown to have an accuracy of 95%.
CRDec 23, 2018
Exploratory Data Analysis of a Network Telescope Traffic and Prediction of Port Probing RatesMehdi Zakroum, Abdellah Houmz, Mounir Ghogho et al.
Understanding the properties exhibited by large scale network probing traffic would improve cyber threat intelligence. In addition, the prediction of probing rates is a key feature for security practitioners in their endeavors for making better operational decisions and for enhancing their defense strategy skills. In this work, we study different aspects of the traffic captured by a /20 network telescope. First, we perform an exploratory data analysis of the collected probing activities. The investigation includes probing rates at the port level, services interesting top network probers and the distribution of probing rates by geolocation. Second, we extract the network probers exploration patterns. We model these behaviors using transition graphs decorated with probabilities of switching from a port to another. Finally, we assess the capacity of Non-stationary Autoregressive and Vector Autoregressive models in predicting port probing rates as a first step towards using more robust models for better forecasting performance.
ROMay 21, 2018
Robotic Mobility Diversity Algorithm with Continuous Search SpaceDaniel Bonilla Licea, Des McLernon, Mounir Ghogho et al.
Small scale fading makes the wireless channel gain vary significantly over small distances and in the context of classical communication systems it can be detrimental to performance. But in the context of mobile robot (MR) wireless communications, we can take advantage of the fading using a mobility diversity algorithm (MDA) to deliberately locate the MR at a point where the channel gain is high. There are two classes of MDAs. In the first class, the MR explores various points, stops at each one to collect channel measurements and then locates the best position to establish communications. In the second class the MR moves, without stopping, along a continuous path while collecting channel measurements and then stops at the end of the path. It determines the best point to establish communications. Until now, the shape of the continuous path for such MDAs has been arbitrarily selected and currently there is no method to optimize it. In this paper, we propose a method to optimize such a path. Simulation results show that such optimized paths provide the MDAs with an increased performance, enabling them to experience higher channel gains while using less mechanical energy for the MR motion.
SYJun 3, 2015
Distributed Optimal Quantization and Power Allocation for Sensor Detection Via ConsensusEdmond Nurellari, Des McLernon, Mounir Ghogho et al.
We address the optimal transmit power allocation problem (from the sensor nodes (SNs) to the fusion center (FC)) for the decentralized detection of an unknown deterministic spatially uncorrelated signal which is being observed by a distributed wireless sensor network. We propose a novel fully distributed algorithm, in order to calculate the optimal transmit power allocation for each sensor node (SN) and the optimal number of quantization bits for the test statistic in order to match the channel capacity. The SNs send their quantized information over orthogonal uncorrelated channels to the FC which linearly combines them and makes a final decision. What makes this scheme attractive is that the SNs share with their neighbours just their individual transmit powers at the current states. As a result, the SN processing complexity is further reduced.