Marcellin Atemkeng

LG
h-index: 18
Papers: 17
Citations: 16
Novelty: 26%
AI Score: 46

17 Papers

LG · May 26
Few-shot Cross-country Generalization of Tabular Machine Learning and Foundation Models for Childhood Anemia Prediction under Distribution Shift

Yusuf Brima, Marcellin Atemkeng, Lansana Hassim Kallon et al.

Childhood anemia affects around 40% of children aged 6-59 months globally and arises from heterogeneous factors, limiting model generalizability. We evaluate a transformer-based tabular foundation model against classical supervised methods under cross-country and data-scarce settings. We used DHS data from 16 countries across Africa, Asia, Latin America, the Caucasus, and the Middle East (n=68,856). We compared Logistic Regression, XGBoost, LightGBM, and TabPFN v2.6. Performance was assessed using AUC-ROC, Brier score, and ECE. Generalization was evaluated using leave-one-country-out (LOCO), reverse-LOCO, and few-shot settings. Subgroup analyses included sex, age, residence, maternal education, and wealth. Feature importance was estimated using SHAP. TabPFN outperformed classical models in low-data regimes (<200 samples), showing higher discrimination and better calibration. Across countries, it achieved the lowest Brier score (0.042) and ECE (0.203). Under full-data settings, AUC-ROC ranged from 0.59-0.76 with small between-model differences ($\leq 0.05$). LOCO performance was stable (0.58-0.69), driven by country context. Reverse-LOCO showed asymmetric transferability. Subgroup performance was consistent with no systematic demographic bias. SHAP identified child age, altitude, and height-for-age z-score as dominant predictors, followed by wealth and maternal education. Performance in childhood anemia prediction is driven more by population variation than model choice. TabPFN provides advantages in low-resource settings through improved discrimination and calibration, highlighting foundation models as promising tools for data-scarce global health prediction.
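
The two calibration metrics named in this abstract, Brier score and ECE, can be illustrated with a minimal sketch. It assumes binary labels, predicted probabilities in [0, 1], and equal-width bins for ECE; the abstract does not state the binning scheme, so `n_bins` and the toy data are illustrative assumptions rather than the authors' setup.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Equal-width-bin ECE: size-weighted mean of |observed rate - mean predicted prob|."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


# Toy data (hypothetical, not from the paper).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.9]
brier = brier_score(y_true, y_prob)
ece = expected_calibration_error(y_true, y_prob, n_bins=2)
print(round(brier, 3), round(ece, 3))
```

Lower values are better for both metrics; a model can rank cases well (high AUC-ROC) and still be poorly calibrated, which is why the two are reported separately.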

LG · Feb 6, 2023
Label Assisted Autoencoder for Anomaly Detection in Power Generation Plants

Marcellin Atemkeng, Victor Osanyindoro, Rockefeller Rockefeller et al.

One of the critical factors that drive the economic development of a country and guarantee the sustainability of its industries is the constant availability of electricity, which is usually provided by the national electric grid. However, in developing countries, where new companies are constantly emerging, including in the telecommunication industry, the electricity supply remains unstable. These companies therefore have to rely on generators to guarantee their full functionality. The generators depend on fuel, and the consumption rate is often high if it is not monitored properly. Monitoring is usually carried out by a (non-expert) human. In some cases this can be a tedious process, as some companies have reported exaggeratedly high consumption rates. This work proposes a label assisted autoencoder for anomaly detection in the fuel consumed by power generating plants. In addition to the autoencoder model, we added a labelling assistance module that checks whether an observation is labelled; if it is, the label is used to check the veracity of the corresponding anomaly classification given a threshold. A consensus is then reached on whether training should stop, whether the threshold should be updated, or whether training should continue with the search for hyper-parameters. Results show that the proposed model is highly efficient at detecting anomalies, with a detection accuracy of $97.20\%$, which outperforms the existing model's $96.1\%$ accuracy on the same dataset. In addition, the proposed model is able to classify the anomalies according to their degree of severity.

LG · Aug 1, 2022
Visual Interpretable and Explainable Deep Learning Models for Brain Tumor MRI and COVID-19 Chest X-ray Images

Yusuf Brima, Marcellin Atemkeng

Deep learning shows promise for medical image analysis but lacks interpretability, hindering adoption in healthcare. Attribution techniques that explain model reasoning may increase trust in deep learning among clinical stakeholders. This paper aimed to evaluate attribution methods for illuminating how deep neural networks analyze medical images. Using adaptive path-based gradient integration, we attributed predictions from brain tumor MRI and COVID-19 chest X-ray datasets made by recent deep convolutional neural network models. The technique highlighted possible biomarkers, exposed model biases, and offered insights into the links between input and prediction. Our analysis demonstrates the method's ability to elucidate model reasoning on these datasets. The resulting attributions show promise for improving deep learning transparency for domain experts by revealing the rationale behind predictions. This study advances model interpretability to increase trust in deep learning among healthcare stakeholders.

LG · Sep 30, 2023
Anomaly Detection in Power Generation Plants with Generative Adversarial Networks

Marcellin Atemkeng, Toheeb Aduramomi Jimoh

Anomaly detection is a critical task that involves the identification of data points that deviate from a predefined pattern, useful for fraud detection and related activities. Various techniques are employed for anomaly detection, but recent research indicates that deep learning methods, with their ability to discern intricate data patterns, are well-suited for this task. This study explores the use of Generative Adversarial Networks (GANs) for anomaly detection in power generation plants. The dataset used in this investigation comprises fuel consumption records obtained from power generation plants operated by a telecommunications company. The data was initially collected in response to observed irregularities in the fuel consumption patterns of the generating sets situated at the company's base stations. The dataset was divided into anomalous and normal data points based on specific variables, with 64.88% classified as normal and 35.12% as anomalous. An analysis of feature importance, employing the random forest classifier, revealed that Running Time Per Day exhibited the highest relative importance. A GAN model was trained and fine-tuned both with and without data augmentation, with the goal of increasing the dataset size to enhance performance. The generator model consisted of five dense layers using the tanh activation function, while the discriminator comprised six dense layers, each integrated with a dropout layer to prevent overfitting. Following data augmentation, the model achieved an accuracy rate of 98.99%, compared to 66.45% before augmentation. This demonstrates that the model nearly perfectly classified data points into normal and anomalous categories, with the augmented data significantly enhancing the GAN's performance in anomaly detection. Consequently, this study recommends the use of GANs, particularly when using large datasets, for effective anomaly detection.

LG · Mar 19
Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring

Corneille Niyonkuru, Marcellin Atemkeng, Gabin Maxime Nguegnang et al.

Reliable anomaly detection in distributed power plant monitoring systems is essential for ensuring operational continuity and reducing maintenance costs, particularly in regions where telecom operators heavily rely on diesel generators. However, this task is challenged by extreme class imbalance, lack of interpretability, and potential fairness issues across regional clusters. In this work, we propose a supervised ML framework that integrates ensemble methods (LightGBM, XGBoost, Random Forest, CatBoost, GBDT, AdaBoost) and baseline models (Support Vector Machine, K-Nearest Neighbors, Multilayer Perceptrons, and Logistic Regression) with advanced resampling techniques (SMOTE with Tomek Links and ENN) to address imbalance in a dataset of diesel generator operations in Cameroon. Interpretability is achieved through SHAP (SHapley Additive exPlanations), while fairness is quantified using the Disparate Impact Ratio (DIR) across operational clusters. We further evaluate model generalization using Maximum Mean Discrepancy (MMD) to capture domain shifts between regions. Experimental results show that ensemble models consistently outperform baselines, with LightGBM achieving an F1-score of 0.99 and minimal bias across clusters (DIR $\approx 0.95$). SHAP analysis highlights fuel consumption rate and runtime per day as dominant predictors, providing actionable insights for operators. Our findings demonstrate that it is possible to balance performance, interpretability, and fairness in anomaly detection, paving the way for more equitable and explainable AI systems in industrial power management. Finally, beyond offline evaluation, we also discuss how the trained models can be deployed in practice for real-time monitoring. We show how containerized services can process data in real time, deliver low-latency predictions, and provide interpretable outputs for operators.
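
As a rough illustration of the fairness metric mentioned above, the sketch below computes a Disparate Impact Ratio as the lowest positive-prediction rate across groups divided by the highest. This min/max form is one common way to extend the two-group ratio to several clusters and is an assumption here; the abstract does not give the exact definition used, and the cluster names and predictions are hypothetical.

```python
def positive_rate(predictions):
    """Fraction of samples predicted as positive (1 = flagged as anomalous)."""
    return sum(predictions) / len(predictions)


def disparate_impact_ratio(preds_by_group):
    """Ratio of the lowest to the highest positive-prediction rate across groups.

    A value near 1.0 means the groups are flagged at similar rates.
    """
    rates = [positive_rate(p) for p in preds_by_group.values()]
    lowest, highest = min(rates), max(rates)
    if highest == 0:
        return 1.0  # no group is ever flagged: treat as parity
    return lowest / highest


# Hypothetical binary predictions per operational cluster.
preds = {
    "cluster_a": [1, 0, 0, 1, 0],  # rate 0.4
    "cluster_b": [1, 0, 0, 0, 0],  # rate 0.2
}
dir_value = disparate_impact_ratio(preds)
print(round(dir_value, 2))
```

Under this definition the reported DIR of about 0.95 would mean the least-flagged cluster is flagged at roughly 95% of the rate of the most-flagged one.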

LG · Mar 19
An Optimised Greedy-Weighted Ensemble Framework for Financial Loan Default Prediction

Ezekiel Nii Noye Nortey, Jones Asante-Koranteng, Marcellin Atemkeng et al.

Accurate prediction of loan defaults is a central challenge in credit risk management, particularly in modern financial datasets characterised by nonlinear relationships, class imbalance, and evolving borrower behaviour. Traditional statistical models and static ensemble methods often struggle to maintain reliable performance under such conditions. This study proposes an Optimised Greedy-Weighted Ensemble framework for loan default prediction that dynamically allocates model weights based on empirical predictive performance. The framework integrates multiple machine learning classifiers, with their hyperparameters first optimised using Particle Swarm Optimisation. Model predictions are then combined via a regularised greedy weighting mechanism. In parallel, a neural-network-based meta-learner is employed within a stacked ensemble to capture higher-order relationships among model outputs. Experiments conducted on the Lending Club dataset demonstrate that the proposed framework improves predictive performance compared with individual classifiers. The BlendNet ensemble achieved the strongest results with an AUC of 0.80, a macro-average F1-score of 0.73, and a default recall of 0.81. Calibration analysis further shows that tree-based ensembles such as Extra Trees and Gradient Boosting provide the most reliable probability estimates, while the stacked ensemble offers superior ranking capability. Feature analysis using Recursive Feature Elimination identifies revolving utilisation, annual income, and debt-to-income ratio as the most influential predictors of loan default. These findings demonstrate that performance-driven ensemble weighting can improve both predictive accuracy and interpretability in credit risk modelling. The proposed framework provides a scalable data-driven approach to support institutional credit assessment, risk monitoring, and financial decision-making.

LG · Dec 25, 2025
Robustness and Scalability Of Machine Learning for Imbalanced Clinical Data in Emergency and Critical Care

Yusuf Brima, Marcellin Atemkeng

Emergency and intensive care environments require predictive models that are both accurate and computationally efficient, yet clinical data in these settings are often severely imbalanced. Such skewness undermines model reliability, particularly for rare but clinically crucial outcomes, making robustness and scalability essential for real-world usage. In this paper, we systematically evaluate the robustness and scalability of classical machine learning models on imbalanced tabular data from MIMIC-IV-ED and eICU. Class imbalance was quantified using complementary metrics, and we compared the performance of tree-based methods, the state-of-the-art TabNet deep learning model, and a custom lightweight residual network. TabResNet was designed as a computationally efficient alternative to TabNet, replacing its complex attention mechanisms with a streamlined residual architecture to maintain representational capacity for real-time clinical use. All models were optimized via a Bayesian hyperparameter search and assessed on predictive performance, robustness to increasing imbalance, and computational scalability. Our results, on seven clinically vital predictive tasks, show that tree-based methods, particularly XGBoost, consistently achieved the most stable performance across imbalance levels and scaled efficiently with sample size. Deep tabular models degraded more sharply under imbalance and incurred higher computational costs, while TabResNet provided a lighter alternative to TabNet but did not surpass ensemble benchmarks. These findings indicate that in emergency and critical care, robustness to imbalance and computational scalability could outweigh architectural complexity. Tree-based ensemble methods currently offer the most practical and clinically feasible choice, equipping practitioners with a framework for selecting models suited to high-stakes, time-sensitive environments.

CV · Dec 22, 2022
Creating awareness about security and safety on highways to mitigate wildlife-vehicle collisions by detecting and recognizing wildlife fences using deep learning and drone technology

Irene Nandutu, Marcellin Atemkeng, Patrice Okouma et al.

In South Africa, it is a common practice for people to leave their vehicles beside the road when traveling long distances for a short comfort break. This practice might increase human encounters with wildlife, threatening their security and safety. Here we intend to create awareness about wildlife fencing, using drone technology and computer vision algorithms to recognize and detect wildlife fences and associated features. We collected data at Amakhala and Lalibela private game reserves in the Eastern Cape, South Africa. We used wildlife electric fence data containing single and double fences for the classification task. Additionally, we used aerial and still annotated images extracted from the drone and still cameras for the segmentation and detection tasks. The model training results from the drone camera outperformed those from the still camera. Generally, poor model performance is attributed to (1) over-decompression of images and (2) the ability of drone cameras to capture more details on images for the machine learning model to learn as compared to still cameras that capture only the front view of the wildlife fence. We argue that our model can be deployed on client-edge devices to inform people about the presence and significance of wildlife fencing, which minimizes human encounters with wildlife, thereby mitigating wildlife-vehicle collisions.

CV · May 7
Bridging visual saliency and large language models for explainable deep learning in medical imaging

Paul Valery Nguezet, Elie Tagne Fute, Yusuf Brima et al.

The opaque nature of deep learning models remains a significant barrier to their clinical adoption in medical imaging. This paper presents a multimodal explainability framework that bridges the gap between convolutional neural network (CNN) predictions and clinically actionable insights for brain tumor classification, leveraging large language models (LLMs) to deliver human-interpretable diagnostic narratives. The proposed framework operates through three coupled stages. First, nine CNN architectures are extended with a dual-output hybrid formulation that simultaneously optimises a classification head and a segmentation head, enabling spatially richer feature learning. Second, visual saliency attribution methods, namely Grad-CAM, Grad-CAM++, and ScoreCAM, are applied to generate class-discriminative heatmaps, which are subsequently refined into binary tumor masks via an adaptive percentile thresholding pipeline. Third, the resulting masks are mapped onto the Harvard-Oxford cortical atlas to translate pixel-level evidence into named neuroanatomical structures, and the extracted findings are encoded into a structured JSON file that conditions three LLMs (Grok3, Mistral, and LLaMA) to generate coherent, radiological-style diagnostic reports. Evaluated on a dataset of 4,834 contrast-enhanced T1-weighted brain MRI images spanning three tumor classes, InceptionResNetV2 achieved the highest classification performance and Grad-CAM++ yielded the best segmentation overlap. Among the language models, Grok3 led in lexical diversity and coherence, while LLaMA achieved the highest readability score. By integrating visual, anatomical, and linguistic modalities into a unified pipeline, the framework produces explanations that are technically grounded and meaningfully interpretable, advancing the transparency and clinical accountability of artificial intelligence assisted brain tumor diagnosis.

AI · Apr 29
Unsupervised Electrofacies Classification and Porosity Characterization in the Offshore Keta Basin Using Wireline Logs

Hamdiya Adams, Theophilus Ansah-Narh, Daniel Kwadwo Asiedu et al.

This study presents an unsupervised machine learning workflow for electrofacies analysis in the offshore Keta Basin, Ghana, where core data are scarce. Six standard wireline logs from Well C were analysed over a depth interval comprising approximately $11{,}195$ samples. K-means clustering was applied in multivariate log space, with the clustering structure evaluated using inertia and silhouette diagnostics. Four clusters were identified, supported by an average silhouette coefficient of approximately $0.50$, indicating moderate but meaningful separation. The resulting electrofacies exhibit systematic, depth-continuous patterns associated with variations in clay content, porosity, and rock framework properties, forming a geological continuum from shale-dominated to cleaner sandstone-dominated units. The results demonstrate that log-only, unsupervised clustering supported by quantitative metrics provides a robust and reproducible framework for subsurface characterisation. The proposed workflow offers a practical tool for early-stage formation evaluation in frontier offshore basins and a foundation for future integrated studies.
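
To make the silhouette diagnostic concrete, here is a minimal standard-library sketch of the average silhouette coefficient for an already-clustered set of points. It assumes Euclidean distance and uses toy one-dimensional points with hypothetical labels; it does not reproduce the study's six-log feature space or its K-means fit. `math.dist` requires Python 3.8 or later.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette coefficient using Euclidean distance.

    For each point: a = mean distance to its own cluster (excluding itself),
    b = smallest mean distance to any other cluster, s = (b - a) / max(a, b).
    """
    clusters = set(labels)
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [math.dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
               if l == lab and j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in clusters if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two well-separated groups on a line (hypothetical values).
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = mean_silhouette(points, [0, 0, 1, 1])
bad = mean_silhouette(points, [0, 1, 0, 1])
print(round(good, 3), round(bad, 3))
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster, which is why a mean of about 0.50 is read as moderate separation.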

LG · Apr 29
Anomaly Detection in Soil Heavy Metal Contamination Using Unsupervised Learning for Environmental Risk Assessment

Isaac Tettey Adjokatse, Samuel Senyo Koranteng, George Yamoah Afrifa et al.

Soil contamination by heavy metals poses a persistent environmental and public health concern in rapidly urbanising regions of Ghana, particularly at unregulated waste disposal sites. This study applies an unsupervised machine learning framework to detect and characterise anomalous heavy metal contamination patterns in soils from twelve waste sites and residential controls in the Central Region of Ghana. Concentrations of eight metals (As, Cd, Cr, Cu, Hg, Ni, Pb, Zn) were analysed alongside standard health risk indices, including the Hazard Index (HI) and Incremental Lifetime Cancer Risk (ILCR). Isolation Forest and PCA reconstruction error each identified $12$ anomalous samples ($15.4\%$ of $78$ samples), while DBSCAN detected no density-isolated noise points. A consensus approach isolated six robust anomalies ($7.7\%$), all spatially concentrated at a single site (S3). Anomalies exhibited approximately $70$--$80\%$ higher mean HI values than normal samples, with all consensus anomalies exceeding the HI$=1$ threshold. PCA reconstruction error showed a strong positive association with HI ($r \approx 0.8$), indicating consistency between multivariate deviation and health risk. Three distinct anomaly types were identified: extreme Cu enrichment at S3, anomalously low Ni at S4/S5, and moderate multi-metal (Pb--Zn) co-elevation at S9--S12. The results demonstrate that unsupervised machine learning provides granular, objective insight beyond aggregate indices, enabling targeted site prioritisation and risk-informed environmental management.
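
One simple way to form a consensus across detectors is a voting rule: a sample counts as a robust anomaly if at least `min_votes` detectors flag it. The abstract does not specify the consensus rule, so the threshold of two votes, the sample identifiers, and the flagged sets below are all hypothetical.

```python
from collections import Counter


def consensus_anomalies(flags_by_detector, min_votes=2):
    """Return sample ids flagged by at least `min_votes` detectors, sorted."""
    votes = Counter()
    for flagged in flags_by_detector.values():
        votes.update(set(flagged))  # each detector votes once per sample
    return sorted(s for s, v in votes.items() if v >= min_votes)


# Hypothetical flagged sample ids per detector.
flags = {
    "isolation_forest": {"s3_1", "s3_2", "s4_1"},
    "pca_reconstruction": {"s3_1", "s3_2", "s9_1"},
    "dbscan": set(),  # a detector may flag nothing at all
}
consensus = consensus_anomalies(flags)
print(consensus)
```

With a detector that flags nothing, requiring agreement from all detectors would always return an empty set, so a two-of-three rule is the more natural choice in a setting like this one.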

CV · Apr 25
Hierarchical Spatio-Channel Clustering for Efficient Model Compression in Medical Image Analysis

Sisipho Hamlomo, Marcellin Atemkeng, Habte Tadesse Likassa et al.

Convolutional neural networks (CNNs) have become increasingly difficult to deploy in resource-constrained environments due to their large memory and computational requirements. Although low-rank compression methods can reduce this burden, most existing approaches compress spatial and channel redundancy independently and therefore do not fully exploit the localised structure within convolutional feature maps. This paper proposes a hierarchical spatio-channel low-rank compression framework for CNNs that exploits redundancy across spatial regions and channel activations. Unlike conventional methods, which apply a uniform decomposition across an entire layer, the proposed approach first partitions feature maps into spatial regions, then groups channels according to their co-activation patterns within each region, and finally applies rank-adaptive SVD to each resulting spatio-channel cluster. The method is evaluated on an AlexNet-based brain tumour MRI classification model and compared with Global SVD and Tucker decomposition under \(3\times\) and \(6\times\) compression budgets. Our method outperforms both baselines, reducing FLOPs from \(8.21\,\mathrm{G}\) to \(1.55\,\mathrm{G}\) (\(81.1\%\) reduction), achieving a \(1.38\times\) inference speed-up, and increasing classification accuracy from \(87.76\%\) to \(89.80\%\). The method also improves the macro \(F_1\)-score and performance on challenging classes such as meningioma. A hyper-parameter trade-off analysis demonstrates that the framework provides Pareto-optimal configurations, enabling control over the balance between compression and predictive performance. Moderate clustering with adaptive rank selection yields strong results. Bootstrap standard errors are reported for all classification metrics.

IV · Feb 21, 2024
A Systematic Review of Low-Rank and Local Low-Rank Matrix Approximation in Big Data Medical Imaging

Sisipho Hamlomo, Marcellin Atemkeng, Yusuf Brima et al.

The large volume and complexity of medical imaging datasets are bottlenecks for storage, transmission, and processing. To tackle these challenges, the application of low-rank matrix approximation (LRMA) and its derivative, local LRMA (LLRMA), has demonstrated potential. A detailed analysis of the literature identifies LRMA and LLRMA methods applied to various imaging modalities, and the challenges and limitations associated with existing LRMA and LLRMA methods are addressed. We note a significant shift towards a preference for LLRMA in the medical imaging field since 2015, demonstrating its potential and effectiveness in capturing complex structures in medical data compared to LRMA. Acknowledging the limitations of shallow similarity methods used with LLRMA, we suggest advanced semantic image segmentation as a similarity measure, explaining in detail how it can be used to measure similar patches and its feasibility. We note that LRMA and LLRMA are mainly applied to unstructured medical data, and we propose extending their application to different medical data types, including structured and semi-structured. This paper also discusses how LRMA and LLRMA can be applied to regular data with missing entries and the impact of inaccuracies in predicting missing values. We discuss the impact of patch size and propose the use of random search (RS) to determine the optimal patch size. To enhance feasibility, a hybrid approach using Bayesian optimization and RS is proposed, which could improve the application of LRMA and LLRMA in medical imaging.

LG · May 13, 2025
Clustering-Based Low-Rank Matrix Approximation for Medical Image Compression

Sisipho Hamlomo, Marcellin Atemkeng

Medical images are inherently high-resolution and contain locally varying structures crucial for diagnosis. Efficient compression must preserve diagnostic fidelity while minimizing redundancy. Low-rank matrix approximation (LoRMA) techniques have shown strong potential for image compression by capturing global correlations; however, they often fail to adapt to local structural variations across regions of interest. To address this, we introduce an adaptive LoRMA, which partitions a medical image into overlapping patches, groups structurally similar patches into clusters using k-means, and performs SVD within each cluster. We derive the overall compression factor accounting for patch overlap and analyze how patch size influences compression efficiency and computational cost. While applicable to any data with high local variation, we focus on medical imaging due to its pronounced local variability. We evaluate and compare our adaptive LoRMA against global SVD across four imaging modalities: MRI, ultrasound, CT scan, and chest X-ray. Results demonstrate that adaptive LoRMA effectively preserves structural integrity, edge details, and diagnostic relevance, measured by PSNR, SSIM, MSE, IoU, and EPI. Adaptive LoRMA minimizes block artifacts and residual errors, particularly in pathological regions, consistently outperforming global SVD in PSNR, SSIM, IoU, EPI, and achieving lower MSE. It prioritizes clinically salient regions while allowing aggressive compression in non-critical regions, optimizing storage efficiency. Although adaptive LoRMA requires higher processing time, its diagnostic fidelity justifies the overhead for high-compression applications.

LG · Feb 3, 2025
Unsupervised anomaly detection in large-scale estuarine acoustic telemetry data

Siphendulwe Zaza, Marcellin Atemkeng, Taryn S. Murray et al.

Acoustic telemetry data plays a vital role in understanding the behaviour and movement of aquatic animals. However, these datasets, which often consist of millions of individual data points, frequently contain anomalous movements that pose significant challenges. Traditionally, anomalous movements are identified either manually or through basic statistical methods, approaches that are time-consuming and prone to high rates of unidentified anomalies in large datasets. This study focuses on the development of automated classifiers for a large telemetry dataset comprising detections from fifty acoustically tagged dusky kob monitored in the Breede Estuary, South Africa. Using an array of 16 acoustic receivers deployed throughout the estuary between 2016 and 2021, we collected over three million individual data points. We present detailed guidelines for data pre-processing, resampling strategies, the labelling process, feature engineering, data splitting methodologies, and the selection and interpretation of machine learning and deep learning models for anomaly detection. Among the evaluated models, the neural network autoencoder (NN-AE) demonstrated superior performance, aided by our proposed threshold-finding algorithm. NN-AE achieved a high recall with no false normals (i.e., no misclassifications of anomalous movements as normal patterns), a critical factor in ensuring that no true anomalies are overlooked. In contrast, other models exhibited false normal fractions exceeding 0.9, indicating they failed to detect the majority of true anomalies; a significant limitation for telemetry studies where undetected anomalies can distort interpretations of movement patterns. While the NN-AE's performance highlights its reliability and robustness in detecting anomalies, it faced challenges in accurately learning normal movement patterns when these patterns gradually deviated from anomalous ones.
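
The idea of a reconstruction-error threshold that yields no false normals can be sketched as follows. This is not the authors' threshold-finding algorithm, which the abstract does not describe; it is a simple stand-in that assumes a labelled validation set and picks the smallest reconstruction error seen among labelled anomalies, so that every labelled anomaly sits at or above the threshold. All values are hypothetical.

```python
def threshold_with_no_false_normals(errors, is_anomaly):
    """Largest threshold that still flags every labelled anomaly (error >= threshold)."""
    anomaly_errors = [e for e, a in zip(errors, is_anomaly) if a]
    if not anomaly_errors:
        raise ValueError("need at least one labelled anomaly")
    return min(anomaly_errors)


def false_normal_fraction(errors, is_anomaly, threshold):
    """Share of true anomalies whose error falls below the threshold (missed anomalies)."""
    anomaly_errors = [e for e, a in zip(errors, is_anomaly) if a]
    missed = sum(1 for e in anomaly_errors if e < threshold)
    return missed / len(anomaly_errors)


# Hypothetical autoencoder reconstruction errors on a labelled validation set.
errors = [0.02, 0.03, 0.05, 0.40, 0.55]
is_anomaly = [False, False, False, True, True]

threshold = threshold_with_no_false_normals(errors, is_anomaly)
fnf = false_normal_fraction(errors, is_anomaly, threshold)
false_alarms = sum(1 for e, a in zip(errors, is_anomaly) if not a and e >= threshold)
print(threshold, fnf, false_alarms)
```

The trade-off is that lowering the threshold to catch every anomaly can raise the number of normal movements flagged as anomalous, so the false alarm count is worth checking alongside the false normal fraction.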

LG · Feb 11, 2022
Predicting Fuel Consumption in Power Generation Plants using Machine Learning and Neural Networks

Gabin Maxime Nguegnang, Marcellin Atemkeng, Theophilus Ansah-Narh et al.

The instability of power generation from national grids has led industries (e.g., telecommunication) to rely on plant generators to run their businesses. However, these secondary generators create additional challenges such as fuel leakages in and out of the system and perturbations in the fuel level gauges. Consequently, telecommunication operators face a constant need for fuel to supply diesel generators. With the increase in fuel prices due to socio-economic factors, excessive fuel consumption and fuel pilferage become a problem, and this affects the smooth running of the network companies. In this work, we compared four machine learning algorithms (i.e., Gradient Boosting, Random Forest, Neural Network, and Lasso) to predict the amount of fuel consumed by a power generation plant. After evaluating the predictive accuracy of these models, the Gradient Boosting model outperformed the other three regressor models with the highest Nash efficiency value of 99.1%.
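
The Nash efficiency reported here is assumed to be the Nash-Sutcliffe efficiency, defined as one minus the ratio of the squared prediction error to the variance of the observations around their mean. The sketch below computes it on hypothetical fuel consumption values, not on the paper's data.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means no better than predicting the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values are constant; NSE is undefined")
    return 1.0 - residual / variance


# Hypothetical daily fuel consumption (litres) and model predictions.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]
nse = nash_sutcliffe_efficiency(observed, predicted)
print(round(nse, 3))
```

On this scale, a value of 99.1% corresponds to an NSE of 0.991, meaning the residual error is under 1% of the variance in the observed consumption.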

AI · Aug 23, 2021
Automatic Speech Recognition And Limited Vocabulary: A Survey

Jean Louis K. E. Fendji, Diane C. M. Tala, Blaise O. Yenke et al.

Automatic Speech Recognition (ASR) is an active field of research due to its large number of applications and the proliferation of interfaces or computing devices that can support speech processing. However, the bulk of applications are based on well-resourced languages that overshadow under-resourced ones. Yet, ASR represents an undeniable means to promote such languages, especially when designing human-to-human or human-to-machine systems involving illiterate people. An approach to design an ASR system targeting under-resourced languages is to start with a limited vocabulary. ASR using a limited vocabulary is a subset of the speech recognition problem that focuses on the recognition of a small number of words or sentences. This paper aims to provide a comprehensive view of mechanisms behind ASR systems as well as techniques, tools, projects, recent contributions, and possible future directions in ASR using a limited vocabulary. This work consequently provides a way forward when designing an ASR system using limited vocabulary. Although an emphasis is put on limited vocabulary, most of the tools and techniques reported in this survey can be applied to ASR systems in general.