NA · Mar 18
Novel technique based on Léja Points Approximation for Log-determinant Estimation of Large matrices
Verlon Roel Mbingui, Antoine Tambue, Issa Karambal
The computation of the log-determinant of large, sparse, symmetric positive definite (SPD) matrices is essential in many areas of scientific computing, such as numerical linear algebra and machine learning. In low dimensions, Cholesky factorization is preferred, but in high dimensions its cost may be prohibitive because of memory limitations. Krylov subspace techniques circumvent this and have proven efficient, but they may be computationally expensive because of the orthogonalization they require. In this paper, we introduce a novel technique to estimate the log-determinant of a matrix using Léja points, whose implementation relies only on matrix multiplications and a rough estimate of the eigenvalue bounds of the matrix. By coupling Léja points interpolation with a randomized algorithm called Hutch++, we achieve substantial reductions in computational complexity while preserving significant accuracy compared with stochastic Lanczos quadrature. We establish approximation errors for the matrix function, together with multiplicative error bounds for the approximations obtained by this method. Numerical experiments confirm the effectiveness and scalability of the proposed method on both large sparse synthetic matrices (maximum likelihood in Gaussian Markov random fields) and large-scale real-world matrices.
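The idea in this abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes the identity $\log\det A = \operatorname{tr}(\log A)$, approximates $\log(A)V$ by Newton interpolation at Leja points on the interval given by the eigenvalue bounds, and feeds that into the standard Hutch++ trace estimator. All function names, the candidate-grid size, the polynomial degree, and the test matrix are assumptions chosen for illustration.

```python
import numpy as np

def leja_points(m, grid_size=2001):
    """Greedily pick m Leja points on [-2, 2] from a uniform candidate grid."""
    grid = np.linspace(-2.0, 2.0, grid_size)
    pts = [2.0]                                 # start at an endpoint
    prod = np.abs(grid - pts[0])                # running product of distances
    for _ in range(m - 1):
        nxt = grid[np.argmax(prod)]             # next point maximizes the product
        pts.append(nxt)
        prod *= np.abs(grid - nxt)
    return np.array(pts)

def divided_differences(x, fx):
    """Newton divided differences f[x_0], f[x_0, x_1], ..., f[x_0, ..., x_m]."""
    d = np.array(fx, dtype=float)
    for j in range(1, len(x)):
        d[j:] = (d[j:] - d[j - 1:-1]) / (x[j:] - x[:-j])
    return d

def log_matvec(A, V, lam_min, lam_max, degree=20):
    """Approximate log(A) @ V using only products with A (Newton form)."""
    c = (lam_max + lam_min) / 2.0               # shift: centre of the spectrum
    g = (lam_max - lam_min) / 4.0               # scale: maps spectrum into [-2, 2]
    xi = leja_points(degree + 1)
    d = divided_differences(xi, np.log(c + g * xi))
    W = np.array(V, dtype=float)
    out = d[0] * W
    for k in range(1, degree + 1):
        W = (A @ W - c * W) / g - xi[k - 1] * W  # W <- (A_scaled - xi_{k-1} I) W
        out += d[k] * W
    return out

def hutchpp_logdet(A, lam_min, lam_max, num_matvecs=60, degree=20, seed=0):
    """Hutch++ estimate of tr(log A) with a budget of num_matvecs block products."""
    rng = np.random.default_rng(seed)
    n, k = A.shape[0], num_matvecs // 3
    f = lambda V: log_matvec(A, V, lam_min, lam_max, degree)
    S = rng.choice([-1.0, 1.0], size=(n, k))    # sketching vectors
    G = rng.choice([-1.0, 1.0], size=(n, k))    # Hutchinson probe vectors
    Q, _ = np.linalg.qr(f(S))                   # basis for the dominant range
    G = G - Q @ (Q.T @ G)                       # project probes off that range
    return np.trace(Q.T @ f(Q)) + np.trace(G.T @ f(G)) / k

# Illustrative SPD test matrix (an assumption, not from the paper):
# tridiagonal with 4 on the diagonal and -1 off it.
n = 200
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
# Gershgorin discs give the rough eigenvalue bounds [2, 6].
estimate = hutchpp_logdet(A, 2.0, 6.0)
exact = np.linalg.slogdet(A)[1]
print(f"estimate = {estimate:.3f}, exact = {exact:.3f}")
```

The dense matrix is used only to keep the sketch short; the same routines work with any object that supports `A @ W`, such as a SciPy sparse matrix.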
CV · Mar 19
Vision Tiny Recursion Model (ViTRM): Parameter-Efficient Image Classification via Recursive State Refinement
Ange-Clément Akazan, Abdoulaye Koroko, Verlon Roel Mbingui et al.
The success of deep learning in computer vision has been driven by models of increasing scale, from deep Convolutional Neural Networks (CNNs) to large Vision Transformers (ViTs). While effective, these architectures are parameter-intensive and demand significant computational resources, limiting deployment in resource-constrained environments. Inspired by Tiny Recursive Models (TRM), which show that small recursive networks can solve complex reasoning tasks through iterative state refinement, we introduce the Vision Tiny Recursion Model (ViTRM): a parameter-efficient architecture that replaces the $L$-layer ViT encoder with a single tiny $k$-layer block ($k{=}3$) applied recursively $N$ times. Despite using up to $6\times$ and $84\times$ fewer parameters than CNN-based models and ViT, respectively, ViTRM maintains competitive performance on CIFAR-10 and CIFAR-100. This demonstrates that recursive computation is a viable, parameter-efficient alternative to architectural depth in vision.
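The weight-sharing idea can be sketched in a few lines of NumPy. This is a toy illustration, not the ViTRM architecture: the layer below (single-head attention plus a ReLU MLP, no biases, no patch embedding or classification head) and all names and sizes are assumptions. It only shows that applying one $k$-layer block $N$ times performs $kN$ layer applications while storing the parameters of $k$ layers.

```python
import numpy as np

def init_layer(rng, d, hidden):
    """Weights of one simplified transformer-style layer (hypothetical layout)."""
    shapes = {"Wq": (d, d), "Wk": (d, d), "Wv": (d, d), "Wo": (d, d),
              "W1": (d, hidden), "W2": (hidden, d)}
    return {name: 0.02 * rng.standard_normal(shape) for name, shape in shapes.items()}

def layer_norm(x, eps=1e-6):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def softmax(s):
    e = np.exp(s - s.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def apply_layer(p, x):
    """Pre-norm self-attention followed by a ReLU MLP, both with residuals."""
    h = layer_norm(x)
    attn = softmax((h @ p["Wq"]) @ (h @ p["Wk"]).T / np.sqrt(x.shape[-1]))
    x = x + attn @ (h @ p["Wv"]) @ p["Wo"]
    h = layer_norm(x)
    return x + np.maximum(h @ p["W1"], 0.0) @ p["W2"]

def recursive_encoder(block, x, n_recursions):
    """Refine the token state by applying the same block n_recursions times."""
    for _ in range(n_recursions):
        for p in block:
            x = apply_layer(p, x)
    return x

def count_params(layers):
    return sum(w.size for p in layers for w in p.values())

rng = np.random.default_rng(0)
d, hidden, k, N = 32, 64, 3, 4
shared_block = [init_layer(rng, d, hidden) for _ in range(k)]     # stored once
deep_stack = [init_layer(rng, d, hidden) for _ in range(k * N)]   # distinct layers
tokens = rng.standard_normal((16, d))                             # 16 patch tokens
out = recursive_encoder(shared_block, tokens, N)
print(count_params(shared_block), count_params(deep_stack), out.shape)
```

In this toy setting the recursive encoder and the 12-layer stack perform the same number of layer applications, but the stack stores $N$ times as many parameters.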
LG · May 27, 2025
Localized Weather Prediction Using Kolmogorov-Arnold Network-Based Models and Deep RNNs
Ange-Clement Akazan, Verlon Roel Mbingui, Gnankan Landry Regis N'guessan et al.
Weather forecasting is crucial for managing risks and economic planning, particularly in tropical Africa, where extreme events severely impact livelihoods. Yet, existing forecasting methods often struggle with the region's complex, non-linear weather patterns. This study benchmarks deep recurrent neural networks ($\texttt{LSTM}$, $\texttt{GRU}$, $\texttt{BiLSTM}$, $\texttt{BiGRU}$) and Kolmogorov-Arnold-based models ($\texttt{KAN}$ and $\texttt{TKAN}$) for daily forecasting of temperature, precipitation, and pressure in two tropical cities: Abidjan (Côte d'Ivoire) and Kigali (Rwanda). We further introduce two customized variants of $\texttt{TKAN}$ that replace its original $\texttt{SiLU}$ activation function with $\texttt{GeLU}$ and $\texttt{MiSH}$, respectively. Using station-level meteorological data spanning 2010 to 2024, we evaluate all the models on standard regression metrics. $\texttt{KAN}$ achieves highly accurate temperature prediction ($R^2=0.9986$ in Abidjan, $0.9998$ in Kigali, $\texttt{MSE} < 0.0014~^\circ\mathrm{C}^2$), while $\texttt{TKAN}$ variants minimize absolute errors for precipitation forecasting in low-rainfall regimes. The customized $\texttt{TKAN}$ models demonstrate improvements over the standard $\texttt{TKAN}$ across both datasets. Classical $\texttt{RNNs}$ remain highly competitive for atmospheric pressure ($R^2 \approx 0.83{-}0.86$), outperforming $\texttt{KAN}$-based models in this task. These results highlight the potential of spline-based neural architectures for efficient and data-efficient forecasting.
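For reference, the three activation functions named above can be written with their textbook definitions, as in the NumPy sketch below. It is not code from the paper: the function names are illustrative, and the GELU uses the common tanh approximation rather than the exact erf form, which is an assumption about which variant is intended.

```python
import numpy as np

def silu(x):
    """SiLU (swish): x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def gelu_tanh(x):
    """GELU, tanh approximation: 0.5 x (1 + tanh(sqrt(2/pi) (x + 0.044715 x^3)))."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mish(x):
    """Mish: x * tanh(softplus(x)), with a numerically stable softplus."""
    return x * np.tanh(np.logaddexp(0.0, x))

x = np.array([-1.0, 0.0, 1.0])
for name, fn in [("SiLU", silu), ("GELU (tanh)", gelu_tanh), ("Mish", mish)]:
    print(name, fn(x))
```

All three are smooth, vanish at zero, and approach the identity for large positive inputs; they differ mainly in how strongly they damp small negative inputs.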
LG · Sep 27, 2025
Splines-Based Feature Importance in Kolmogorov-Arnold Networks: A Framework for Supervised Tabular Data Dimensionality Reduction
Ange-Clément Akazan, Verlon Roel Mbingui
High-dimensional datasets require effective feature selection to improve predictive performance, interpretability, and robustness. We propose and evaluate feature selection methods for tabular datasets based on Kolmogorov-Arnold networks (KANs), which parameterize feature transformations through splines, enabling direct access to interpretable importance measures. We introduce four KAN-based selectors ($\textit{KAN-L1}$, $\textit{KAN-L2}$, $\textit{KAN-SI}$, $\textit{KAN-KO}$) and compare them against classical baselines (LASSO, Random Forest, Mutual Information, SVM-RFE) across multiple tabular classification and regression benchmarks. F1 and $R^2$ scores, averaged over three retention levels (20%, 40%, and 60%), reveal that KAN-based selectors, particularly $\textit{KAN-L2}$, $\textit{KAN-L1}$, $\textit{KAN-SI}$, and $\textit{KAN-KO}$, are competitive with and sometimes superior to classical baselines on structured and synthetic datasets. However, $\textit{KAN-L1}$ is often too aggressive in regression, removing useful features, while $\textit{KAN-L2}$ underperforms in classification, where simple coefficient shrinkage misses complex feature interactions. $\textit{KAN-L2}$ and $\textit{KAN-SI}$ provide robust performance on noisy regression datasets and heterogeneous datasets, aligning closely with ensemble predictors. In classification tasks, KAN selectors such as $\textit{KAN-L1}$, $\textit{KAN-KO}$, and $\textit{KAN-SI}$ sometimes surpass the other selectors by eliminating redundancy, particularly in high-dimensional multi-class data. Overall, our findings demonstrate that KAN-based feature selection provides a powerful and interpretable alternative to traditional methods, capable of uncovering nonlinear and multivariate feature relevance beyond sparsity or impurity-based measures.
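One plausible reading of norm-based selection from spline coefficients is sketched below in NumPy. The abstract does not define the selectors, so everything here is an assumption: the coefficient layout `(n_features, n_units, n_basis)`, the use of plain L1 and L2 norms as importance scores, the rounding rule for the retention level, and all function names. $\textit{KAN-SI}$ and $\textit{KAN-KO}$ are not covered.

```python
import numpy as np

def spline_norm_importance(coefs, order=2):
    """Score each input feature by the L1 or L2 norm of its spline coefficients.

    coefs: hypothetical array of shape (n_features, n_units, n_basis) holding
    the first-layer spline coefficients of a trained KAN.
    """
    flat = np.abs(coefs).reshape(coefs.shape[0], -1)
    if order == 1:
        return flat.sum(axis=1)
    return np.sqrt((flat ** 2).sum(axis=1))

def select_top_features(scores, retention):
    """Indices of the highest-scoring features, keeping a `retention` fraction."""
    k = max(1, int(round(retention * len(scores))))
    return np.sort(np.argsort(scores)[::-1][:k])

# Synthetic coefficients (illustrative only): feature i has magnitude i + 1,
# so higher-indexed features should be ranked as more important.
scale = np.arange(1.0, 11.0)
coefs = np.ones((10, 4, 8)) * scale[:, None, None]
for retention in (0.2, 0.4, 0.6):
    kept = select_top_features(spline_norm_importance(coefs, order=1), retention)
    print(retention, kept)
```

A downstream model would then be retrained on the retained columns only, and its F1 or $R^2$ score compared across retention levels.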