Tan Viet Tuyen Nguyen

CV
h-index: 20
Papers: 4
Citations: 9
Novelty: 30%
AI Score: 39

4 Papers

CV, May 31
One Channel to Rule Them All: Rethinking Input Representation for Visual Place Recognition

Timur Ismagilov, Shakaiba Majeed, Michael Milford et al.

Visual Place Recognition (VPR) is fundamental to long-term robot localization and SLAM, yet current systems overwhelmingly rely on RGB input, implicitly assuming color is necessary for global place recognition. We challenge this assumption, investigating the role of chromatic information across training regimes, model architectures and standard benchmarks under real-world appearance variation. We find that grayscale generally matches RGB performance and outperforms it under severe appearance shifts where color invariance is insufficiently learned, while color provides meaningful gains only where persistent and discriminative chromatic cues are present. Across selected benchmarks, a fully gray-trained MixVPR model achieves an average 82.4% Recall@1 compared to 81.2% for its RGB counterpart. In some cases, lightweight grayscale variants with 60% fewer parameters can outperform heavier RGB models. Grayscale further offers practical advantages in storage, bandwidth and alignment with resource-constrained systems. We conclude that for global VPR where scenes vary across illumination, weather, season and setting, color contributes minimally, and grayscale alone is sufficient for reliable place recognition.
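As a rough illustration of the two ingredients named in this abstract, the sketch below converts an RGB image to a single luminance channel and computes Recall@1 over global descriptors. It is not the authors' pipeline: the function names, the use of ITU-R BT.601 luma weights, cosine-similarity retrieval, and the ground-truth format (one set of acceptable reference indices per query) are all assumptions made for the example.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB image to one (H, W) luminance channel.

    Assumption: ITU-R BT.601 luma weights; the paper may use another conversion.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return np.asarray(rgb, dtype=float) @ weights

def recall_at_1(query_desc, ref_desc, ground_truth):
    """Fraction of queries whose top-ranked reference is a correct match.

    query_desc: (Q, D) array, ref_desc: (R, D) array,
    ground_truth: list of Q sets of acceptable reference indices (hypothetical format).
    """
    # L2-normalise so that the dot product equals cosine similarity
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    r = ref_desc / np.linalg.norm(ref_desc, axis=1, keepdims=True)
    top1 = np.argmax(q @ r.T, axis=1)  # best reference index for each query
    hits = [int(top1[i]) in ground_truth[i] for i in range(len(top1))]
    return float(np.mean(hits))
```

Under these assumptions, comparing an RGB-trained and a gray-trained model comes down to calling `recall_at_1` on the descriptors each model produces for the same query and reference sets.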

MA, May 14
Decision-Level Fusion for Robust Wearable Affect Recognition

Lokesh Singh, Athina Georgara, Jayati Deshmukh et al.

Automatic recognition of affective state from wearable physiology has clear societal impact for public health, preventive care, and stress-aware interventions, but real deployments require robustness to non-stationary dynamics, artefacts, and missing sensors. We study this problem on WESAD, using baseline, stress, and amusement conditions, where common fixed-basis spectral features such as FFT bandpower and Welch PSD can oversmooth short-lived discriminative patterns. We propose a non-stationary pipeline that combines Fourier-Bessel Series Expansion (FBSE) with Empirical Wavelet Transform (EWT) data-driven spectral segmentation to extract mode-wise transient descriptors. For multimodal integration, we adopt decision-level aggregation over per-modality predictors and weight each modality by predictive uncertainty and modality reliability. Results on WESAD, using 15 subjects and ECG, EDA, BVP, EMG, and ACC signals across three classes, indicate that decision-level aggregation is at least as good as feature-level aggregation approximately 84 percent of the time, and strictly better approximately 48 percent of the time, suggesting improved robustness under heterogeneous and partially reliable sensing.
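The decision-level aggregation described here can be sketched as a weighted average of per-modality class probabilities. The abstract does not give the weighting formula, so the version below is only one plausible reading: each modality's weight is a hypothetical reliability score multiplied by one minus its normalised predictive entropy, and modalities with missing sensors are skipped. The function name and input format are assumptions.

```python
import numpy as np

def fuse_decisions(probs, reliability):
    """Fuse per-modality class probabilities at the decision level.

    probs: dict mapping modality name -> class-probability vector,
           or None when that sensor is missing.
    reliability: dict mapping modality name -> scalar in [0, 1]
                 (hypothetical, e.g. a validation accuracy).
    """
    fused, total = None, 0.0
    for name, p in probs.items():
        if p is None:  # missing sensor: contributes nothing
            continue
        p = np.asarray(p, dtype=float)
        # Predictive uncertainty as Shannon entropy, normalised to [0, 1]
        entropy = -np.sum(p * np.log(p + 1e-12))
        confidence = max(1.0 - entropy / np.log(len(p)), 0.0)
        w = reliability[name] * confidence
        fused = w * p if fused is None else fused + w * p
        total += w
    if fused is None or total == 0.0:
        raise ValueError("no usable modality predictions")
    return fused / total
```

With this weighting, a near-uniform (highly uncertain) prediction from one modality has almost no influence on the fused output, which is the behaviour the abstract attributes to uncertainty-aware aggregation.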

CV, Dec 10, 2024
On Motion Blur and Deblurring in Visual Place Recognition

Timur Ismagilov, Bruno Ferrarini, Michael Milford et al.

Visual Place Recognition (VPR) in mobile robotics enables robots to localize themselves by recognizing previously visited locations using visual data. While the reliability of VPR methods has been extensively studied under conditions such as changes in illumination, season, weather and viewpoint, the impact of motion blur is relatively unexplored despite its relevance not only in rapid motion scenarios but also in low-light conditions where longer exposure times are necessary. Similarly, the role of image deblurring in enhancing VPR performance under motion blur has received limited attention so far. This paper bridges these gaps by introducing a new benchmark designed to evaluate VPR performance under the influence of motion blur and image deblurring. The benchmark includes three datasets that encompass a wide range of motion blur intensities, providing a comprehensive platform for analysis. Experimental results with several well-established VPR and image deblurring methods provide new insights into the effects of motion blur and the potential improvements achieved through deblurring. Building on these findings, the paper proposes adaptive deblurring strategies for VPR, designed to effectively manage motion blur in dynamic, real-world scenarios.

CV, Oct 20, 2025
Joint Multi-Condition Representation Modelling via Matrix Factorisation for Visual Place Recognition

Timur Ismagilov, Shakaiba Majeed, Michael Milford et al.

We address multi-reference visual place recognition (VPR), where reference sets captured under varying conditions are used to improve localisation performance. While deep learning with large-scale training improves robustness, increasing data diversity and model complexity incur extensive computational cost during training and deployment. Descriptor-level fusion via voting or aggregation avoids training, but often targets multi-sensor setups or relies on heuristics with limited gains under appearance and viewpoint change. We propose a training-free, descriptor-agnostic approach that jointly models places using multiple reference descriptors via matrix decomposition into basis representations, enabling projection-based residual matching. We also introduce SotonMV, a structured benchmark for multi-viewpoint VPR. On multi-appearance data, our method improves Recall@1 by up to ~18% over single-reference and outperforms multi-reference baselines across appearance and viewpoint changes, with gains of ~5% on unstructured data, demonstrating strong generalisation while remaining lightweight.
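One way to picture projection-based residual matching is shown below. The abstract says only that multiple reference descriptors per place are decomposed into basis representations; this sketch assumes a truncated SVD as the decomposition and a Euclidean residual as the matching score, and the function names and `rank` parameter are hypothetical, so it should not be read as the paper's exact method.

```python
import numpy as np

def place_basis(ref_descs, rank):
    """Orthonormal basis for one place from its K reference descriptors.

    ref_descs: (K, D) array, one descriptor per condition.
    Assumption: truncated SVD; rank must not exceed min(K, D).
    """
    _, _, vt = np.linalg.svd(ref_descs, full_matrices=False)
    return vt[:rank]  # (rank, D), rows are orthonormal

def residual(query, basis):
    """Distance between a query descriptor and its projection onto the basis."""
    projection = basis.T @ (basis @ query)
    return float(np.linalg.norm(query - projection))

def match_place(query, bases):
    """Index of the place whose subspace leaves the smallest residual."""
    return int(np.argmin([residual(query, b) for b in bases]))
```

No training is involved: the bases are computed once from existing descriptors of any type, and a query is matched with a few small matrix products, which is consistent with the training-free, descriptor-agnostic framing in the abstract.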