Saimunur Rahman

4 papers

14 citations

Novelty 39%

AI Score 23

Ranked #178,973 of 201,326 authors (top 89%); #54,184 in CV (top 92%)

4 Papers

CV · Apr 23, 2023

Learning Partial Correlation based Deep Visual Representation for Image Classification

Saimunur Rahman, Piotr Koniusz, Lei Wang et al.

Visual representation based on the covariance matrix has demonstrated its efficacy for image classification by characterising the pairwise correlation of different channels in convolutional feature maps. However, pairwise correlation becomes misleading once another channel correlates with both channels of interest, resulting in the "confounding" effect. In this case, "partial correlation", which removes the confounding effect, should be estimated instead. Nevertheless, reliably estimating partial correlation requires solving a symmetric positive definite matrix optimisation, known as sparse inverse covariance estimation (SICE). How to incorporate this process into a CNN remains an open issue. In this work, we formulate SICE as a novel structured layer of a CNN. To ensure end-to-end trainability, we develop an iterative method to solve the above matrix optimisation during the forward and backward propagation steps. Our work obtains a partial correlation based deep visual representation and mitigates the small sample problem often encountered by covariance matrix estimation in CNNs. Computationally, our model can be trained effectively on GPUs and works well with the large number of channels of advanced CNNs. Experiments show the efficacy and superior classification performance of our deep visual representation compared to covariance matrix based counterparts.
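The confounding effect described in this abstract can be illustrated with the textbook first-order partial correlation formula for three variables. This is a minimal sketch, not the paper's SICE layer: the toy data and the function names `pearson` and `partial_corr` are assumptions made for illustration only.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy data: z drives both x and y; the noise terms are
# chosen to be uncorrelated with z and with each other.
z = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [1, -1, -1, 1, 1, -1, -1, 1]
noise_y = [1, 1, -1, -1, -1, -1, 1, 1]
x = [a + b for a, b in zip(z, noise_x)]
y = [a + b for a, b in zip(z, noise_y)]

print(pearson(x, y))          # high (about 0.84): x and y look related
print(partial_corr(x, y, z))  # about 0: the relation is explained by z
```

Here the pairwise correlation between `x` and `y` is large only because both depend on `z`; conditioning on `z` removes it. With many channels, partial correlations are usually read off an estimated inverse covariance matrix instead, which is the quantity SICE estimates.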

CV · Sep 24, 2024

Point-PNG: Conditional Pseudo-Negatives Generation for Point Cloud Pre-Training

Sutharsan Mahendren, Saimunur Rahman, Piotr Koniusz et al.

We propose Point-PNG, a novel self-supervised learning framework that generates conditional pseudo-negatives in the latent space to learn point cloud representations that are both discriminative and transformation-sensitive. Conventional self-supervised learning methods focus on achieving invariance, discarding transformation-specific information. Recent approaches incorporate transformation sensitivity by explicitly modeling relationships between original and transformed inputs. However, they often suffer from an invariant-collapse phenomenon, where the predictor degenerates into an identity mapping, resulting in latent representations with limited variation across transformations. To address this, Point-PNG explicitly penalizes invariant collapse through pseudo-negatives generation, enabling the network to capture richer transformation cues while preserving discriminative representations. To this end, we introduce a parametric network, COnditional Pseudo-Negatives Embedding (COPE), which learns localized displacements induced by transformations within the latent space. A key challenge arises when jointly training COPE with the MAE, as it tends to converge to trivial identity mappings. To overcome this, we design a loss function based on pseudo-negatives conditioned on the transformation, which penalizes such trivial invariant solutions and enforces meaningful representation learning. We validate Point-PNG on shape classification and relative pose estimation tasks, showing competitive performance on ModelNet40 and ScanObjectNN under challenging evaluation protocols, and achieving superior accuracy in relative pose estimation compared to supervised baselines.

CV · Sep 24, 2024

A Deeper Look into Second-Order Feature Aggregation for LiDAR Place Recognition

Saimunur Rahman, Peyman Moghadam

Efficient LiDAR Place Recognition (LPR) compresses dense pointwise features into compact global descriptors. While first-order aggregators such as GeM and NetVLAD are widely used, they overlook inter-feature correlations that second-order aggregation naturally captures. Full covariance, a common second-order aggregator, is high in dimensionality; as a result, practitioners often insert a learned projection or employ random sketches, which either sacrifice information or increase parameter count. However, no prior work has systematically investigated how first- and second-order aggregation perform under constrained feature and compute budgets. In this paper, we first demonstrate that second-order aggregation retains its superiority for LPR even when channels are pruned and backbone parameters are reduced. Building on this insight, we propose Channel Partition-based Second-order Local Feature Aggregation (CPS): a drop-in, partition-based second-order aggregation module that preserves all channels while producing an order-of-magnitude smaller descriptor. CPS matches or exceeds the performance of full covariance and outperforms random projection variants, delivering new state-of-the-art results with only four additional learnable parameters across four large-scale benchmarks: Oxford RobotCar, In-house, MulRan, and WildPlaces.
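To give a feel for why full covariance is "high in dimensionality" and how partitioning channels can shrink the descriptor, the arithmetic can be sketched as follows. This is not the CPS module: the abstract does not specify how partitions are formed or combined, so the sketch assumes equal-size channel groups with one covariance block per group, and the channel and group counts are hypothetical.

```python
def full_cov_dim(channels):
    """Number of unique entries in a symmetric channels x channels covariance."""
    return channels * (channels + 1) // 2

def partitioned_cov_dim(channels, groups):
    """Descriptor size if channels are split into equal groups and one
    covariance block is kept per group (an assumed scheme, not CPS itself)."""
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    return groups * full_cov_dim(channels // groups)

# Hypothetical setting: 256 feature channels split into 16 groups of 16.
full = full_cov_dim(256)                    # 32896 values
partitioned = partitioned_cov_dim(256, 16)  # 2176 values
print(full, partitioned, full / partitioned)
```

Under these assumptions every channel still contributes to the descriptor, while cross-group correlations are dropped, which is where the reduction of roughly 15x comes from.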

CV · Nov 20, 2019

Deep Learning based HEp-2 Image Classification: A Comprehensive Review

Saimunur Rahman, Lei Wang, Changming Sun et al.

Classification of HEp-2 cell patterns plays a significant role in the indirect immunofluorescence test for identifying autoimmune diseases in the human body. Many automatic HEp-2 cell classification methods have been proposed in recent years, amongst which deep learning based methods have shown impressive performance. This paper provides a comprehensive review of the existing deep learning based HEp-2 cell image classification methods. These methods perform HEp-2 image classification at two levels, namely, cell-level and specimen-level. Both levels are covered in this review. At each level, the methods are organized with a taxonomy based on deep network usage. The core idea, notable achievements, and key strengths and weaknesses of each method are critically analyzed. Furthermore, a concise review of the existing HEp-2 datasets that are commonly used in the literature is given. The paper ends with a discussion on novel opportunities and future research directions in this field. It is hoped that this paper will provide readers with a thorough reference for this novel, challenging, and thriving field.