M. M. Manohara Pai

h-index27

3papers

56citations

Novelty42%

AI Score29

Ranked #145,257 of 194,257 authors (top 75%)#47,616 in CV (top 81%)

3 Papers

1.4CVAug 13, 2022

Enhanced Vehicle Re-identification for ITS: A Feature Fusion approach using Deep Learning

Ashutosh Holla B, Manohara Pai M. M, Ujjwal Verma et al.

In recent years, the development of robust Intelligent transportation systems (ITS) is tackled across the globe to provide better traffic efficiency by reducing frequent traffic problems. As an application of ITS, vehicle re-identification has gained ample interest in the domain of computer vision and robotics. Convolutional neural network (CNN) based methods are developed to perform vehicle re-identification to address key challenges such as occlusion, illumination change, scale, etc. The advancement of transformers in computer vision has opened an opportunity to explore the re-identification process further to enhance performance. In this paper, a framework is developed to perform the re-identification of vehicles across CCTV cameras. To perform re-identification, the proposed framework fuses the vehicle representation learned using a CNN and a transformer model. The framework is tested on a dataset that contains 81 unique vehicle identities observed across 20 CCTV cameras. From the experiments, the fused vehicle re-identification framework yields an mAP of 61.73% which is significantly better when compared with the standalone CNN or transformer model.

2.6CVMar 29, 2022Code

Contextual Information Based Anomaly Detection for a Multi-Scene UAV Aerial Videos

Girisha S, Ujjwal Verma, Manohara Pai M M et al.

UAV based surveillance is gaining much interest worldwide due to its extensive applications in monitoring wildlife, urban planning, disaster management, campus security, etc. These videos are analyzed for strange/odd/anomalous patterns which are essential aspects of surveillance. But manual analysis of these videos is tedious and laborious. Hence, the development of computer-aided systems for the analysis of UAV based surveillance videos is crucial. Despite this interest, in literature, several computer aided systems are developed focusing only on CCTV based surveillance videos. These methods are designed for single scene scenarios and lack contextual knowledge which is required for multi-scene scenarios. Furthermore, the lack of standard UAV based anomaly detection datasets limits the development of these systems. In this regard, the present work aims at the development of a Computer Aided Decision support system to analyse UAV based surveillance videos. A new UAV based multi-scene anomaly detection dataset is developed with frame-level annotations for the development of computer aided systems. It holistically uses contextual, temporal and appearance features for accurate detection of anomalies. Furthermore, a new inference strategy is proposed that utilizes few anomalous samples along with normal samples to identify better decision boundaries. The proposed method is extensively evaluated on the UAV based anomaly detection dataset and performed competitively with respect to state-of-the-art methods.

3.3CVNov 29, 2020Code

UVid-Net: Enhanced Semantic Segmentation of UAV Aerial Videos by Embedding Temporal Information

Girisha S, Ujjwal Verma, Manohara Pai M M et al.

Semantic segmentation of aerial videos has been extensively used for decision making in monitoring environmental changes, urban planning, and disaster management. The reliability of these decision support systems is dependent on the accuracy of the video semantic segmentation algorithms. The existing CNN based video semantic segmentation methods have enhanced the image semantic segmentation methods by incorporating an additional module such as LSTM or optical flow for computing temporal dynamics of the video which is a computational overhead. The proposed research work modifies the CNN architecture by incorporating temporal information to improve the efficiency of video semantic segmentation. In this work, an enhanced encoder-decoder based CNN architecture (UVid-Net) is proposed for UAV video semantic segmentation. The encoder of the proposed architecture embeds temporal information for temporally consistent labelling. The decoder is enhanced by introducing the feature-refiner module, which aids in accurate localization of the class labels. The proposed UVid-Net architecture for UAV video semantic segmentation is quantitatively evaluated on extended ManipalUAVid dataset. The performance metric mIoU of 0.79 has been observed which is significantly greater than the other state-of-the-art algorithms. Further, the proposed work produced promising results even for the pre-trained model of UVid-Net on urban street scene with fine tuning the final layer on UAV aerial videos.