CVJan 17, 2020
Unsupervised Learning of Camera Pose with Compositional Re-estimationSeyed Shahabeddin Nabavi, Mehrdad Hosseinzadeh, Ramin Fahimi et al.
We consider the problem of unsupervised camera pose estimation. Given an input video sequence, our goal is to estimate the camera pose (i.e. the camera motion) between consecutive frames. Traditionally, this problem is tackled by placing strict constraints on the transformation vector or by incorporating optical flow through a complex pipeline. We propose an alternative approach that utilizes a compositional re-estimation process for camera pose estimation. Given an input, we first estimate a depth map. Our method then iteratively estimates the camera motion based on the estimated depth map. Our approach significantly improves the predicted camera motion both quantitatively and visually. Furthermore, the re-estimation resolves the problem of out-of-boundaries pixels in a novel and simple way. Another advantage of our approach is that it is adaptable to other camera pose estimation approaches. Experimental analysis on KITTI benchmark dataset demonstrates that our method outperforms existing state-of-the-art approaches in unsupervised camera ego-motion estimation.
CVSep 5, 2019
Future Frame Prediction Using Convolutional VRNN for Anomaly DetectionYiwei Lu, Mahesh Kumar Krishna Reddy, Seyed shahabeddin Nabavi et al.
Anomaly detection in videos aims at reporting anything that does not conform the normal behaviour or distribution. However, due to the sparsity of abnormal video clips in real life, collecting annotated data for supervised learning is exceptionally cumbersome. Inspired by the practicability of generative models for semi-supervised learning, we propose a novel sequential generative model based on variational autoencoder (VAE) for future frame prediction with convolutional LSTM (ConvLSTM). To the best of our knowledge, this is the first work that considers temporal information in future frame prediction based anomaly detection framework from the model perspective. Our experiments demonstrate that our approach is superior to the state-of-the-art methods on three benchmark datasets.
CVJul 20, 2018
Future Semantic Segmentation with Convolutional LSTMSeyed shahabeddin Nabavi, Mrigank Rochan, Yang et al.
We consider the problem of predicting semantic segmentation of future frames in a video. Given several observed frames in a video, our goal is to predict the semantic segmentation map of future frames that are not yet observed. A reliable solution to this problem is useful in many applications that require real-time decision making, such as autonomous driving. We propose a novel model that uses convolutional LSTM (ConvLSTM) to encode the spatiotemporal information of observed frames for future prediction. We also extend our model to use bidirectional ConvLSTM to capture temporal information in both directions. Our proposed approach outperforms other state-of-the-art methods on the benchmark dataset.