Rashmi Bhaskara

RO, Mar 8, 2023

SG-LSTM: Social Group LSTM for Robot Navigation Through Dense Crowds

Rashmi Bhaskara, Maurice Chiu, Aniket Bera

As personal robots become more available and affordable, they will no longer be confined to large corporate warehouses or factories and will instead be expected to operate in less controlled environments alongside larger groups of people. In addition to ensuring safety and efficiency, it is crucial to minimize any negative psychological impact robots may have on humans and to follow unwritten social norms in these situations. Our research aims to develop a model that can predict the movements of pedestrians and perceptually-social groups in crowded environments. We introduce a new Social Group Long Short-Term Memory (SG-LSTM) model that models human groups and interactions in dense environments using a socially-aware LSTM to produce more accurate trajectory predictions. Our approach enables navigation algorithms to calculate collision-free paths faster and more accurately in crowded environments. We also release a large video dataset with labeled pedestrian groups for the broader social navigation community. We compare against several prediction approaches (LIN, LSTM, O-LSTM, S-LSTM) on several datasets (ETH, Hotel, MOT15) using multiple metrics, and we report runtime performance.
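For readers unfamiliar with the simplest baseline in that comparison, the sketch below shows one common reading of a linear (LIN) predictor: constant-velocity extrapolation from the last two observed positions. The function names, the choice of average displacement error as the metric, and the sample track are illustrative assumptions; the abstract does not specify how LIN is implemented or which metrics are reported.

```python
import math

def linear_predict(observed, horizon):
    """Extrapolate a 2D track at constant velocity (assumed LIN-style baseline).

    observed: list of (x, y) positions at uniform time steps, at least two.
    horizon:  number of future steps to predict.
    """
    if len(observed) < 2:
        raise ValueError("need at least two observed positions")
    (x0, y0), (x1, y1) = observed[-2], observed[-1]
    vx, vy = x1 - x0, y1 - y0  # displacement per time step
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]

def average_displacement_error(predicted, truth):
    """Mean Euclidean distance between predicted and true positions.

    This metric is an assumption for illustration, not taken from the abstract.
    """
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("tracks must be non-empty and the same length")
    total = sum(math.hypot(px - tx, py - ty)
                for (px, py), (tx, ty) in zip(predicted, truth))
    return total / len(predicted)

# Hypothetical pedestrian track: two observed steps, three predicted steps.
track = [(0.0, 0.0), (1.0, 0.5)]
future = linear_predict(track, horizon=3)
print(future)
```

A baseline like this ignores neighbors and group membership entirely, which is the gap that socially-aware models such as S-LSTM and SG-LSTM aim to close.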

CV, Nov 19, 2022

AdaFNIO: Adaptive Fourier Neural Interpolation Operator for video frame interpolation

Hrishikesh Viswanath, Md Ashiqur Rahman, Rashmi Bhaskara et al.

We present AdaFNIO (Adaptive Fourier Neural Interpolation Operator), a neural operator-based architecture for video frame interpolation. Current deep learning-based methods rely on local convolutions for feature learning and are not scale-invariant, so their training data must be augmented through random flipping and re-scaling. AdaFNIO, by contrast, learns the features in the frames independently of input resolution, through token mixing and global convolution in Fourier space (the spectral domain) using the Fast Fourier Transform (FFT). We show that AdaFNIO can produce visually smooth and accurate results. To evaluate the visual quality of our interpolated frames, we calculate the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) between the generated frame and the ground truth frame. We report the quantitative performance of our model on the Vimeo-90K, DAVIS, UCF101, and DISFA+ datasets.
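As a point of reference for the PSNR metric mentioned above, here is a minimal sketch using the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name, the flat-list frame representation, the 8-bit peak value of 255, and the sample pixel values are assumptions for illustration; this is not the authors' evaluation code.

```python
import math

def psnr(generated, ground_truth, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities; max_value is the peak
    intensity (255 for 8-bit images, assumed here).
    """
    if len(generated) != len(ground_truth) or len(generated) == 0:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((g - t) ** 2 for g, t in zip(generated, ground_truth)) / len(generated)
    if mse == 0:
        return math.inf  # identical frames: no noise
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 4-pixel frames for illustration only.
ground_truth = [0, 64, 128, 255]
interpolated = [0, 66, 126, 255]
print(psnr(interpolated, ground_truth))
```

Higher PSNR means the interpolated frame is closer to the ground truth pixel by pixel; SSIM complements it by comparing local structure rather than raw error.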
