Ravi S. Hegde

h-index22

4papers

81citations

Novelty29%

AI Score34

Ranked #112,921 of 194,257 authors (top 58%)#37,713 in CV (top 64%)

4 Papers

1.5CVAug 9, 2023

Towards AI enabled automated tracking of multiple boxers

A. S. Karthikeyan, Vipul Baghel, Anish Monsley Kirupakaran et al.

Continuous tracking of boxers across multiple training sessions helps quantify traits required for the well-known ten-point-must system. However, continuous tracking of multiple athletes across multiple training sessions remains a challenge, because it is difficult to precisely segment bout boundaries in a recorded video stream. Furthermore, re-identification of the same athlete over different period or even within the same bout remains a challenge. Difficulties are further compounded when a single fixed view video is captured in top-view. This work summarizes our progress in creating a system in an economically single fixed top-view camera. Specifically, we describe improved algorithm for bout transition detection and in-bout continuous player identification without erroneous ID updation or ID switching. From our custom collected data of ~11 hours (athlete count: 45, bouts: 189), our transition detection algorithm achieves 90% accuracy and continuous ID tracking achieves IDU=0, IDS=0.

0.9CVJun 21, 2017Code

GPGPU Acceleration of the KAZE Image Feature Extraction Algorithm

Ramkumar B, R. S. Hegde, Rob Laber et al.

The recently proposed open-source KAZE image feature detection and description algorithm offers unprecedented performance in comparison to conventional ones like SIFT and SURF as it relies on nonlinear scale spaces instead of Gaussian linear scale spaces. The improved performance, however, comes with a significant computational cost limiting its use for many applications. We report a GPGPU implementation of the KAZE algorithm without resorting to binary descriptors for gaining speedup. For a 1920 by 1200 sized image our Compute Unified Device Architecture (CUDA) C based GPU version took around 300 milliseconds on a NVIDIA GeForce GTX Titan X (Maxwell Architecture-GM200) card in comparison to nearly 2400 milliseconds for a multithreaded CPU version (16 threaded Intel(R) Xeon(R) CPU E5-2650 processsor). The CUDA based parallel implementation is described in detail with fine-grained comparison between the GPU and CPU implementations. By achieving nearly 8 fold speedup without performance degradation our work expands the applicability of the KAZE algorithm. Additionally, the strategies described here can prove useful for the GPU implementation of other nonlinear scale space based methods.

6.2CVNov 20, 2025

BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization

Rahul Kumar, Vipul Baghel, Sudhanshu Singh et al.

Accurate analysis of combat sports using computer vision has gained traction in recent years, yet the development of robust datasets remains a major bottleneck due to the dynamic, unstructured nature of actions and variations in recording environments. In this work, we present a comprehensive, well-annotated video dataset tailored for punch detection and classification in boxing. The dataset comprises 6,915 high-quality punch clips categorized into six distinct punch types, extracted from 20 publicly available YouTube sparring sessions and involving 18 different athletes. Each clip is manually segmented and labeled to ensure precise temporal boundaries and class consistency, capturing a wide range of motion styles, camera angles, and athlete physiques. This dataset is specifically curated to support research in real-time vision-based action recognition, especially in low-resource and unconstrained environments. By providing a rich benchmark with diverse punch examples, this contribution aims to accelerate progress in movement analysis, automated coaching, and performance assessment within boxing and related domains.

9.3CVJun 21, 2017

Object Detection Using Deep CNNs Trained on Synthetic Images

Param S. Rajpura, Hristo Bojinov, Ravi S. Hegde

The need for large annotated image datasets for training Convolutional Neural Networks (CNNs) has been a significant impediment for their adoption in computer vision applications. We show that with transfer learning an effective object detector can be trained almost entirely on synthetically rendered datasets. We apply this strategy for detecting pack- aged food products clustered in refrigerator scenes. Our CNN trained only with 4000 synthetic images achieves mean average precision (mAP) of 24 on a test set with 55 distinct products as objects of interest and 17 distractor objects. A further increase of 12% in the mAP is obtained by adding only 400 real images to these 4000 synthetic images in the training set. A high degree of photorealism in the synthetic images was not essential in achieving this performance. We analyze factors like training data set size and 3D model dictionary size for their influence on detection performance. Additionally, training strategies like fine-tuning with selected layers and early stopping which affect transfer learning from synthetic scenes to real scenes are explored. Training CNNs with synthetic datasets is a novel application of high-performance computing and a promising approach for object detection applications in domains where there is a dearth of large annotated image data.