CV · Jun 6, 2023
Empir3D : A Framework for Multi-Dimensional Point Cloud Assessment
Yash Turkar, Pranay Meshram, Christo Aluckal et al.
Advancements in sensors, algorithms, and compute hardware have made 3D perception feasible in real time. Current methods to compare and evaluate the quality of a 3D model, such as Chamfer, Hausdorff, and Earth-Mover's distance, are uni-dimensional and have limitations: they cannot capture coverage or local variations in density and error, and they are sensitive to outliers. In this paper, we propose an evaluation framework for point clouds (Empir3D) that consists of four metrics: resolution to quantify the ability to distinguish between individual parts in the point cloud, accuracy to measure registration error, coverage to evaluate the portion of missing data, and artifact score to characterize the presence of artifacts. Through detailed analysis, we demonstrate the complementary nature of each of these dimensions and the improvements they provide compared to the aforementioned uni-dimensional measures. Furthermore, we illustrate the utility of Empir3D by comparing our metrics with uni-dimensional metrics for two 3D perception applications (SLAM and point cloud completion). We believe that Empir3D advances our ability to reason about point clouds and helps better debug 3D perception applications by providing a richer evaluation of their performance. Our implementation of Empir3D, custom real-world datasets, evaluations on learning methods, and detailed documentation on how to integrate the pipeline will be made available upon publication.
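To make the limitation of uni-dimensional measures concrete, here is a minimal pure-Python sketch of the Chamfer and Hausdorff distances on toy data. It is not the Empir3D implementation; the function names, the toy clouds, and the choice of the unsquared, mean-based Chamfer variant are assumptions made for illustration.

```python
import math

def nearest_dist(p, cloud):
    # Brute-force distance from point p to its nearest neighbour in cloud.
    return min(math.dist(p, q) for q in cloud)

def chamfer(a, b):
    # Symmetric Chamfer distance (assumed variant: mean of unsquared
    # nearest-neighbour distances, summed over both directions).
    a_to_b = sum(nearest_dist(p, b) for p in a) / len(a)
    b_to_a = sum(nearest_dist(q, a) for q in b) / len(b)
    return a_to_b + b_to_a

def hausdorff(a, b):
    # Symmetric Hausdorff distance: worst-case nearest-neighbour distance.
    return max(
        max(nearest_dist(p, b) for p in a),
        max(nearest_dist(q, a) for q in b),
    )

# Hypothetical toy data: ten reference points on a line.
reference = [(float(x), 0.0, 0.0) for x in range(10)]
with_outlier = reference + [(0.0, 0.0, 50.0)]  # complete, plus one stray point
half_missing = reference[:5]                   # accurate, but half the data is gone

for name, cloud in [("outlier", with_outlier), ("half missing", half_missing)]:
    print(name, chamfer(reference, cloud), hausdorff(reference, cloud))
```

On this toy data both distances rate the cloud with a single stray point as worse than the cloud missing half of the reference, and neither number says whether the problem is an artifact or missing coverage.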
CV · Nov 21, 2025
QAL: A Loss for Recall Precision Balance in 3D Reconstruction
Pranay Meshram, Yash Turkar, Kartikeya Singh et al.
Volumetric learning underpins many 3D vision tasks such as completion, reconstruction, and mesh generation, yet training objectives still rely on Chamfer Distance (CD) or Earth Mover's Distance (EMD), which fail to balance recall and precision. We propose Quality-Aware Loss (QAL), a drop-in replacement for CD/EMD that combines a coverage-weighted nearest-neighbor term with an uncovered-ground-truth attraction term, explicitly decoupling recall and precision into tunable components. Across diverse pipelines, QAL achieves consistent coverage gains, averaging +4.3 points over CD and +2.8 points over the best alternatives. Though modest in percentage terms, these improvements reliably recover thin structures and under-represented regions that CD/EMD overlook. Extensive ablations confirm stable performance across hyperparameters and output resolutions, while full retraining on PCN and ShapeNet demonstrates generalization across datasets and backbones. Moreover, QAL-trained completions yield higher grasp scores under GraspNet evaluation, showing that improved coverage translates directly into more reliable robotic manipulation. QAL thus offers a principled, interpretable, and practical objective for robust 3D vision and safety-critical robotics pipelines.
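The recall and precision the abstract refers to can be illustrated with a simple threshold-based measure on point clouds. The sketch below is not the QAL loss, whose exact form the abstract does not give; the function names, the threshold `tau`, and the toy data are assumptions made for illustration.

```python
import math

def fraction_within(src, dst, tau):
    # Fraction of points in src whose nearest neighbour in dst lies within tau.
    hits = sum(1 for p in src if min(math.dist(p, q) for q in dst) <= tau)
    return hits / len(src)

def precision_recall(pred, gt, tau):
    # Precision: how much of the prediction lies near the ground truth.
    # Recall (coverage): how much of the ground truth the prediction reaches.
    precision = fraction_within(pred, gt, tau)
    recall = fraction_within(gt, pred, tau)
    return precision, recall

# Hypothetical toy data: ten ground-truth points on a line.
gt = [(float(x), 0.0, 0.0) for x in range(10)]
incomplete = gt[:5]                   # accurate but covers only half of gt
noisy = gt + [(0.0, 0.0, 50.0)]       # full coverage plus one stray point

print(precision_recall(incomplete, gt, tau=0.5))  # perfect precision, low recall
print(precision_recall(noisy, gt, tau=0.5))       # perfect recall, reduced precision
```

Reporting the two fractions separately distinguishes a prediction that misses regions from one that adds spurious points, which a single averaged distance does not.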
CV · May 22, 2025
Detailed Evaluation of Modern Machine Learning Approaches for Optic Plastics Sorting
Vaishali Maheshkar, Aadarsh Anantha Ramakrishnan, Charuvahan Adhivarahan et al.
According to the EPA, only 25% of waste is recycled, and just 60% of U.S. municipalities offer curbside recycling. Plastics fare worse, with a recycling rate of only 8%; an additional 16% is incinerated, while the remaining 76% ends up in landfills. The low plastic recycling rate stems from contamination, poor economic incentives, and technical difficulties, making efficient recycling a challenge. To improve recovery, automated sorting plays a critical role. Companies like AMP Robotics and Greyparrot utilize optical systems for sorting, while Materials Recovery Facilities (MRFs) employ Near-Infrared (NIR) sensors to detect plastic types. Modern optical sorting uses advances in computer vision such as object recognition and instance segmentation, powered by machine learning. Two-stage detectors like Mask R-CNN use region proposals and classification with deep backbones like ResNet. Single-stage detectors like YOLO handle detection in one pass, trading some accuracy for speed. While such methods excel under ideal conditions with a large volume of labeled training data, challenges arise in realistic scenarios, emphasizing the need to further examine the efficacy of optical detection for automated sorting. In this study, we compiled novel datasets totaling 20,000+ images from varied sources. Using both public and custom machine learning pipelines, we assessed the capabilities and limitations of optical recognition for sorting. Grad-CAM, saliency maps, and confusion matrices were employed to interpret the behavior of models that we custom-trained on the compiled datasets. We find that optical recognition methods have limited success in accurately sorting real-world plastics at MRFs, primarily because they rely on physical properties such as color and shape.
RO · Mar 15, 2019
Augmenting Visual SLAM with Wi-Fi Sensing For Indoor Applications
Zakieh S. Hashemifar, Charuvahan Adhivarahan, Anand Balakrishnan et al.
Recent trends have accelerated the development of spatial applications on mobile devices and robots. These include navigation, augmented reality, human-robot interaction, and others. A key enabling technology for such applications is the understanding of the device's location and the map of the surrounding environment. This generic problem, referred to as Simultaneous Localization and Mapping (SLAM), is an extensively researched topic in robotics. However, visual SLAM algorithms face several challenges including perceptual aliasing and high computational cost. These challenges affect the accuracy, efficiency, and viability of visual SLAM algorithms, especially for long-term SLAM, and their use in resource-constrained mobile devices. A parallel trend is the ubiquity of Wi-Fi routers for quick Internet access in most urban environments. Most robots and mobile devices are equipped with a Wi-Fi radio as well. We propose a method to utilize Wi-Fi received signal strength to alleviate the challenges faced by visual SLAM algorithms. To demonstrate the utility of this idea, this work makes the following contributions: (i) we propose a generic way to integrate Wi-Fi sensing into visual SLAM algorithms, (ii) we integrate such sensing into three well-known SLAM algorithms, (iii) using four distinct datasets, we demonstrate the performance of such augmentation in comparison to the original visual algorithms, and (iv) we compare our work to the Wi-Fi-augmented FABMAP algorithm. Overall, we show that our approach can improve the accuracy of visual SLAM algorithms by 11% on average and reduce computation time by 15% to 25% on average.