CV · Mar 26, 2023
VisDA 2022 Challenge: Domain Adaptation for Industrial Waste Sorting
Dina Bashkirova, Samarth Mishra, Diala Lteif et al.
Label-efficient and reliable semantic segmentation is essential for many real-life applications, especially in industrial settings with high visual diversity, such as waste sorting. In industrial waste sorting, one of the biggest challenges is the extreme diversity of the input stream, which depends on factors like the location of the sorting facility, the equipment available in the facility, and the time of year, all of which significantly impact the composition and visual appearance of the waste stream. These changes in the data are called "visual domains", and label-efficient adaptation of models to such domains is needed for successful semantic segmentation of industrial waste. To test the abilities of computer vision models on this task, we present the VisDA 2022 Challenge on Domain Adaptation for Industrial Waste Sorting. Our challenge incorporates a fully annotated waste sorting dataset, ZeroWaste, collected from two real material recovery facilities in different locations and seasons, as well as a novel procedurally generated synthetic waste sorting dataset, SynthWaste. In this competition, we aim to answer two questions: 1) can we leverage domain adaptation techniques to minimize the domain gap? and 2) can synthetic data augmentation improve performance on this task and help adapt to changing data distributions? The results of the competition show that industrial waste detection poses a real domain adaptation problem, that domain generalization techniques such as augmentation and ensembling improve the overall performance on the unlabeled target domain examples, and that leveraging synthetic data effectively remains an open problem. See https://ai.bu.edu/visda-2022/
RO · Apr 14
Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators
Sreejani Chatterjee, Venkatesh Mullur, Abhinav Gandhi et al.
In this paper we present a novel visual servoing framework that controls a robotic manipulator in configuration space using purely natural visual features. Our goal is to develop methods that robustly detect and track natural features, or keypoints, on robotic manipulators for vision-based control, especially in scenarios where placing external markers on the robot at runtime is not feasible or preferred. To train our data-driven approach, we create a data collection pipeline in which we attach ArUco markers along the robot's body, label their centers as keypoints, and then use an inpainting method to remove the markers and reconstruct the occluded regions. This yields natural (markerless) robot images that are automatically labeled with the marker locations. These images are used to train a keypoint detection algorithm, which is then used to control the robot configuration using the robot's natural features. Unlike prior methods that rely on accurate camera calibration and robot models for labeling training images, our approach eliminates these dependencies through inpainting. To achieve robust keypoint detection even in the presence of occlusion, we introduce a second inpainting model, used at runtime, that reconstructs occluded regions of the robot in real time, enabling continuous keypoint detection. To further enhance the consistency and robustness of keypoint predictions, we integrate an Unscented Kalman Filter (UKF) that refines the keypoint estimates over time, contributing to stable and reliable control performance. We obtained successful control results with this model-free, purely vision-based control strategy, which uses natural robot features at runtime, under both full visibility and partial occlusion.
RO · May 1, 2021 · Code
ECNNs: Ensemble Learning Methods for Improving Planar Grasp Quality Estimation
Fadi Alladkani, James Akl, Berk Calli
We present an ensemble learning methodology that combines multiple existing robotic grasp synthesis algorithms and obtains a success rate significantly better than that of the individual algorithms. The methodology treats the grasping algorithms as "experts" providing grasp "opinions". An Ensemble Convolutional Neural Network (ECNN) is trained using a Mixture of Experts (MOE) model that integrates these opinions and determines the final grasping decision. The ECNN introduces minimal computational overhead, and the network can run virtually as fast as the slowest expert. We test this architecture on open-source algorithms from the literature, adopting GQCNN 4.0, GGCNN and a custom variation of GGCNN as experts, and obtain a 6% increase in grasp success on the Cornell Dataset compared to the best-performing individual algorithm. The performance of the method is also demonstrated using a Franka Emika Panda arm.
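As a rough illustration of how expert "opinions" might be blended into one decision, the sketch below uses a generic softmax-gated weighted sum. It is not the ECNN from the paper: the function names, the input format (one score list per expert, one score per candidate grasp), and the fixed gate values are all assumptions made for this example. In a trained mixture of experts the gate values would come from a learned network.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_expert_opinions(expert_scores, gate_logits):
    """Blend per-expert grasp-quality scores using gating weights.

    expert_scores: one list per expert, each holding a score for every
                   candidate grasp (hypothetical input format).
    gate_logits:   one unnormalised gate value per expert (assumed given
                   here; a learned gating network would produce them).
    Returns (index of the best candidate, list of blended scores).
    """
    weights = softmax(gate_logits)
    n_candidates = len(expert_scores[0])
    blended = [
        sum(w * scores[i] for w, scores in zip(weights, expert_scores))
        for i in range(n_candidates)
    ]
    best = max(range(n_candidates), key=blended.__getitem__)
    return best, blended


# Three hypothetical experts scoring three candidate grasps, equal gates.
scores = [
    [0.9, 0.2, 0.4],
    [0.1, 0.8, 0.5],
    [0.3, 0.7, 0.6],
]
best_index, blended_scores = combine_expert_opinions(scores, [0.0, 0.0, 0.0])
```

With equal gate values the blend reduces to a plain average of the experts; raising one gate value shifts the decision toward that expert's preferred grasp.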
RO · Apr 23, 2021 · Code
OCRTOC: A Cloud-Based Competition and Benchmark for Robotic Grasping and Manipulation
Ziyuan Liu, Wei Liu, Yuzhe Qin et al.
In this paper, we propose a cloud-based benchmark for robotic grasping and manipulation, called the OCRTOC benchmark. The benchmark focuses on the object rearrangement problem, specifically table organization tasks. We provide a set of identical real robot setups and facilitate remote experiments on standardized table organization scenarios of varying difficulty. In this workflow, users upload their solutions to our remote server, and their code is executed on the real robot setups and scored automatically. After each execution, the OCRTOC team resets the experimental setup manually. We also provide a simulation environment that researchers can use to develop and test their solutions. With the OCRTOC benchmark, we aim to lower the barrier to conducting reproducible research on robotic grasping and manipulation and to accelerate progress in this field. Executing standardized scenarios on identical real robot setups allows us to quantify algorithm performance and achieve fair comparisons. Using this benchmark, we held a competition at the 2020 International Conference on Intelligent Robots and Systems (IROS 2020). In total, 59 teams from around the world took part in this competition. We present the results and our observations from the 2020 competition, and discuss our adjustments and improvements for the upcoming OCRTOC 2021 competition. The homepage of the OCRTOC competition is www.ocrtoc.org, and the OCRTOC software package is available at https://github.com/OCRTOC/OCRTOC_software_package.
RO · Apr 23, 2021 · Code
Grasp Synthesis for Novel Objects Using Heuristic-based and Data-driven Active Vision Methods
Sabhari Natarajan, Galen Brown, Berk Calli
In this work, we present several heuristic-based and data-driven active vision strategies for viewpoint optimization of an arm-mounted depth camera for the purpose of aiding robotic grasping. These strategies aim to efficiently collect data to boost the performance of an underlying grasp synthesis algorithm. We created an open-source benchmarking platform in simulation (https://github.com/galenbr/2021ActiveVision), and provide an extensive study assessing the performance of the proposed methods and comparing them against various baseline strategies. We also provide an experimental study with a real-world setup that uses an existing grasp planning benchmark from the literature. With these analyses, we were able to quantitatively demonstrate the versatility of heuristic methods that prioritize certain types of exploration, and qualitatively show their robustness to both novel objects and the transition from simulation to the real world. We identified scenarios in which our methods did not perform well and scenarios that are objectively difficult, and we discuss which avenues for future research show promise.
RO · Mar 14, 2025
A Benchmarking Study of Vision-based Robotic Grasping Algorithms
Bharath K Rameshbabu, Sumukh S Balakrishna, Brian Flynn et al.
We present a benchmarking study of vision-based robotic grasping algorithms with distinct approaches, and provide a comparative analysis. In particular, we compare two machine-learning-based and two analytical algorithms using an existing benchmarking protocol from the literature and determine the algorithms' strengths and weaknesses under different experimental conditions. These conditions include variations in lighting, background textures, cameras with different noise levels, and grippers. We also run analogous experiments in simulation and with real robots and present the discrepancies. Some experiments are also run in two different laboratories using the same protocols to further analyze the repeatability of our results. We believe that this study, comprising 5040 experiments, provides important insights into the role and challenges of systematic experimentation in robotic manipulation, and guides the development of new algorithms by considering the factors that could impact performance. The experiment recordings and our benchmarking software are publicly available.
RO · Feb 1, 2025
Simultaneous Estimation of Manipulation Skill and Hand Grasp Force from Forearm Ultrasound Images
Keshav Bimbraw, Srikar Nekkanti, Daniel B. Tiller et al.
Accurate estimation of human hand configuration and the forces the hand exerts is critical for effective teleoperation and skill transfer in robotic manipulation. A deeper understanding of human interactions with objects can further enhance teleoperation performance. To address this need, researchers have explored methods to capture and translate human manipulation skills and applied forces to robotic systems. Among these, biosignal-based approaches, particularly those using forearm ultrasound data, have shown significant potential for estimating hand movements and finger forces. In this study, we present a method for simultaneously estimating manipulation skills and applied hand force using forearm ultrasound data. Data collected from seven participants were used to train deep learning models for classifying manipulation skills and estimating grasp force. Our models achieved an average classification accuracy of 94.87% ± 10.16% for manipulation skills and an average root mean square error (RMSE) of 0.51 ± 0.19 N for force estimation, as evaluated using five-fold cross-validation. These results highlight the effectiveness of forearm ultrasound in advancing human-machine interfacing and robotic teleoperation for complex manipulation tasks. This work enables new and effective possibilities for human-robot skill transfer and tele-manipulation, bridging the gap between human dexterity and robotic control.
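For readers unfamiliar with how a "mean ± deviation over five folds" figure is typically produced, here is a minimal sketch. It is not the authors' evaluation code: the helper names are hypothetical, the folds are assumed to be contiguous and unshuffled, and the sample standard deviation is an assumption, since the abstract does not say which deviation estimator or splitting scheme was used.

```python
import math
import statistics


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index splits.

    Fold sizes differ by at most one sample. No shuffling is done here;
    that is a simplification for this sketch.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, test))
        start += size
    return splits


def summarize(fold_scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)
```

In use, a model would be trained on each `train` split, scored with `rmse` on the matching `test` split, and the five per-fold scores passed to `summarize` to obtain the reported mean and spread.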
RO · Nov 2, 2021
Household Cloth Object Set: Fostering Benchmarking in Deformable Object Manipulation
Irene Garcia-Camacho, Júlia Borràs, Berk Calli et al.
Benchmarking of robotic manipulation is one of the open issues in robotics research. An important factor that has enabled progress in this area in the last decade is the existence of common object sets that have been shared among different research groups. However, the existing object sets are very limited when it comes to cloth-like objects, which have unique particularities and challenges. This paper is a first step towards the design of a cloth object set to be distributed among research groups in the robotic cloth manipulation community. We present a set of household cloth objects and related tasks that serve to expose the challenges of gathering such an object set, and propose a roadmap to the design of common benchmarks in cloth manipulation tasks, with the intention of laying the groundwork for the future community debate that will be necessary to foster benchmarking for the manipulation of cloth-like objects. Some RGB-D and object scans are also collected as examples of the objects in relevant configurations. More details about the cloth set are shared at http://www.iri.upc.edu/groups/perception/ClothObjectSet/HouseholdClothSet.html.
RO · Aug 3, 2021
Research Challenges and Progress in Robotic Grasping and Manipulation Competitions
Yu Sun, Joe Falco, Maximo A. Roa et al.
This paper discusses recent research progress in robotic grasping and manipulation in the light of the latest Robotic Grasping and Manipulation Competitions (RGMCs). We first provide an overview of past benchmarks and competitions related to the robotics manipulation field. Then, we discuss the methodology behind designing the manipulation tasks in RGMCs. We provide a detailed analysis of key challenges for each task and identify the most difficult aspects based on the competing teams' performance in recent years. We believe that such an analysis is insightful to determine the future research directions for the robotic manipulation domain.
CV · Jun 4, 2021
ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes
Dina Bashkirova, Mohamed Abdelfattah, Ziliang Zhu et al.
Less than 35% of recyclable waste is actually recycled in the US, which leads to increased soil and sea pollution and is one of the major concerns of environmental researchers as well as the general public. At the heart of the problem are the inefficiencies of the waste sorting process (separating paper, plastic, metal, glass, etc.) due to the extremely complex and cluttered nature of the waste stream. Recyclable waste detection poses a unique computer vision challenge, as it requires detection of highly deformable and often translucent objects in cluttered scenes without the kind of context information usually present in human-centric datasets. This challenging computer vision task currently lacks suitable datasets or methods in the available literature. In this paper, we take a step towards computer-aided waste detection and present the first in-the-wild industrial-grade waste detection and segmentation dataset, ZeroWaste. We believe that ZeroWaste will catalyze research in object detection and semantic segmentation in extreme clutter as well as applications in the recycling domain. Our project page can be found at http://ai.bu.edu/zerowaste/.
RO · Nov 13, 2020
Region-Based Planning for 3D Within-Hand-Manipulation via Variable Friction Robot Fingers and Extrinsic Contacts
Alp Sahin, Adam J. Spiers, Berk Calli
Attempts to achieve robotic Within-Hand-Manipulation (WIHM) generally utilize either high-DOF robotic hands with elaborate sensing apparatus or multi-arm robotic systems. In prior work we presented a simple robot hand with variable friction robot fingers, which allows a low-complexity approach to within-hand object translation and rotation, though this manipulation was limited to planar actions. In this work we extend the capabilities of this system to 3D manipulation with a novel region-based WIHM planning algorithm and the use of extrinsic contacts. The ability to modulate finger friction enhances extrinsic dexterity for three-dimensional WIHM, and allows us to operate at the quasi-static level. The region-based planner automatically generates 3D manipulation sequences with a modified A* formulation that navigates the contact regions between the fingers and the object surface to reach desired regions. Central to this method is a set of object-motion primitives (i.e. within-hand sliding, rotation and pivoting), which can easily be achieved by changing contact friction. A wide range of goal regions can be reached with this approach, as demonstrated in real robot experiments following a standardized in-hand manipulation benchmarking protocol.
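To make the idea of searching over contact regions concrete, the sketch below runs a textbook A* search on a small graph whose edges are labelled with motion primitives. It is standard A*, not the authors' modified formulation, and the region names, edge costs, and zero heuristic are hypothetical values chosen purely for illustration.

```python
import heapq


def a_star(start, goal, neighbors, heuristic):
    """Standard A* search over a graph of regions.

    neighbors(node) returns an iterable of (next_node, action, cost).
    heuristic(node) returns an admissible estimate of the remaining cost.
    Returns the list of actions from start to goal, or None if unreachable.
    """
    tie = 0  # tie-breaker so the heap never has to compare nodes directly
    frontier = [(heuristic(start), tie, start)]
    best_cost = {start: 0.0}
    came_from = {}
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if node == goal:
            actions = []
            while node in came_from:
                node, action = came_from[node]
                actions.append(action)
            return actions[::-1]
        for nxt, action, cost in neighbors(node):
            new_cost = best_cost[node] + cost
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                came_from[nxt] = (node, action)
                tie += 1
                heapq.heappush(frontier, (new_cost + heuristic(nxt), tie, nxt))
    return None


# Hypothetical contact-region graph: edges are motion primitives with costs.
REGION_GRAPH = {
    "A": [("B", "slide", 1.0), ("C", "rotate", 4.0)],
    "B": [("C", "pivot", 1.0), ("D", "rotate", 5.0)],
    "C": [("D", "slide", 1.0)],
}

plan = a_star("A", "D", lambda n: REGION_GRAPH.get(n, []), lambda n: 0.0)
```

Here `plan` is the cheapest primitive sequence from region "A" to region "D" under the assumed costs. A zero heuristic reduces A* to uniform-cost search; a planner for a real hand would need a heuristic and cost model suited to its contact regions.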
RO · Feb 10, 2015
Benchmarking in Manipulation Research: The YCB Object and Model Set and Benchmarking Protocols
Berk Calli, Aaron Walsman, Arjun Singh et al.
In this paper we present the Yale-CMU-Berkeley (YCB) Object and Model set, intended to be used to facilitate benchmarking in robotic manipulation, prosthetic design and rehabilitation research. The objects in the set are designed to cover a wide range of aspects of the manipulation problem; it includes objects of daily life with different shapes, sizes, textures, weight and rigidity, as well as some widely used manipulation tests. The associated database provides high-resolution RGBD scans, physical properties, and geometric models of the objects for easy incorporation into manipulation and planning software platforms. In addition to describing the objects and models in the set along with how they were chosen and derived, we provide a framework and a number of example task protocols, laying out how the set can be used to quantitatively evaluate a range of manipulation approaches including planning, learning, mechanical design, control, and many others. A comprehensive literature survey on existing benchmarks and object datasets is also presented and their scope and limitations are discussed. The set will be freely distributed to research groups worldwide at a series of tutorials at robotics conferences, and will be otherwise available at a reasonable purchase cost. It is our hope that the ready availability of this set along with the ground laid in terms of protocol templates will enable the community of manipulation researchers to more easily compare approaches as well as continually evolve benchmarking tests as the field matures.