RO | May 6
Contact-Free Grasp Stability Prediction with In-Hand Time-of-Flight Sensors
Kyle DuFrene, Cindy Grimm
Current approaches to grasp planning for robotics demonstrate high success rates, but degrade with noisy sensors and other factors. Previous works have proposed tactile-based grasp stability classifiers to detect failures, but these approaches rely on making contact and grasping the object to do so. We propose a contact-free grasp stability predictor using multi-zone time-of-flight sensors mounted in the distal links of a gripper. Because our method does not require grasping the object to make a prediction, it significantly speeds up the stability classification process, cycling at 15 Hz. We collected over 2,500 real-world grasps across 15 objects to train a classifier. We also conducted grasp attempts on six unseen objects: three for validation and model selection, and three for model testing. Our approach demonstrated strong classification performance, with an accuracy of 85.5% on validation objects and 86.0% on test objects.
CV | Apr 14, 2025 | Code
SeeTree -- A modular, open-source system for tree detection and orchard localization
Jostan Brown, Cindy Grimm, Joseph R. Davidson
Accurate localization is an important functional requirement for precision orchard management. However, there are few off-the-shelf commercial solutions available to growers. In this paper, we present SeeTree, a modular, open-source embedded system for tree trunk detection and orchard localization that is deployable on any vehicle. Building on our prior work on vision-based in-row localization using particle filters, SeeTree includes several new capabilities. First, it provides capacity for full orchard localization including out-of-row headland turning. Second, it includes the flexibility to integrate visual, GNSS, or wheel odometry in the motion model. During field experiments in a commercial orchard, the system converged to the correct location 99% of the time over 800 trials, even when starting with large uncertainty in the initial particle locations. When turning out of row, the system correctly tracked 99% of the turns (860 trials representing 43 unique row changes). To help support adoption and future research and development, we make our dataset, design files, and source code freely available to the community.
RO | Mar 4, 2021 | Code
Semantics-guided Skeletonization of Sweet Cherry Trees for Robotic Pruning
Alexander You, Cindy Grimm, Abhisesh Silwal et al.
Dormant pruning for fresh market fruit trees is a relatively unexplored application of agricultural robotics for which few end-to-end systems exist. One of the biggest challenges in creating an autonomous pruning system is the need to reconstruct a model of a tree which is accurate and informative enough to be useful for deciding where to cut. One useful structure for modeling a tree is a skeleton: a 1D, lightweight representation of the geometry and the topology of a tree. This skeletonization problem is an important one within the field of computer graphics, and a number of algorithms have been specifically developed for the task of modeling trees. These skeletonization algorithms have largely addressed the problem as a geometric one. In agricultural contexts, however, the parts of the tree have distinct labels, such as the trunk, supporting branches, etc. This labeled structure is important for understanding where to prune. We introduce an algorithm which produces such a labeled skeleton, using the topological and geometric priors associated with these labels to improve our skeletons. We test our skeletonization algorithm on point clouds from 29 upright fruiting offshoot (UFO) trees and demonstrate a median accuracy of 70% with respect to a human-evaluated gold standard. We also make point cloud scans of 82 UFO trees open-source to other researchers. Our work represents a significant first step towards a robust tree modeling framework which can be used in an autonomous pruning system.
CV | Apr 23, 2024
Machine Vision-Based Assessment of Fall Color Changes and its Relationship with Leaf Nitrogen Concentration
Achyut Paudel, Jostan Brown, Priyanka Upadhyaya et al.
Apple (Malus domestica Borkh.) trees are deciduous, shedding leaves each year. This process is preceded by a gradual change in leaf color from green to yellow as chlorophyll is degraded prior to abscission. The initiation and rate of this color change are affected by many factors including leaf nitrogen (N) concentration. We predict that leaf color during this transition may be indicative of the nitrogen status of apple trees. This study assesses a machine vision-based system for quantifying the change in leaf color and its correlation with leaf nitrogen content. An image dataset was collected in color and 3D over five weeks in the fall of 2021 and 2023 at a commercial orchard using a ground vehicle-based stereovision sensor. Trees in the foreground were segmented from the point cloud using color and depth thresholding methods. Then, to estimate the proportion of yellow leaves per canopy, the color information of the segmented canopy area was quantified using a custom-defined metric, the yellowness index (a normalized ratio of yellow to green foliage in the tree), which varied from -1 to +1 (-1 being completely green and +1 being completely yellow). Both K-means-based methods and gradient boosting methods were used to estimate the yellowness index. The gradient boosting-based method proposed in this study was better than the K-means-based method (both in terms of computational time and accuracy), achieving an $R^2$ of 0.72 in estimating the yellowness index. The metric was able to capture the gradual color transition from green to yellow over the study duration. Trees with lower leaf nitrogen showed the color transition to yellow earlier than the trees with higher nitrogen. Keywords: Fruit Tree Nitrogen Management, Machine Vision, Point Cloud Segmentation, Precision Nitrogen Management
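The abstract describes the yellowness index only as a normalized ratio of yellow to green foliage ranging from -1 (completely green) to +1 (completely yellow); it does not give the exact formula. One form consistent with that description is a normalized difference, (Y - G) / (Y + G). The sketch below assumes that form and assumes the inputs are per-canopy counts of pixels already classified as yellow or green. The function name and arguments are hypothetical and are not taken from the paper.

```python
def yellowness_index(yellow_pixels: int, green_pixels: int) -> float:
    """Normalized difference of yellow and green foliage counts, in [-1, +1].

    Assumed form: (Y - G) / (Y + G). The paper's exact definition may differ.
    """
    if yellow_pixels < 0 or green_pixels < 0:
        raise ValueError("pixel counts must be non-negative")
    total = yellow_pixels + green_pixels
    if total == 0:
        # The index is undefined for a canopy with no classified foliage.
        raise ValueError("canopy has no yellow or green foliage pixels")
    return (yellow_pixels - green_pixels) / total


if __name__ == "__main__":
    # Fully green, evenly mixed, and fully yellow canopies.
    for yellow, green in [(0, 500), (250, 250), (500, 0)]:
        print(yellow, green, yellowness_index(yellow, green))
```

Under this assumed form, an evenly mixed canopy scores 0, and the index increases monotonically as foliage shifts from green to yellow.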
RO | Jul 30, 2025
Learning to Prune Branches in Modern Tree-Fruit Orchards
Abhinav Jain, Cindy Grimm, Stefan Lee
Dormant tree pruning is labor-intensive but essential to maintaining modern, highly productive fruit orchards. In this work we present a closed-loop visuomotor controller for robotic pruning. The controller guides the cutter through a cluttered tree environment to reach a specified cut point and ensures the cutters are perpendicular to the branch. We train the controller using a novel orchard simulation that captures the geometric distribution of branches in a target apple orchard configuration. Unlike traditional methods requiring full 3D reconstruction, our controller uses just optical flow images from a wrist-mounted camera. We deploy our learned policy in simulation and the real world for an example V-Trellis Envy tree with zero-shot transfer, achieving a 30% success rate -- approximately half the performance of an oracle planner.
CV | Feb 26, 2022
Optical flow-based branch segmentation for complex orchard environments
Alexander You, Cindy Grimm, Joseph R. Davidson
Machine vision is a critical subsystem for enabling robots to perform a variety of tasks in orchard environments. However, orchards are highly visually complex environments, and computer vision algorithms operating in them must be able to contend with variable lighting conditions and background noise. Past work on enabling deep learning algorithms to operate in these environments has typically required either large amounts of hand-labeled data to train a deep neural network or physical control of the conditions under which the environment is perceived. In this paper, we train a neural network system in simulation only using simulated RGB data and optical flow. The resulting neural network is able to perform foreground segmentation of branches in a busy orchard environment without additional real-world training or any special setup or equipment beyond a standard camera. Our results show that our system is highly accurate and, when compared to a network using manually labeled RGBD data, achieves significantly more consistent and robust performance across environments that differ from the training set.
RO | Dec 1, 2021
Effects of Interfaces on Human-Robot Trust: Specifying and Visualizing Physical Zones
Marisa Hudspeth, Sogol Balali, Cindy Grimm et al.
In this paper we investigate the influence that interfaces and feedback have on human-robot trust levels when operating in a shared physical space. The task we use is specifying a "no-go" region for a robot in an indoor environment. We evaluate three styles of interface (physical, AR, and map-based) and four feedback mechanisms (no feedback, robot drives around the space, an AR "fence", and the region marked on the map). Our evaluation looks at both usability and trust. Specifically, we examine whether the participant trusts that the robot "knows" where the no-go region is, and how confident they are in the robot's ability to avoid that region. We use both self-reported and indirect measures of trust and usability. Our key findings are: 1) interfaces and feedback do influence levels of trust; 2) the participants largely preferred a mixed interface-feedback pair, where the modality for the interface differed from the feedback.
RO | Sep 27, 2021
Precision fruit tree pruning using a learned hybrid vision/interaction controller
Alexander You, Hannah Kolano, Nidhi Parayil et al.
Robotic tree pruning requires highly precise manipulator control in order to accurately align a cutting implement with the desired pruning point at the correct angle. Simultaneously, the robot must avoid applying excessive force to rigid parts of the environment such as trees, support posts, and wires. In this paper, we propose a hybrid control system that uses a learned vision-based controller to initially align the cutter with the desired pruning point, taking in images of the environment and outputting control actions. This controller is trained entirely in simulation, but transfers easily to real trees via a neural network which transforms raw images into a simplified, segmented representation. Once contact is established, the system hands over control to an interaction controller that guides the cutter pivot point to the branch while minimizing interaction forces. With this simple, yet novel, approach we demonstrate an improvement of over 30 percentage points in accuracy over a baseline controller that uses camera depth data.
RO | Jun 19, 2021
Grasping Benchmarks: Normalizing for Object Size & Approximating Hand Workspaces
John Morrow, Nuha Nishat, Joshua Campbell et al.
The varied landscape of robotic hand designs makes it difficult to set a standard for how to measure hand size and to communicate the size of objects it can grasp. Defining consistent workspace measurements would greatly assist scientific communication in robotic grasping research because it would allow researchers to 1) quantitatively communicate an object's relative size to a hand's and 2) approximate a functional subspace of a hand's kinematic workspace in a human-readable way. The goal of this paper is to specify a measurement procedure that quantitatively captures a hand's workspace size for both a precision and power grasp. This measurement procedure uses a functional approach -- based on a generic grasping scenario of a hypothetical object -- in order to make the procedure as generalizable and repeatable as possible, regardless of the actual hand design. This functional approach lets the measurer choose the exact finger configurations and contact points that satisfy the generic grasping scenario, while ensuring that the measurements are functionally comparable. We demonstrate these functional measurements on seven hand configurations. Additional hand measurements and instructions are provided in a GitHub repository.
HC | Apr 6, 2020
Analyzing 3D Volume Segmentation by Low-level Perceptual Cues, High-level Cognitive Tasks, and Decision-making Processes
Anahita Sanandaji, Cindy Grimm, Ruth West et al.
3D volume segmentation is a fundamental task in many scientific and medical applications. Producing accurate segmentations efficiently is challenging, in part due to low imaging data quality (e.g., noise and low image resolution) and ambiguity in the data that can only be resolved with higher-level knowledge of the structure. Automatic algorithms do exist, but there are many use cases where they fail. The gold standard is still manual segmentation or review. Unfortunately, even for an expert, manual segmentation is laborious, time-consuming, and prone to errors. Existing 3D segmentation tools are often designed based on the underlying algorithm, and do not take into account human mental models, lower-level perception abilities, and higher-level cognitive tasks. Our goal is to analyze manual segmentation using the critical decision method (CDM) in order to gain a better understanding of the low-level (perceptual and marking) actions and higher-level decision-making processes that segmenters use. A key challenge we faced is that decision-making consists of an accumulated set of low-level visual-spatial decisions that are inter-related and difficult to articulate verbally. To address this, we developed a novel hybrid protocol which integrates CDM with eye-tracking, observation, and targeted questions. In this paper, we develop and validate data coding schemes for this hybrid data set that discern segmenters' low-level actions, higher-level cognitive tasks, overall task structures, and decision-making processes. We successfully detect the visual processing changes based on task sequences and micro decisions reflected in the eye-gaze data, and identify different segmentation decision strategies utilized by the segmenters.
HC | Jan 18, 2020
Developing and Validating an Interactive Training Tool for Inferring 2D Cross-Sections of Complex 3D Structures
Anahita Sanandaji, Cindy Grimm, Ruth West et al.
Understanding 2D cross-sections of 3D structures is a crucial skill in many disciplines, from geology to medical imaging. Cross-section inference in the context of 3D structures requires a complex set of spatial/visualization skills including mental rotation, spatial structure understanding, and viewpoint projection. Prior studies show that experts differ from novices in these, and other, skill dimensions. Building on a previously developed model that hierarchically characterizes the specific spatial sub-skills needed for this task, we have developed the first domain-agnostic, computer-based training tool for cross-section understanding of complex 3D structures. We demonstrate, in an evaluation with 60 participants, that this interactive tool is effective for increasing cross-section inference skills for a variety of structures, from simple primitive ones to more complex biological structures.
HC | Jun 15, 2016
Sketched Floor plans versus SLAM maps: A Comparison
Leo Bowen-Biggs, Suzanne Dazo, Yili Zhang et al.
Maps, specifically floor plans, are useful for a variety of tasks from arranging furniture to designating conceptual or functional spaces (e.g., kitchen, walkway). We present a simple algorithm for quickly laying a floor plan (or other conceptual map) onto a SLAM map, creating a one-to-one mapping between them. Our goal was to enable using a floor plan (or other hand-drawn or annotated map) in robotic applications instead of the typical SLAM map created by the robot. We look at two use cases: specifying "no-go" regions within a room and locating objects within a scanned room. Although a user study showed no statistically significant difference between the two types of maps in terms of performance on this spatial memory task, we argue that floor plans are closer to the mental maps people would naturally draw to characterize spaces.
RO | Jan 13, 2015
Video Manipulation Techniques for the Protection of Privacy in Remote Presence Systems
Alexander Hubers, Emily Andrulis, Levi Scott et al.
Systems that give control of a mobile robot to a remote user raise privacy concerns about what the remote user can see and do through the robot. We aim to preserve some of that privacy by manipulating the video data that the remote user sees. Through two user studies, we explore the effectiveness of different video manipulation techniques at providing different types of privacy. We simultaneously examine task performance in the presence of privacy protection. In the first study, participants were asked to watch a video captured by a robot exploring an office environment and to complete a series of observational tasks under differing video manipulation conditions. Our results show that manipulating the video stream can lead to fewer privacy violations for different privacy types. A second user study demonstrated that these privacy-protecting techniques were effective without diminishing the task performance of the remote user.