Gregory S. Chirikjian

h-index: 55

29 papers

382 citations

Novelty: 48%

AI Score: 49

Ranked #26,103 of 194,257 authors (top 13%); #661 in RO (top 10%)

29 Papers

13.1 | CV | Mar 23, 2023 | Code

Marching-Primitives: Shape Abstraction from Signed Distance Function

Weixiao Liu, Yuwei Wu, Sipu Ruan et al.

Representing complex objects with basic geometric primitives has long been a topic in computer vision. Primitive-based representations have the merits of compactness and computational efficiency in higher-level tasks such as physics simulation, collision checking, and robotic manipulation. Unlike previous works which extract polygonal meshes from a signed distance function (SDF), in this paper, we present a novel method, named Marching-Primitives, to obtain a primitive-based abstraction directly from an SDF. Our method grows geometric primitives (such as superquadrics) iteratively by analyzing the connectivity of voxels while marching at different levels of signed distance. For each valid connected volume of interest, we march on the scope of voxels from which a primitive is able to be extracted in a probabilistic sense and simultaneously solve for the parameters of the primitive to capture the underlying local geometry. We evaluate the performance of our method on both synthetic and real-world datasets. The results show that the proposed method outperforms the state-of-the-art in terms of accuracy, and is directly generalizable among different categories and scales. The code is open-sourced at https://github.com/ChirikjianLab/Marching-Primitives.git.
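The connectivity analysis mentioned in this abstract can be illustrated with a small sketch. It thresholds a voxelized SDF at one level and labels the connected volumes by breadth-first search. The function name `connected_volumes`, the use of 6-connectivity, and the single fixed `level` are assumptions made for illustration; the abstract does not specify how the method chooses levels or which volumes it treats as valid.

```python
from collections import deque

import numpy as np


def connected_volumes(sdf, level):
    """Label 6-connected voxel regions where sdf <= level (illustrative only)."""
    inside = np.asarray(sdf) <= level
    labels = np.zeros(inside.shape, dtype=int)  # 0 means "not in any volume"
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    current = 0
    for start in zip(*np.nonzero(inside)):
        if labels[start]:
            continue  # already reached from an earlier seed voxel
        current += 1
        labels[start] = current
        queue = deque([start])
        while queue:
            i, j, k = queue.popleft()
            for di, dj, dk in offsets:
                n = (i + di, j + dj, k + dk)
                in_bounds = all(0 <= n[a] < inside.shape[a] for a in range(3))
                if in_bounds and inside[n] and not labels[n]:
                    labels[n] = current
                    queue.append(n)
    return labels, current
```

Calling the function at several decreasing levels would give one crude way to "march" through the SDF, with each labeled volume serving as a candidate region for fitting a primitive.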

7.0 | RO | Jun 1 | Code

Hierarchical Object Representation for Spatial Robot Perception: Points, Meshes, and Superquadrics

Ceng Zhang, Wan Su, Mohamed Samshad et al.

Hierarchical 3D Scene Graphs (3DSG) have emerged as an actionable and scalable representation for long-term autonomy, incorporating metric, semantic, and topological information in the scene. However, the question of geometric representation of objects in 3DSG has been overlooked, as most methods use simplified geometric models such as partial point clouds or 3D bounding boxes. In this work, we introduce a hierarchical object representation that can be leveraged for high-fidelity object-level reconstruction, object-based robust re-localization or map alignment, and efficient and analytical collision checking for safe robot navigation planning in dense and cluttered environments. The representation is structurally organized into four distinct layers, progressively abstracting the scene from raw sensor data to dense 3D meshes to analytical primitives such as superquadrics, which provide a sparse and analytical representation for object geometry. We develop a pipeline that builds the hierarchical object representation from an RGB-D image stream captured by a robot, and demonstrate it in real-world open-set object scenes. Extensive experiments across diverse datasets, including HOPE, ReplicaCAD, Kimera-Multi, and the NUS Campus Dataset collected using a Unitree B2 robot, validate our pipeline in both indoor and outdoor environments. We show that our superquadric-based map alignment method outperforms ROMAN, the current state-of-the-art object-based map alignment method. Our code can be found at https://github.com/perceptica-robotics/Hickory.

13.2 | CV | Mar 28, 2022

Primitive-based Shape Abstraction via Nonparametric Bayesian Inference

Yuwei Wu, Weixiao Liu, Sipu Ruan et al.

3D shape abstraction has drawn great interest over the years. Apart from low-level representations such as meshes and voxels, researchers also seek to semantically abstract complex objects with basic geometric primitives. Recent deep learning methods rely heavily on datasets, with limited generality to unseen categories. Furthermore, abstracting an object accurately yet with a small number of primitives still remains a challenge. In this paper, we propose a novel non-parametric Bayesian statistical method to infer an abstraction, consisting of an unknown number of geometric primitives, from a point cloud. We model the generation of points as observations sampled from an infinite mixture of Gaussian Superquadric Taper Models (GSTM). Our approach formulates the abstraction as a clustering problem, in which: 1) each point is assigned to a cluster via the Chinese Restaurant Process (CRP); 2) a primitive representation is optimized for each cluster, and 3) a merging post-process is incorporated to provide a concise representation. We conduct extensive experiments on two datasets. The results indicate that our method outperforms the state-of-the-art in terms of accuracy and is generalizable to various types of objects.
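The Chinese Restaurant Process named in step 1 can be sketched as a sequential sampler. This shows only the CRP prior: point n joins an existing cluster with probability proportional to its current size, or opens a new one with probability proportional to a concentration parameter. The function name `sample_crp` and the parameter `alpha` are illustrative, and the likelihood term that the paper's GSTM mixture would contribute to each assignment is deliberately left out.

```python
import random


def sample_crp(num_points, alpha, seed=0):
    """Draw cluster assignments from a CRP prior (no likelihood term).

    alpha > 0 is the concentration parameter (hypothetical name): larger
    values make new clusters more likely.
    """
    rng = random.Random(seed)
    assignments = []  # cluster index chosen for each point
    counts = []       # number of points currently in each cluster
    for _ in range(num_points):
        # Existing clusters are weighted by size, a new cluster by alpha.
        weights = counts + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[choice] += 1
        assignments.append(choice)
    return assignments, counts
```

Because the number of clusters grows with the data instead of being fixed in advance, this kind of prior suits an abstraction with an unknown number of primitives.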

1.2 | FA | Oct 29, 2018

Fourier-Zernike Series of Convolutions on Disks

Arash Ghaani Farashahi, Gregory S. Chirikjian

This paper presents a systematic study of the analytic aspects of Fourier-Zernike series of convolutions of functions supported on disks. We then investigate different aspects of the presented theory in the case of zero-padded functions.

4.1 | RO | Sep 18, 2024

RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets

Jikai Ye, Wanze Li, Shiraz Khan et al.

Cloth state estimation is an important problem in robotics. It is essential for the robot to know the accurate state to manipulate cloth and execute tasks such as robotic dressing, stitching, and covering/uncovering human beings. However, estimating cloth state accurately remains challenging due to its high flexibility and self-occlusion. This paper proposes a diffusion model-based pipeline that formulates the cloth state estimation as an image generation problem by representing the cloth state as an RGB image that describes the point-wise translation (translation map) between a pre-defined flattened mesh and the deformed mesh in a canonical space. Then we train a conditional diffusion-based image generation model to predict the translation map based on an observation. Experiments are conducted in both simulation and the real world to validate the performance of our method. Results indicate that our method outperforms two recent methods in both accuracy and speed.

1.2 | NA | Oct 3, 2007

Deblurring of Motionally Averaged Images with Applications to Single-Particle Cryo-Electron Microscopy

Wooram Park, Daniel N. Rockmore, Dean Madden et al.

This paper addresses the deconvolution of an image that has been obtained by superimposing many copies of an underlying unknown image of interest. The superposition is assumed to not be exact due to noise, and is described using an error distribution in position, orientation, or both. We assume that a good estimate of the error distribution is known. The most natural approach to take for the purely translational case is to apply the Fourier transform and use the classical convolution theorem together with a Wiener filter to invert. In the case of purely rotational deblurring, a similar Fourier analysis is applied. That is, for a blurred image function defined in polar coordinates, the Fourier series and the convolution theorem for the series can be applied. In the case of combinations of translational and rotational errors, the motion-group Fourier transform is used. In addition, for the three cases we present an alternative method using Hermite and Laguerre-Fourier expansions, which have a special property under the Fourier transform. The problem that is solved here is motivated by one of the steps in the cryo-electron-tomographic reconstruction of biomolecular complexes such as viruses and ion channels.
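For the purely translational case, the combination of the convolution theorem and a Wiener filter can be sketched in a few lines of NumPy. This is a textbook form of the filter under assumptions not stated in the abstract: the blur is a circular convolution with a known kernel, and the noise-to-signal ratio is a single constant (`noise_to_signal`, a hypothetical parameter name). The rotational and motion-group cases are not covered.

```python
import numpy as np


def wiener_deblur(blurred, kernel, noise_to_signal=1e-3):
    """Invert a circular convolution with a constant-ratio Wiener filter."""
    # Transfer function of the blur, zero-padded to the image size.
    H = np.fft.fft2(kernel, s=blurred.shape)
    Y = np.fft.fft2(blurred)
    # Wiener estimate: conj(H) / (|H|^2 + K) acts as a regularized inverse of H.
    X_hat = np.conj(H) * Y / (np.abs(H) ** 2 + noise_to_signal)
    return np.real(np.fft.ifft2(X_hat))
```

With `noise_to_signal` close to zero the filter approaches plain inverse filtering; larger values suppress frequencies where the kernel's response is weak, at the cost of some residual blur.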

12.6 | CV | Nov 29, 2021 | Code

Robust and Accurate Superquadric Recovery: a Probabilistic Approach

Weixiao Liu, Yuwei Wu, Sipu Ruan et al.

Interpreting objects with basic geometric primitives has long been studied in computer vision. Among geometric primitives, superquadrics are well known for their ability to represent a wide range of shapes with few parameters. However, as the first and foremost step, recovering superquadrics accurately and robustly from 3D data still remains challenging. The existing methods are subject to local optima and sensitive to noise and outliers in real-world scenarios, resulting in frequent failure in capturing geometric shapes. In this paper, we propose the first probabilistic method to recover superquadrics from point clouds. Our method builds a Gaussian-uniform mixture model (GUM) on the parametric surface of a superquadric, which explicitly models the generation of outliers and noise. The superquadric recovery is formulated as a Maximum Likelihood Estimation (MLE) problem. We propose an algorithm, Expectation, Maximization, and Switching (EMS), to solve this problem, where: (1) outliers are predicted from the posterior perspective; (2) the superquadric parameter is optimized by the trust-region reflective algorithm; and (3) local optima are avoided by globally searching and switching among parameters encoding similar superquadrics. We show that our method can be extended to the multi-superquadrics recovery for complex objects. The proposed method outperforms the state-of-the-art in terms of accuracy, efficiency, and robustness on both synthetic and real-world datasets. The code is at http://github.com/bmlklwx/EMS-superquadric_fitting.git.
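To make "a wide range of shapes with few parameters" concrete, the sketch below evaluates the standard superquadric inside-outside function in the primitive's own frame. It uses three scale parameters and two shape exponents; the pose parameters that a full recovery would also estimate are omitted, and the function and argument names are illustrative. This is the common textbook form, not code taken from the linked repository.

```python
import numpy as np


def superquadric_inside_outside(points, scale, eps):
    """Evaluate F(x, y, z) for points given in the superquadric's own frame.

    F < 1 inside, F = 1 on the surface, F > 1 outside.
    scale = (a1, a2, a3) are the semi-axis lengths; eps = (e1, e2) are the
    shape exponents (e1 = e2 = 1 gives an ellipsoid).
    """
    points = np.asarray(points, dtype=float)
    a1, a2, a3 = scale
    e1, e2 = eps
    x = np.abs(points[:, 0] / a1)
    y = np.abs(points[:, 1] / a2)
    z = np.abs(points[:, 2] / a3)
    return (x ** (2.0 / e2) + y ** (2.0 / e2)) ** (e2 / e1) + z ** (2.0 / e1)
```

Exponents near zero give box-like shapes and exponents near two give octahedron-like shapes, so five numbers plus a pose already span a broad family of convex primitives.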

5.3 | RO | Apr 14, 2021 | Code

Look at my new blue force-sensing shoes!

Yuanfeng Han, Ruixin Li, Gregory S. Chirikjian

To function autonomously in the physical world, humanoid robots need high-fidelity sensing systems, especially for forces that cannot be easily modeled. Modeling forces in robot feet is particularly challenging due to static indeterminacy, thereby requiring direct sensing. Unfortunately, resolving forces in the feet of some smaller-sized humanoids is limited both by the quality of sensors and the current algorithms used to interpret the data. This paper presents light-weight, low-cost and open-source force-sensing shoes to improve force measurement for popular smaller-sized humanoid robots, and a method for calibrating the shoes. The shoes measure center of pressure (CoP) and normal ground reaction force (GRF). The calibration method enables each individual shoe to reach high measurement precision by applying known forces at different locations of the shoe and using a regularized least squares optimization to interpret sensor outputs. A NAO robot is used as our experimental platform. Experiments are conducted to compare the measurement performance between the shoes and the robot's factory-installed force-sensing resistors (FSRs), and to evaluate the calibration method over these two sensing modules. Experimental results show that the shoes significantly improve CoP and GRF measurement precision compared to the robot's built-in FSRs. Moreover, the developed calibration method improves the measurement performance for both our shoes and the built-in FSRs.
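The calibration step, applying known forces and fitting sensor outputs with regularized least squares, can be sketched as a ridge regression. The linear sensor model, the function name `fit_ridge`, and the regularization weight `lam` are assumptions for illustration; the abstract does not give the actual sensor model or regularizer.

```python
import numpy as np


def fit_ridge(sensor_readings, known_values, lam=1e-3):
    """Solve min_w ||S w - f||^2 + lam * ||w||^2 in closed form.

    sensor_readings: (n_trials, n_sensors) raw outputs S (assumed linear model).
    known_values:    (n_trials,) applied reference quantity f, e.g. normal force.
    """
    S = np.asarray(sensor_readings, dtype=float)
    f = np.asarray(known_values, dtype=float)
    # Normal equations with Tikhonov regularization: (S^T S + lam I) w = S^T f.
    A = S.T @ S + lam * np.eye(S.shape[1])
    return np.linalg.solve(A, S.T @ f)
```

The regularization term keeps the solution stable when some sensors respond almost identically across the calibration trials, which would otherwise make the normal equations ill-conditioned.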

10.6 | CV | Apr 12, 2021 | Code

Towards Efficient Graph Convolutional Networks for Point Cloud Handling

Yawei Li, He Chen, Zhaopeng Cui et al.

In this paper, we aim to improve the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds. The basic graph convolution that is typically composed of a K-nearest neighbor (KNN) search and a multilayer perceptron (MLP) is examined. By mathematically analyzing the operations there, two findings to improve the efficiency of GCNs are obtained. (1) The local geometric structure information of 3D representations propagates smoothly across the GCN that relies on KNN search to gather neighborhood features. This motivates the simplification of multiple KNN searches in GCNs. (2) Shuffling the order of graph feature gathering and an MLP leads to equivalent or similar composite operations. Based on those findings, we optimize the computational procedure in GCNs. A series of experiments shows that the optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed while maintaining comparable accuracy for learning on point clouds. Code will be available at https://github.com/ofsoundof/EfficientGCN.git.
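Finding (2) can be checked numerically in its simplest form. For a single linear layer with no nonlinearity, gathering neighbor features and then multiplying by the weight gives the same result as multiplying first and gathering afterwards, but the second order performs the matrix product on N rows instead of N times K rows. The helper names and the brute-force KNN are illustrative, and the paper's actual reordering for full MLPs may differ from this sketch.

```python
import numpy as np


def knn_indices(points, k):
    """Brute-force K nearest neighbors; each point's own index is included."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, :k]


def gather_then_linear(features, idx, weight):
    # (N, K, C) @ (C, D): the linear layer runs on every gathered neighbor.
    return features[idx] @ weight


def linear_then_gather(features, idx, weight):
    # (N, C) @ (C, D) first, then index: the linear layer runs once per point.
    return (features @ weight)[idx]
```

Both functions return an array of shape (N, K, D), so the cheaper ordering can replace the other wherever the layer is purely linear.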

9.4 | CV | Mar 28, 2021 | Code

LSG-CPD: Coherent Point Drift with Local Surface Geometry for Point Cloud Registration

Weixiao Liu, Hongtao Wu, Gregory Chirikjian

Probabilistic point cloud registration methods are becoming more popular because of their robustness. However, unlike point-to-plane variants of iterative closest point (ICP) which incorporate local surface geometric information such as surface normals, most probabilistic methods (e.g., coherent point drift (CPD)) ignore such information and build Gaussian mixture models (GMMs) with isotropic Gaussian covariances. This results in sphere-like GMM components which only penalize the point-to-point distance between the two point clouds. In this paper, we propose a novel method called CPD with Local Surface Geometry (LSG-CPD) for rigid point cloud registration. Our method adaptively adds different levels of point-to-plane penalization on top of the point-to-point penalization based on the flatness of the local surface. This results in GMM components with anisotropic covariances. We formulate point cloud registration as a maximum likelihood estimation (MLE) problem and solve it with the Expectation-Maximization (EM) algorithm. In the E step, we demonstrate that the computation can be recast into simple matrix manipulations and efficiently computed on a GPU. In the M step, we perform an unconstrained optimization on a matrix Lie group to efficiently update the rigid transformation of the registration. The proposed method outperforms state-of-the-art algorithms in terms of accuracy and robustness on various datasets captured with range scanners, RGBD cameras, and LiDARs. Also, it is significantly faster than modern implementations of CPD. The source code is available at https://github.com/ChirikjianLab/LSG-CPD.git.
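The difference between isotropic and anisotropic GMM components can be illustrated with a squared distance that adds a point-to-plane term on top of the point-to-point term. The specific form used here, an inverse covariance of (I + alpha * n n^T) / sigma2, and the names `alpha` and `sigma2` are assumptions chosen for illustration; the abstract does not state the exact covariance or how the flatness-dependent weight is computed.

```python
import numpy as np


def anisotropic_sq_distance(residual, normal, alpha, sigma2=1.0):
    """Squared Mahalanobis distance with inverse covariance (I + alpha n n^T) / sigma2.

    residual: vector from a GMM centroid to a data point.
    normal:   local surface normal at the centroid (normalized internally).
    alpha:    extra point-to-plane weight (hypothetical; larger on flatter surfaces).
    """
    r = np.asarray(residual, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    point_to_point = r @ r
    point_to_plane = (n @ r) ** 2
    return (point_to_point + alpha * point_to_plane) / sigma2
```

With `alpha = 0` this reduces to the isotropic, sphere-like case; with a large `alpha`, displacement along the normal is penalized far more than sliding within the tangent plane.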

8.4 | CV | Mar 26, 2025 | Code

Reasoning and Learning a Perceptual Metric for Self-Training of Reflective Objects in Bin-Picking with a Low-cost Camera

Peiyuan Ni, Chee Meng Chew, Marcelo H. Ang et al.

Bin-picking of metal objects using low-cost RGB-D cameras often suffers from sparse depth information and reflective surface textures, leading to errors and the need for manual labeling. To reduce human intervention, we propose a two-stage framework consisting of a metric learning stage and a self-training stage. Specifically, to automatically process data captured by a low-cost camera (LC), we introduce a Multi-object Pose Reasoning (MoPR) algorithm that optimizes pose hypotheses under depth, collision, and boundary constraints. To further refine pose candidates, we adopt a Symmetry-aware Lie-group based Bayesian Gaussian Mixture Model (SaL-BGMM), integrated with the Expectation-Maximization (EM) algorithm, for symmetry-aware filtering. Additionally, we propose a Weighted Ranking Information Noise Contrastive Estimation (WR-InfoNCE) loss to enable the LC to learn a perceptual metric from reconstructed data, supporting self-training on untrained or even unseen objects. Experimental results show that our approach outperforms several state-of-the-art methods on both the ROBI dataset and our newly introduced Self-ROBI dataset.

3.2 | RO | Sep 4, 2025

INGRID: Intelligent Generative Robotic Design Using Large Language Models

Guanglu Jia, Ceng Zhang, Gregory S. Chirikjian

The integration of large language models (LLMs) into robotic systems has accelerated progress in embodied artificial intelligence, yet current approaches remain constrained by existing robotic architectures, particularly serial mechanisms. This hardware dependency fundamentally limits the scope of robotic intelligence. Here, we present INGRID (Intelligent Generative Robotic Design), a framework that enables the automated design of parallel robotic mechanisms through deep integration with reciprocal screw theory and kinematic synthesis methods. We decompose the design challenge into four progressive tasks: constraint analysis, kinematic joint generation, chain construction, and complete mechanism design. INGRID demonstrates the ability to generate novel parallel mechanisms with both fixed and variable mobility, discovering kinematic configurations not previously documented in the literature. We validate our approach through three case studies demonstrating how INGRID assists users in designing task-specific parallel robots based on desired mobility requirements. By bridging the gap between mechanism theory and machine learning, INGRID enables researchers without specialized robotics training to create custom parallel mechanisms, thereby decoupling advances in robotic intelligence from hardware constraints. This work establishes a foundation for mechanism intelligence, where AI systems actively design robotic hardware, potentially transforming the development of embodied AI systems.

2.2 | RO | Feb 26, 2022

Watch Me Calibrate My Force-Sensing Shoes!

Yuanfeng Han, Boren Jiang, Gregory S. Chirikjian

This paper presents a novel method for smaller-sized humanoid robots to self-calibrate their foot force sensors. The method consists of two steps: 1. The robot is commanded to move along planned whole-body trajectories in different double support configurations. 2. The sensor parameters are determined by minimizing the error between the measured and modeled center of pressure (CoP) and ground reaction force (GRF) during the robot's movement using optimization. This is the first proposed autonomous calibration method for foot force-sensing devices in smaller humanoid robots. Furthermore, we introduce a high-accuracy manual calibration method to establish CoP ground truth, which is used to validate the measured CoP using self-calibration. The results show that the self-calibration can accurately estimate CoP and GRF without any manual intervention. Our method is demonstrated using a NAO humanoid platform and our previously presented force-sensing shoes.

6.9 | RO | Feb 22, 2022

Transporters with Visual Foresight for Solving Unseen Rearrangement Tasks

Hongtao Wu, Jikai Ye, Xin Meng et al.

Rearrangement tasks have been identified as a crucial challenge for intelligent robotic manipulation, but few methods allow for precise construction of unseen structures. We propose a visual foresight model for pick-and-place rearrangement manipulation which is able to learn efficiently. In addition, we develop a multi-modal action proposal module which builds on the Goal-Conditioned Transporter Network, a state-of-the-art imitation learning method. Our image-based task planning method, Transporters with Visual Foresight (TVF), is able to learn from only a small amount of data and generalize to multiple unseen tasks in a zero-shot manner. TVF improves the performance of a state-of-the-art imitation learning method on unseen tasks in simulation and real robot experiments. In particular, the average success rate on unseen tasks improves from 55.4% to 78.5% in simulation experiments and from 30% to 63.3% in real robot experiments when given only tens of expert demonstrations. Video and code are available on our project website: https://chirikjianlab.github.io/tvf/

2.2 | RO | Feb 8, 2022

When Kinematics Dominates Mechanics: Locally Volume-Preserving Primitives for Model Reduction in Finite Elasticity

Xu Yi, Gregory S. Chirikjian

A new, and extremely fast, computational modeling paradigm is introduced here for specific finite elasticity problems that arise in the context of soft robotics. Whereas continuum mechanics is a very classical area of study, and significant effort has been devoted to the development of intricate constitutive models for finite elasticity, we show that in the kinds of large-strain mechanics problems arising in soft robotics, many of the parameters in constitutive models are irrelevant. For the most part, the isochoric (locally volume-preserving) constraint dominates behavior, and this can be built into closed-form kinematic deformation fields before even considering other aspects of constitutive modeling. We therefore focus on developing and applying primitive deformations that each observe this constraint. It is shown that by composing a wide enough variety of such deformations, the most common behaviors observed in soft robots can be replicated. Case studies include an inflatable rubber chamber, a slender rubber rod, and a rubber block subjected to different boundary conditions. We show that this method is at least 50 times faster than the ABAQUS implementation of the finite element method (FEM). Physical experiments and measurements show that both our method and ABAQUS have approximately 10% error relative to experimentally measured displacements, as well as to each other. Our method provides a real-time alternative to FEM, and captures essential degrees of freedom for use in feedback control systems.

3.0 | RO | Aug 12, 2021

Put the Bear on the Chair! Intelligent Robot Interaction with Previously Unseen Chairs via Robot Imagination

Hongtao Wu, Xin Meng, Sipu Ruan et al.

In this paper, we study the problem of autonomously seating a teddy bear on a previously unseen chair. To achieve this goal, we present a novel method for robots to imagine the sitting pose of the bear by physically simulating a virtual humanoid agent sitting on the chair. We also develop a robotic system which leverages motion planning to plan SE(2) motions for a humanoid robot to walk to the chair and whole-body motions to put the bear on it. Furthermore, to cope with cases where the chair is not in an accessible pose for placing the bear, a human assistance module is introduced for a human to follow language instructions given by the robot to rotate the chair and help make the chair accessible. We implement our method with a robot arm and a humanoid robot. We calibrate the proposed system with 3 chairs and extensively test it on 12 previously unseen chairs in both accessible and inaccessible poses. Results show that our method enables the robot to autonomously seat the teddy bear on the 12 previously unseen chairs with a very high success rate. The human assistance module is also shown to be very effective in changing the accessibility of the chair. Video demos and more details are available at https://chirikjianlab.github.io/putbearonchair/.

3.0 | RO | Apr 20, 2021

A Learning-Based Approach for Estimating Inertial Properties of Unknown Objects from Encoder Discrepancies

Zizhou Lao, Yuanfeng Han, Yunshan Ma et al.

Many robots utilize commercial force/torque sensors to identify inertial properties of unknown objects. However, such sensors can be difficult to apply to small-sized robots due to their weight, size, and cost. In this paper, we propose a learning-based approach for estimating the mass and center of mass (COM) of unknown objects without using force/torque sensors at the end-effector or on the joints. In our method, a robot arm carries an unknown object as it moves through multiple discrete configurations. Measurements are collected when the robot reaches each discrete configuration and stops. A neural network is designed to estimate joint torques from encoder discrepancies. Given multiple samples, we derive the closed-form relation between joint torques and the object's inertial properties. Based on the derivation, the mass and COM of the object are identified by weighted least squares. In order to improve the accuracy of inferred inertial properties, an attention model is designed to generate weights of joints, which indicate the relative importance of each joint. Our framework requires only encoder measurements without using any force/torque sensors, but still maintains accurate estimation capability. The proposed approach has been demonstrated on a 4-degree-of-freedom (DOF) robot arm.

11.6 | RO | Apr 10, 2021 | Code

Efficient Path Planning in Narrow Passages for Robots with Ellipsoidal Components

Sipu Ruan, Karen L. Poblete, Hongtao Wu et al.

Path planning has long been one of the major research areas in robotics, with PRM and RRT being two of the most effective classes of planners. Though generally very efficient, these sampling-based planners can become computationally expensive in the important case of "narrow passages". This paper develops a path planning paradigm specifically formulated for narrow passage problems. The core is based on planning for rigid-body robots encapsulated by unions of ellipsoids. Each environmental feature is represented geometrically using a strictly convex body with a $\mathcal{C}^1$ boundary (e.g., superquadric). The main benefit of doing this is that configuration-space obstacles can be parameterized explicitly in closed form, thereby allowing prior knowledge to be used to avoid sampling infeasible configurations. Then, by characterizing a tight volume bound for multiple ellipsoids, robot transitions involving rotations are guaranteed to be collision-free without needing to perform traditional collision detection. Furthermore, by combining with a stochastic sampling strategy, the proposed planning framework can be extended to solving higher dimensional problems in which the robot has a moving base and articulated appendages. Benchmark results show that the proposed framework often outperforms the sampling-based planners in terms of computational time and success rate in finding a path through narrow corridors for both single-body robots and those with higher dimensional configuration spaces. Physical experiments using the proposed framework are further demonstrated on a humanoid robot that walks in several cluttered environments with narrow passages.

10.4 | RO | Aug 9, 2020

Can I lift it? Humanoid robot reasoning about the feasibility of lifting a heavy box with unknown physical properties

Yuanfeng Han, Ruixin Li, Gregory S. Chirikjian

A robot cannot lift up an object if it is not feasible to do so. However, in most research on robot lifting, "feasibility" is usually presumed to exist a priori. This paper proposes a three-step method for a humanoid robot to reason about the feasibility of lifting a heavy box with physical properties that are unknown to the robot. Since feasibility of lifting is directly related to the physical properties of the box, we first discretize a range for the unknown values of parameters describing these properties and tabulate all valid optimal quasi-static lifting trajectories generated by simulations over all combinations of indices. Second, a physical-interaction-based algorithm is introduced to identify the robust gripping position and physical parameters corresponding to the box. During this process, the stability and safety of the robot are ensured. On the basis of the above two steps, a third step of mapping operation is carried out to best match the estimated parameters to the indices in the table. The matched indices are then queried to determine whether a valid trajectory exists. If so, the lifting motion is feasible; otherwise, the robot decides that the task is beyond its capability. Our method efficiently evaluates the feasibility of a lifting task through simple interactions between the robot and the box, while simultaneously obtaining the desired safe and stable trajectory. We successfully demonstrated the proposed method using a NAO humanoid robot.

9.4 | RO | Aug 5, 2020

Can I Pour into It? Robot Imagining Open Containability Affordance of Previously Unseen Objects via Physical Simulations

Hongtao Wu, Gregory S. Chirikjian

Open containers, i.e., containers without covers, are an important and ubiquitous class of objects in human life. In this letter, we propose a novel method for robots to "imagine" the open containability affordance of a previously unseen object via physical simulations. We implement our imagination method on a UR5 manipulator. The robot autonomously scans the object with an RGB-D camera. The scanned 3D model is used for open containability imagination which quantifies the open containability affordance by physically simulating dropping particles onto the object and counting how many particles are retained in it. This quantification is used for open-container vs. non-open-container binary classification (hereafter referred to as open container classification). If the object is classified as an open container, the robot further imagines pouring into the object, again using physical simulations, to obtain the pouring position and orientation for real robot autonomous pouring. We evaluate our method on open container classification and autonomous pouring of granular material on a dataset containing 130 previously unseen objects with 57 object categories. Although our proposed method uses only 11 objects for simulation calibration (training), its open container classification aligns well with human judgements. In addition, our method endows the robot with the capability to autonomously pour into the 55 containers in the dataset with a very high success rate. We also compare to a deep learning method. Results show that our method achieves the same performance as the deep learning method on open container classification and outperforms it on autonomous pouring. Moreover, our method is fully explainable.

10.1 | CV | Jul 21, 2020 | Code

Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry

He Chen, Pengfei Guo, Pengfei Li et al.

Epipolar constraints are at the core of feature matching and depth estimation in current multi-person multi-camera 3D human pose estimation methods. Despite the satisfactory performance of this formulation in sparser crowd scenes, its effectiveness is frequently challenged under denser crowd circumstances mainly due to two sources of ambiguity. The first is the mismatch of human joints resulting from the simple cues provided by the Euclidean distances between joints and epipolar lines. The second is the lack of robustness from the naive formulation of the problem as a least squares minimization. In this paper, we depart from the multi-person 3D pose estimation formulation, and instead reformulate it as crowd pose estimation. Our method consists of two key components: a graph model for fast cross-view matching, and a maximum a posteriori (MAP) estimator for the reconstruction of the 3D human poses. We demonstrate the effectiveness and superiority of our proposed method on four benchmark datasets.

2.2 | RO | Apr 12, 2020

A Mosquito Pick-and-Place System for PfSPZ-based Malaria Vaccine Production

Henry Phalen, Prasad Vagdargi, Mariah L. Schrum et al.

The treatment of malaria is a global health challenge that stands to benefit from the widespread introduction of a vaccine for the disease. A method has been developed to create a live organism vaccine using the sporozoites (SPZ) of the parasite Plasmodium falciparum (Pf), which are concentrated in the salivary glands of infected mosquitoes. Current manual dissection methods to obtain these PfSPZ are not optimally efficient for large-scale vaccine production. We propose an improved dissection procedure and a mechanical fixture that increases the rate of mosquito dissection and helps to deskill this stage of the production process. We further demonstrate the automation of a key step in this production process, the picking and placing of mosquitoes from a staging apparatus into a dissection assembly. This unit test of a robotic mosquito pick-and-place system is performed using a custom-designed micro-gripper attached to a four-degree-of-freedom (4-DOF) robot under the guidance of a computer vision system. Mosquitoes are autonomously grasped and pulled to a pair of notched dissection blades to remove the head of the mosquito, allowing access to the salivary glands. Placement into these blades is adapted based on output from computer vision to accommodate the unique anatomy and orientation of each grasped mosquito. In this pilot test of the system on 50 mosquitoes, we demonstrate a 100% grasping accuracy and a 90% accuracy in placing the mosquito with its neck within the blade notches such that the head can be removed. This is a promising result for this difficult and non-standard pick-and-place task.

9.2ROSep 17, 2019

Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body

Hongtao Wu, Deven Misra, Gregory S. Chirikjian

For robots to exhibit a high level of intelligence in the real world, they must be able to assess objects for which they have no prior knowledge. Therefore, it is crucial for robots to perceive object affordances by reasoning about physical interactions with the object. In this paper, we propose a novel method to provide robots with an ability to imagine object affordances using physical simulations. The class of chair is chosen here as an initial category of objects to illustrate a more general paradigm. In our method, the robot "imagines" the affordance of an arbitrarily oriented object as a chair by simulating a physical sitting interaction between an articulated human body and the object. This object affordance reasoning is used as a cue for object classification (chair vs non-chair). Moreover, if an object is classified as a chair, the affordance reasoning can also predict the upright pose of the object which allows the sitting interaction to take place. We call this type of pose the functional pose. We demonstrate our method in chair classification on synthetic 3D CAD models. Although our method uses only 30 models for training, it outperforms appearance-based deep learning methods, which require a large amount of training data, when the upright orientation is not assumed to be known a priori. In addition, we showcase that the functional pose predictions of our method align well with human judgments on both synthetic models and real objects scanned by a depth camera.

2.6CVApr 30, 2019

Curvature: A signature for Action Recognition in Video Sequences

He Chen, Gregory S. Chirikjian

In this paper, a novel signature of human action recognition, namely the curvature of a video sequence, is introduced. In this way, the distribution of sequential data is modeled, which enables few-shot learning. Instead of depending on recognizing features within images, our algorithm views actions as sequences on the universal time scale across a whole sequence of images. The video sequence, viewed as a curve in pixel space, is aligned by reparameterization using the arclength of the curve in pixel space. Once such curvatures are obtained, statistical indexes are extracted and fed into a learning-based classifier. Overall, our method is simple but powerful. Preliminary experimental results show that our method is effective and achieves state-of-the-art performance in video-based human action recognition. Moreover, we see latent capacity in transferring this idea into other sequence-based recognition applications such as speech recognition, machine translation, and text generation.
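
The idea of treating a video as a curve in pixel space, reparameterizing it by arclength, and then measuring curvature can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function name `sequence_curvature`, the linear interpolation, the finite-difference curvature estimate, and the `n_samples` parameter are all choices made for the example, and consecutive frames are assumed to differ so that arclength is strictly increasing.

```python
import numpy as np


def sequence_curvature(frames, n_samples=50):
    """Discrete curvature of a frame sequence viewed as a curve in pixel space.

    frames: array-like of shape (T, ...); each frame is flattened to a vector.
    Returns n_samples - 2 curvature values along the arclength-resampled curve.
    """
    X = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    seg = np.linalg.norm(np.diff(X, axis=0), axis=1)   # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])        # cumulative arclength
    s_uniform = np.linspace(0.0, s[-1], n_samples)     # uniform arclength grid

    # Linear interpolation of the curve at the uniform arclength values.
    idx = np.clip(np.searchsorted(s, s_uniform, side="right") - 1, 0, len(s) - 2)
    w = (s_uniform - s[idx]) / seg[idx]
    Y = X[idx] + w[:, None] * (X[idx + 1] - X[idx])

    # For a unit-speed curve, curvature is the norm of the second derivative,
    # approximated here by a central second difference.
    h = s_uniform[1] - s_uniform[0]
    second = (Y[2:] - 2.0 * Y[1:-1] + Y[:-2]) / h**2
    return np.linalg.norm(second, axis=1)


# Toy check on a half circle of radius 2 (true curvature 0.5), using 2D
# points as stand-ins for flattened frames.
theta = np.linspace(0.0, np.pi, 2000)
circle = np.stack([2.0 * np.cos(theta), 2.0 * np.sin(theta)], axis=1)
kappa = sequence_curvature(circle, n_samples=50)
```

Summary statistics of `kappa` (for example its mean and standard deviation) would then play the role of the statistical indexes fed to a classifier; which statistics the paper uses is not stated in the abstract.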

0.9CVMar 21, 2019

Quotienting Impertinent Camera Kinematics for 3D Video Stabilization

Thomas W. Mitchel, Christian Wuelker, Jin Seob Kim et al.

With the recent advent of methods that allow for real-time computation, dense 3D flows have become a viable basis for fast camera motion estimation. Most importantly, dense flows are more robust than the sparse feature matching techniques used by existing 3D stabilization methods, able to better handle large camera displacements and occlusions similar to those often found in consumer videos. Here we introduce a framework for 3D video stabilization that relies on dense scene flow alone. The foundation of this approach is a novel camera motion model that allows for real-world camera poses to be recovered directly from 3D motion fields. Moreover, this model can be extended to describe certain types of non-rigid artifacts that are commonly found in videos, such as those resulting from zooms. This framework gives rise to several robust regimes that produce high-quality stabilization of the kind achieved by prior full 3D methods while avoiding the fragility typically present in feature-based approaches. As an added benefit, our framework is fast: the simplicity of our motion model and efficient flow calculations combine to enable stabilization at a high frame rate.

2.3QMMar 5, 2019

An Efficient Production Process for Extracting Salivary Glands from Mosquitoes

Mariah Schrum, Amanda Canezin, Sumana Chakravarty et al.

Malaria is one of the leading causes of morbidity and mortality in many developing countries. The development of a highly effective and readily deployable vaccine represents a major goal for world health. There has been recent progress in developing a clinically effective vaccine manufactured using Plasmodium falciparum sporozoites (PfSPZ) extracted from the salivary glands of Anopheles sp. mosquitoes. The harvesting of PfSPZ requires dissection of the mosquito and manual removal of the salivary glands from each mosquito by trained technicians. While PfSPZ-based vaccines have shown highly promising results, the process of dissection of salivary glands is tedious and labor intensive. We propose a mechanical device that will greatly increase the rate of mosquito dissection and deskill the process to make malaria vaccines more affordable and more readily available. This device consists of several components: a sorting stage in which the mosquitoes are sorted into slots, a cutting stage in which the heads are removed, and a squeezing stage in which the salivary glands are extracted and collected. This method allows mosquitoes to be dissected twenty at a time instead of one by one as previously done and significantly reduces the dissection time per mosquito.

1.7CVJul 18, 2018

Signal Alignment for Humanoid Skeletons via the Globally Optimal Reparameterization Algorithm

Thomas W. Mitchel, Sipu Ruan, Gregory S. Chirikjian

The general ability to analyze and classify the 3D kinematics of the human form is an essential step in the development of socially adept humanoid robots. A variety of signal types, such as RGB videos, infrared maps, and optical flow, can be used by machines to represent and characterize actions. In particular, skeleton sequences provide a natural 3D kinematic description of human motions and can be acquired in real time using RGB+D cameras. Moreover, skeleton sequences are generalizable to characterize the motions of both humans and humanoid robots. The Globally Optimal Reparameterization Algorithm (GORA) is a novel, recently proposed algorithm for signal alignment in which signals are reparameterized to a globally optimal universal standard timescale (UST). Here, we introduce a variant of GORA for humanoid action recognition with skeleton sequences, which we call GORA-S. We briefly review the algorithm's mathematical foundations and contextualize them in the problem of action recognition with skeleton sequences. Subsequently, we introduce GORA-S and discuss parameters and numerical techniques for its effective implementation. We then compare its performance with that of the DTW and FastDTW algorithms, in terms of computational efficiency and accuracy in matching skeletons. Our results show that GORA-S attains a complexity that is significantly less than that of any tested DTW method. In addition, it displays a favorable balance between speed and accuracy that remains invariant under changes in skeleton sampling frequency, lending it a degree of versatility that could make it well-suited for a variety of action recognition tasks.

1.7CVJul 15, 2018

The Globally Optimal Reparameterization Algorithm: an Alternative to Fast Dynamic Time Warping for Action Recognition in Video Sequences

Thomas Mitchel, Sipu Ruan, Yixin Gao et al.

Signal alignment has become a popular problem in robotics due in part to its fundamental role in action recognition. Currently, the most successful algorithms for signal alignment are Dynamic Time Warping (DTW) and its variant 'Fast' Dynamic Time Warping (FastDTW). Here we introduce a new framework for signal alignment, namely the Globally Optimal Reparameterization Algorithm (GORA). We review the algorithm's mathematical foundation and provide a numerical verification of its theoretical basis. We compare the performance of GORA with that of the DTW and FastDTW algorithms, in terms of computational efficiency and accuracy in matching signals. Our results show a significant improvement in both speed and accuracy over the DTW and FastDTW algorithms and suggest that GORA has the potential to provide a highly effective framework for signal alignment and action recognition.
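
For context on the baseline named in the abstract, classic Dynamic Time Warping can be sketched with the textbook dynamic program below. This is the standard O(nm) formulation for 1-D signals with an absolute-difference cost, written as an assumption-laden illustration; it is neither GORA nor the specific DTW or FastDTW implementations benchmarked in the paper, and the name `dtw_distance` is chosen for the example.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D signals (absolute-difference cost)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # D[i, j] = minimal cumulative cost of aligning a[:i] with b[:j].
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]


# A signal and a time-stretched copy of it align at zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

The nested loops make the quadratic cost in signal length explicit, which is the computational burden that faster alignment methods aim to reduce.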

2.1ROOct 8, 2016

Proceedings of the 1st International Workshop on Robot Learning and Planning (RLP 2016)

Nancy Amato, Charles Anderson, Gregory Chirikjian et al.
