Donghai Li

CV
4 papers
55 citations
Novelty 42%
AI Score 25

4 Papers

CV Mar 28, 2022
Uni6D: A Unified CNN Framework without Projection Breakdown for 6D Pose Estimation

Xiaoke Jiang, Donghai Li, Hao Chen et al.

As RGB-D sensors become more affordable, using RGB-D images for high-accuracy 6D pose estimation becomes a better option. State-of-the-art approaches typically use separate backbones to extract features from RGB and depth images: a 2D CNN for RGB images, a per-pixel point cloud network for depth data, and a fusion network to combine the features. We find that the essential reason for using two independent backbones is the "projection breakdown" problem. In the depth image plane, the projected 3D structure of the physical world is preserved by the 1D depth value and its built-in 2D pixel coordinate (UV). Any spatial transformation that modifies UV, such as resize, flip, crop, or pooling operations in the CNN pipeline, breaks the binding between the pixel value and its UV coordinate. As a consequence, the 3D structure is no longer preserved by a modified depth image or feature. To address this issue, we propose a simple yet effective method, denoted Uni6D, that explicitly takes the extra UV data along with RGB-D images as input. Our method is a unified CNN framework for 6D pose estimation with a single CNN backbone. In particular, the architecture is based on Mask R-CNN with two extra heads: an RT head that directly predicts the 6D pose, and an abc head that serves as an auxiliary module, guiding the network to map the visible points to their coordinates in the 3D model. This end-to-end approach balances simplicity and accuracy, achieving accuracy comparable to state-of-the-art methods with 7.2x faster inference on the YCB-Video dataset.
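The "projection breakdown" described in this abstract can be illustrated with a minimal sketch, assuming a standard pinhole camera model. The intrinsics, image size, and helper names (`backproject`, `crop`, `make_uv_map`) are hypothetical and chosen for illustration; this is not the Uni6D implementation. It shows that after a crop the array index of a pixel no longer equals its original UV coordinate, so back-projecting from the index gives the wrong 3D point, while carrying a UV map through the same crop recovers the correct one.

```python
# Hypothetical pinhole intrinsics (focal lengths and principal point), for illustration only.
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

def backproject(u, v, depth):
    """Map pixel coordinate (u, v) and its depth to a 3D point in the camera frame."""
    x = (u - CX) * depth / FX
    y = (v - CY) * depth / FY
    return (x, y, depth)

def crop(image, top, left, height, width):
    """Crop a 2D list-of-rows image; indices in the result restart at (0, 0)."""
    return [row[left:left + width] for row in image[top:top + height]]

def make_uv_map(height, width):
    """Per-pixel map storing each pixel's original (u, v) coordinate."""
    return [[(u, v) for u in range(width)] for v in range(height)]

H, W = 480, 640
depth_image = [[2.0] * W for _ in range(H)]   # flat surface 2 m from the camera
uv_map = make_uv_map(H, W)

# Reference: the 3D point seen at original pixel (u=420, v=340).
true_point = backproject(420, 340, depth_image[340][420])

# Crop depth and UV map with the same window.
top, left = 300, 400
depth_crop = crop(depth_image, top, left, 100, 100)
uv_crop = crop(uv_map, top, left, 100, 100)

# The same physical pixel now sits at row 40, column 20 of the crop.
row, col = 340 - top, 420 - left

# Naive: treat the array index as UV. The binding to the original UV is lost.
naive_point = backproject(col, row, depth_crop[row][col])

# UV-aware: read the original coordinate from the cropped UV map.
u_orig, v_orig = uv_crop[row][col]
uv_aware_point = backproject(u_orig, v_orig, depth_crop[row][col])
```

Here `uv_aware_point` matches `true_point`, while `naive_point` lands elsewhere in the camera frame, which is the motivation the abstract gives for feeding UV data into the network alongside RGB-D.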

CV Oct 20, 2022
Geo6D: Geometric Constraints Learning for 6D Pose Estimation

Jianqiu Chen, Mingshan Sun, Ye Zheng et al.

Numerous 6D pose estimation methods have been proposed that employ end-to-end regression to directly estimate the target pose parameters. Since the visible features of objects are implicitly influenced by their poses, the network can infer the pose by analyzing feature differences in the visible region. However, due to the unpredictable and unrestricted range of pose variations, the implicitly learned constraints between visible features and pose are insufficiently covered by the training samples, making the network vulnerable to unseen object poses. To tackle these challenges, we propose a novel geometric constraints learning approach called Geo6D for direct regression 6D pose estimation methods. It introduces a pose transformation formula expressed in relative offset representation, which is leveraged as a geometric constraint to reconstruct the input and output targets of the network. The reconstructed data enable the network to estimate the pose based on explicit geometric constraints, and the relative offset representation mitigates the pose distribution gap. Extensive experimental results show that, when equipped with Geo6D, direct 6D methods achieve state-of-the-art performance on multiple datasets and remain effective even with only 10% of the data.

CL Jun 20, 2024
Overview of the CAIL 2023 Argument Mining Track

Jingcong Liang, Junlong Wang, Xinyu Zhai et al.

We give a detailed overview of the CAIL 2023 Argument Mining Track, one of the Chinese AI and Law Challenge (CAIL) 2023 tracks. The main goal of the track is to identify and extract interacting argument pairs in trial dialogs. It mainly uses summarized judgment documents but can also refer to trial recordings. The track consists of two stages, and we introduce the tasks designed for each stage; we also extend the data from previous events into a new dataset -- CAIL2023-ArgMine -- with annotated new cases from various causes of action. We outline several submissions that achieve the best results, including their methods for different stages. While all submissions rely on language models, they have incorporated strategies that may benefit future work in this field.

SY Apr 16, 2019
Fractional order [PI] Controller and Smith-like Predictor Design for A Class of High Order Systems

Zhenlong Wu, Jie Yuan, Yuquan Chen et al.

To handle the control difficulties caused by high-order dynamics, this paper proposes a control structure based on a fractional order [proportional integral] (PI) controller and a fractional order Smith-like predictor for a class of high order systems of the form K/(Ts+1)^n. The tracking and disturbance rejection are analyzed using the terminal value theorem, showing that the proposed control structure ensures that the closed-loop system converges to the set point without static error and recovers to its original state when an input disturbance occurs. Then, simulations of the influence of different orders on the control performance and control signal are carried out based on a multi-objective genetic algorithm (MO-GA). The results show that the control performance can be improved and the energy of the control signal reduced simultaneously when the order is chosen to be no more than one. This verifies that the fractional order Smith-like predictor has an advantage over the integer order Smith-like predictor.
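To make the plant class K/(Ts+1)^n concrete, the following is a minimal open-loop sketch, assuming the plant is simulated as a cascade of n identical first-order lags with forward Euler integration. The function name `step_response` and all parameter values are hypothetical. It does not implement the paper's fractional order controller or Smith-like predictor; it only illustrates why higher n makes the response more sluggish, and that the steady-state output for a unit step equals K, as the terminal value theorem predicts for this transfer function.

```python
def step_response(K, T, n, t_end, dt=0.01):
    """Unit-step response of K/(Ts+1)^n, simulated as n cascaded first-order lags.

    Each stage obeys T * dx/dt = input - x; the gain K is applied to the step input.
    Forward Euler integration is used, so dt should be much smaller than T.
    """
    states = [0.0] * n
    outputs = []
    for _ in range(int(round(t_end / dt))):
        upstream = K * 1.0               # scaled unit step drives the first stage
        new_states = []
        for x in states:
            new_states.append(x + dt * (upstream - x) / T)
            upstream = x                 # next stage sees this stage's previous value
        states = new_states
        outputs.append(states[-1])
    return outputs

# Illustrative values, not taken from the paper.
K, T, dt = 2.0, 1.0, 0.01
first_order = step_response(K, T, n=1, t_end=40.0, dt=dt)
fourth_order = step_response(K, T, n=4, t_end=40.0, dt=dt)

index_at_2s = int(round(2.0 / dt)) - 1
early_first = first_order[index_at_2s]    # n = 1 has risen most of the way by t = 2
early_fourth = fourth_order[index_at_2s]  # n = 4 has barely started to rise
final_fourth = fourth_order[-1]           # settles at the static gain K
```

Comparing `early_first` with `early_fourth` shows the lag that accumulates with order, which is the control difficulty the abstract refers to.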