Jingtai Liu

h-index3

4papers

20citations

Novelty52%

AI Score37

Ranked #112,687 of 201,326 authors (top 56%)#36,317 in CV (top 62%)

4 Papers

CVSep 22, 2022Code

MGTR: End-to-End Mutual Gaze Detection with Transformer

Hang Guo, Zhengxi Hu, Jingtai Liu

People's looking at each other or mutual gaze is ubiquitous in our daily interactions, and detecting mutual gaze is of great significance for understanding human social scenes. Current mutual gaze detection methods focus on two-stage methods, whose inference speed is limited by the two-stage pipeline and the performance in the second stage is affected by the first one. In this paper, we propose a novel one-stage mutual gaze detection framework called Mutual Gaze TRansformer or MGTR to perform mutual gaze detection in an end-to-end manner. By designing mutual gaze instance triples, MGTR can detect each human head bounding box and simultaneously infer mutual gaze relationship based on global image information, which streamlines the whole process with simplicity. Experimental results on two mutual gaze datasets show that our method is able to accelerate mutual gaze detection process without losing performance. Ablation study shows that different components of MGTR can capture different levels of semantic information in images. Code is available at https://github.com/Gmbition/MGTR

SYOct 30, 2016

Impedance control of a cable-driven series elastic actuator with the 2-DOF control structure

Wulin Zou, Zhuo Yang, Wen Tan et al.

Series elastic actuators (SEAs) are growingly important in physical human-robot interaction (HRI) due to their inherent safety and compliance. Cable-driven SEAs also allow flexible installation and remote torque transmission, etc. However, there are still challenges for the impedance control of cable-driven SEAs, such as the reduced bandwidth caused by the elastic component, and the performance balance between reference tracking and robustness. In this paper, a velocity sourced cable-driven SEA has been set up. Then, a stabilizing 2 degrees of freedom (2-DOF) control approach was designed to separately pursue the goals of robustness and torque tracking. Further, the impedance control structure for human-robot interaction was designed and implemented with a torque compensator. Both simulation and practical experiments have validated the efficacy of the 2-DOF method for the control of cable-driven SEAs.

CVJul 9, 2025Code

MK-Pose: Category-Level Object Pose Estimation via Multimodal-Based Keypoint Learning

Yifan Yang, Peili Song, Enfan Lan et al.

Category-level object pose estimation, which predicts the pose of objects within a known category without prior knowledge of individual instances, is essential in applications like warehouse automation and manufacturing. Existing methods relying on RGB images or point cloud data often struggle with object occlusion and generalization across different instances and categories. This paper proposes a multimodal-based keypoint learning framework (MK-Pose) that integrates RGB images, point clouds, and category-level textual descriptions. The model uses a self-supervised keypoint detection module enhanced with attention-based query generation, soft heatmap matching and graph-based relational modeling. Additionally, a graph-enhanced feature fusion module is designed to integrate local geometric information and global context. MK-Pose is evaluated on CAMERA25 and REAL275 dataset, and is further tested for cross-dataset capability on HouseCat6D dataset. The results demonstrate that MK-Pose outperforms existing state-of-the-art methods in both IoU and average precision without shape priors. Codes will be released at \href{https://github.com/yangyifanYYF/MK-Pose}{https://github.com/yangyifanYYF/MK-Pose}.

CVMar 11, 2025

Simulating Automotive Radar with Lidar and Camera Inputs

Peili Song, Dezhen Song, Yifan Yang et al.

Low-cost millimeter automotive radar has received more and more attention due to its ability to handle adverse weather and lighting conditions in autonomous driving. However, the lack of quality datasets hinders research and development. We report a new method that is able to simulate 4D millimeter wave radar signals including pitch, yaw, range, and Doppler velocity along with radar signal strength (RSS) using camera image, light detection and ranging (lidar) point cloud, and ego-velocity. The method is based on two new neural networks: 1) DIS-Net, which estimates the spatial distribution and number of radar signals, and 2) RSS-Net, which predicts the RSS of the signal based on appearance and geometric information. We have implemented and tested our method using open datasets from 3 different models of commercial automotive radar. The experimental results show that our method can successfully generate high-fidelity radar signals. Moreover, we have trained a popular object detection neural network with data augmented by our synthesized radar. The network outperforms the counterpart trained only on raw radar data, a promising result to facilitate future radar-based research and development.