Feng Jin

LG
h-index: 11
8 papers
527 citations
Novelty: 32%
AI Score: 42

8 Papers

99.0 DC May 13
MultiPath Memory Access: Breaking Host-GPU Bandwidth Bottlenecks in LLM Services

Lingfeng Tang, Daoping Zhang, Junjie Chen et al.

Host-GPU data movement has become a latency-critical bottleneck in LLM serving, surfacing in common paths such as model-weight movement and KV cache offload/fetch. Today, each host-GPU copy is effectively confined to the PCIe path of the target GPU, even though modern multi-GPU servers contain additional PCIe links on peer GPUs and high-bandwidth GPU interconnects. This leaves substantial intra-server I/O capacity unused. To address this issue, we present Multipath Memory Access (MMA), a software-defined multipath memory access system for host-GPU data transfer. To the best of our knowledge, MMA is the first software-defined system to enable efficient multipath host-GPU data transfer within a single multi-GPU server. MMA expands a single host-GPU copy across available direct and relay paths without hardware, driver, or application changes. It preserves CUDA stream semantics with a dependency-preserving Dummy Task, coordinates distributed micro-transfer completion through a lightweight synchronization mechanism, and uses queue backpressure to route traffic without explicit link-state feedback. On an 8-GPU NVIDIA H20 server, MMA achieves 245 GB/s peak host-to-GPU bandwidth, a 4.62x improvement over native CUDA copies, and reduces TTFT for KV cache fetching by 1.14-2.38x and model wake-up/switching latency by 1.12-2.48x.

RO Jun 29, 2025
Benchmarking Generalizable Bimanual Manipulation: RoboTwin Dual-Arm Collaboration Challenge at CVPR 2025 MEIS Workshop

Tianxing Chen, Kaixuan Wang, Zhaohui Yang et al.

Embodied Artificial Intelligence (Embodied AI) is an emerging frontier in robotics, driven by the need for autonomous systems that can perceive, reason, and act in complex physical environments. While single-arm systems have shown strong task performance, collaborative dual-arm systems are essential for handling more intricate tasks involving rigid, deformable, and tactile-sensitive objects. To advance this goal, we launched the RoboTwin Dual-Arm Collaboration Challenge at the 2nd MEIS Workshop, CVPR 2025. Built on the RoboTwin Simulation platform (1.0 and 2.0) and the AgileX COBOT-Magic Robot platform, the competition consisted of three stages: Simulation Round 1, Simulation Round 2, and a final Real-World Round. Participants tackled a total of 17 dual-arm manipulation tasks, covering rigid, deformable, and tactile-based scenarios. The challenge attracted 64 global teams and over 400 participants, producing top-performing solutions like SEM and AnchorDP3 and generating valuable insights into generalizable bimanual policy learning. This report outlines the competition setup, task design, evaluation methodology, key findings, and future directions, aiming to support future research on robust and generalizable bimanual manipulation policies. The Challenge Webpage is available at https://robotwin-benchmark.github.io/cvpr-2025-challenge/.

LG Mar 5, 2020
mmFall: Fall Detection using 4D MmWave Radar and a Hybrid Variational RNN AutoEncoder

Feng Jin, Arindam Sengupta, Siyang Cao

In this paper we propose mmFall - a novel fall detection system, which comprises (i) the emerging millimeter-wave (mmWave) radar sensor to collect the human body's point cloud along with the body centroid, and (ii) a variational recurrent autoencoder (VRAE) to compute the anomaly level of the body motion based on the acquired point cloud. A fall is claimed to have occurred when the spike in anomaly level and the drop in centroid height occur simultaneously. The mmWave radar sensor provides several advantages, such as privacy compliance and high sensitivity to motion, over the traditional sensing modalities. However, (i) randomness in radar point cloud data and (ii) difficulties in fall collection/labeling in the traditional supervised fall detection approaches are the two main challenges. To overcome the randomness in radar data, the proposed VRAE uses variational inference, a probabilistic approach rather than the traditional deterministic approach, to infer the posterior probability of the body's latent motion state at each frame, followed by a recurrent neural network (RNN) to learn the temporal features of the motion over multiple frames. Moreover, to circumvent the difficulties in fall data collection/labeling, the VRAE is built upon an autoencoder architecture in a semi-supervised approach, and trained on only normal activities of daily living (ADL) such that in the inference stage the VRAE will generate a spike in the anomaly level once an abnormal motion, such as a fall, occurs. During the experiment, we implemented the VRAE along with two other baselines, and tested on the dataset collected in an apartment. The receiver operating characteristic (ROC) curve indicates that our proposed model outperforms the other two baselines, and achieves 98% detection out of 50 falls at the expense of just 2 false alarms.
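
The decision rule stated in the abstract above (an anomaly spike coinciding with a drop in centroid height) might be sketched as follows. This is only an illustrative sketch: the function name `detect_falls`, the two thresholds, and the `window` parameter are hypothetical and are not taken from the paper, which does not give these details in its abstract.

```python
from typing import List, Sequence


def detect_falls(
    anomaly: Sequence[float],
    height: Sequence[float],
    anomaly_threshold: float,  # hypothetical: level that counts as a "spike"
    drop_threshold: float,     # hypothetical: height loss that counts as a "drop"
    window: int = 1,           # hypothetical: frames over which the drop is measured
) -> List[int]:
    """Return frame indices where an anomaly spike and a height drop coincide."""
    if len(anomaly) != len(height):
        raise ValueError("anomaly and height must have the same number of frames")
    falls = []
    for t in range(window, len(anomaly)):
        spike = anomaly[t] >= anomaly_threshold
        # Drop is measured against the centroid height `window` frames earlier.
        drop = (height[t - window] - height[t]) >= drop_threshold
        if spike and drop:
            falls.append(t)
    return falls


if __name__ == "__main__":
    anomaly_levels = [0.1, 0.2, 0.1, 5.0, 0.3]
    centroid_heights = [1.0, 1.0, 1.0, 0.3, 0.3]
    print(detect_falls(anomaly_levels, centroid_heights, 1.0, 0.5))
```

Requiring both conditions at the same frame is what keeps a spike alone (for example, an unusual but upright motion) from being reported as a fall.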

SP Nov 21, 2019
mm-Pose: Real-Time Human Skeletal Posture Estimation using mmWave Radars and CNNs

Arindam Sengupta, Feng Jin, Renyuan Zhang et al.

In this paper, mm-Pose, a novel approach to detect and track human skeletons in real-time using an mmWave radar, is proposed. To the best of the authors' knowledge, this is the first method to detect >15 distinct skeletal joints using mmWave radar reflection signals. The proposed method would find several applications in traffic monitoring systems, autonomous vehicles, patient monitoring systems and defense forces to detect and track human skeletons for effective and preventive decision making in real-time. The use of radar makes the system operationally robust to scene lighting and adverse weather conditions. The reflected radar point cloud in range, azimuth and elevation is first resolved and projected onto the Range-Azimuth and Range-Elevation planes. A novel low-size high-resolution radar-to-image representation is also presented that overcomes the sparsity in traditional point cloud data and offers significant reduction in the subsequent machine learning architecture. The RGB channels were assigned the normalized values of range, elevation/azimuth and the power level of the reflection signals for each of the points. A forked CNN architecture was used to predict the real-world position of the skeletal joints in 3-D space, using the radar-to-image representation. The proposed method was tested for a single human scenario for four primary motions, (i) Walking, (ii) Swinging left arm, (iii) Swinging right arm, and (iv) Swinging both arms, to validate accurate predictions for motion in range, azimuth and elevation. The detailed methodology, implementation, challenges, and validation results are presented.

SP Nov 14, 2019
MmWave Radar Point Cloud Segmentation using GMM in Multimodal Traffic Monitoring

Feng Jin, Arindam Sengupta, Siyang Cao et al.

In multimodal traffic monitoring, we gather traffic statistics for distinct transportation modes, such as pedestrians, cars and bicycles, in order to analyze and improve people's daily mobility in terms of safety and convenience. On account of its robustness to bad light and adverse weather conditions, and inherent speed measurement ability, the radar sensor is a suitable option for this application. However, the sparse radar data from conventional commercial radars make transportation mode classification extremely challenging. Thus, we propose to use a high-resolution millimeter-wave (mmWave) radar sensor to obtain a relatively richer radar point cloud representation for a traffic monitoring scenario. Based on a new feature vector, we use the multivariate Gaussian mixture model (GMM) to do the radar point cloud segmentation, i.e. 'point-wise' classification, in an unsupervised learning environment. In our experiment, we collected radar point clouds for pedestrians and cars, which also contained the inevitable clutter from the surroundings. The experimental results using GMM on the new feature vector demonstrated a good segmentation performance in terms of the intersection-over-union (IoU) metrics. The detailed methodology and validation metrics are presented and discussed.

SP Nov 14, 2019
Multiple Patients Behavior Detection in Real-time using mmWave Radar and Deep CNNs

Feng Jin, Renyuan Zhang, Arindam Sengupta et al.

To address potential gaps noted in patient monitoring in the hospital, a novel patient behavior detection system using mmWave radar and a deep convolutional neural network (CNN), which supports the simultaneous recognition of multiple patients' behaviors in real-time, is proposed. In this study, we use an mmWave radar to track multiple patients and detect the scattering point cloud of each one. For each patient, the Doppler pattern of the point cloud over a time period is collected as the behavior signature. A three-layer CNN model is created to classify the behavior for each patient. The tracking and point cloud detection algorithm was also implemented on an mmWave radar hardware platform with an embedded graphics processing unit (GPU) board to collect the Doppler pattern and run the CNN model. A training dataset of six types of behavior was collected, over a long duration, to train the model using the Adam optimizer to minimize a cross-entropy loss function. Lastly, the system was tested for real-time operation and obtained a very good inference accuracy when predicting each patient's behavior in a two-patient scenario.

LG Oct 17, 2018
Deep Diabetologist: Learning to Prescribe Hyperglycemia Medications with Hierarchical Recurrent Neural Networks

Jing Mei, Shiwan Zhao, Feng Jin et al.

In healthcare, applying deep learning models to electronic health records (EHRs) has drawn considerable attention. EHR data consist of a sequence of medical visits, i.e. a multivariate time series of diagnoses, medications, physical examinations, lab tests, etc. This sequential nature makes EHR data well suited to Recurrent Neural Networks (RNNs). In this paper, we propose "Deep Diabetologist" - using RNNs for EHR sequential data modelling, to provide personalized hyperglycemia medication prediction for diabetic patients. Particularly, we develop a hierarchical RNN to capture the heterogeneous sequential information in the EHR data. Our experimental results demonstrate improved performance compared with a baseline classifier using logistic regression. Moreover, hierarchical RNN models outperform basic ones, providing deeper data insights for clinical decision support.

LG Dec 24, 2017
Neural Network Multitask Learning for Traffic Flow Forecasting

Feng Jin, Shiliang Sun

Traditional neural network approaches for traffic flow forecasting are usually single task learning (STL) models, which do not take advantage of the information provided by related tasks. In contrast to STL, multitask learning (MTL) has the potential to improve generalization by transferring information in training signals of extra tasks. In this paper, MTL-based neural networks are used for traffic flow forecasting. For neural network MTL, a backpropagation (BP) network is constructed by incorporating traffic flows at several contiguous time instants into an output layer. Nodes in the output layer can be seen as outputs of different but closely related STL tasks. Comprehensive experiments on urban vehicular traffic flow data and comparisons with STL show that MTL in BP neural networks is a promising and effective approach for traffic flow forecasting.
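
One way to picture the multi-output layer described in the abstract above is through the training targets: each sample pairs a window of past flows with a vector of flows at several contiguous future instants, one per output node. The sketch below only illustrates that data layout, not the paper's network or preprocessing; the function name `make_mtl_samples` and the parameters `n_inputs` and `n_tasks` are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_mtl_samples(
    flows: Sequence[float],
    n_inputs: int,  # hypothetical: number of past flow values fed to the network
    n_tasks: int,   # hypothetical: number of contiguous future instants predicted
) -> List[Tuple[List[float], List[float]]]:
    """Slide over a flow series and build (input window, multi-output target) pairs."""
    samples = []
    last_start = len(flows) - n_inputs - n_tasks
    for start in range(last_start + 1):
        x = list(flows[start:start + n_inputs])
        # Each target element corresponds to one output node, i.e. one related task.
        y = list(flows[start + n_inputs:start + n_inputs + n_tasks])
        samples.append((x, y))
    return samples


if __name__ == "__main__":
    series = [10, 12, 15, 11, 9, 14]
    for x, y in make_mtl_samples(series, n_inputs=3, n_tasks=2):
        print(x, "->", y)
```

Setting `n_tasks=1` reduces this layout to the single-task case, where each sample has exactly one target value.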