ROJul 3, 2025
MISCGrasp: Leveraging Multiple Integrated Scales and Contrastive Learning for Enhanced Volumetric GraspingQingyu Fan, Yinghao Cai, Chao Li et al.
Robotic grasping faces challenges in adapting to objects with varying shapes and sizes. In this paper, we introduce MISCGrasp, a volumetric grasping method that integrates multi-scale feature extraction with contrastive feature enhancement for self-adaptive grasping. We propose a query-based interaction between high-level and low-level features through the Insight Transformer, while the Empower Transformer selectively attends to the highest-level features, which synergistically strikes a balance between focusing on fine geometric details and overall geometric structures. Furthermore, MISCGrasp utilizes multi-scale contrastive learning to exploit similarities among positive grasp samples, ensuring consistency across multi-scale features. Extensive experiments in both simulated and real-world environments demonstrate that MISCGrasp outperforms baseline and variant methods in tabletop decluttering tasks. More details are available at https://miscgrasp.github.io/.
ROMar 5, 2025
NeuGrasp: Generalizable Neural Surface Reconstruction with Background Priors for Material-Agnostic Object Grasp DetectionQingyu Fan, Yinghao Cai, Chao Li et al.
Robotic grasping in scenes with transparent and specular objects presents great challenges for methods relying on accurate depth information. In this paper, we introduce NeuGrasp, a neural surface reconstruction method that leverages background priors for material-agnostic grasp detection. NeuGrasp integrates transformers and global prior volumes to aggregate multi-view features with spatial encoding, enabling robust surface reconstruction in narrow and sparse viewing conditions. By focusing on foreground objects through residual feature enhancement and refining spatial perception with an occupancy-prior volume, NeuGrasp excels in handling objects with transparent and specular surfaces. Extensive experiments in both simulated and real-world scenarios show that NeuGrasp outperforms state-of-the-art methods in grasping while maintaining comparable reconstruction quality. More details are available at https://neugrasp.github.io/.
FLU-DYNJan 21, 2024
Asynchronous Parallel Reinforcement Learning for Optimizing Propulsive Performance in Fin Ray ControlXin-Yang Liu, Dariush Bodaghi, Qian Xue et al.
Fish fin rays constitute a sophisticated control system for ray-finned fish, facilitating versatile locomotion within complex fluid environments. Despite extensive research on the kinematics and hydrodynamics of fish locomotion, the intricate control strategies in fin-ray actuation remain largely unexplored. While deep reinforcement learning (DRL) has demonstrated potential in managing complex nonlinear dynamics; its trial-and-error nature limits its application to problems involving computationally demanding environmental interactions. This study introduces a cutting-edge off-policy DRL algorithm, interacting with a fluid-structure interaction (FSI) environment to acquire intricate fin-ray control strategies tailored for various propulsive performance objectives. To enhance training efficiency and enable scalable parallelism, an innovative asynchronous parallel training (APT) strategy is proposed, which fully decouples FSI environment interactions and policy/value network optimization. The results demonstrated the success of the proposed method in discovering optimal complex policies for fin-ray actuation control, resulting in a superior propulsive performance compared to the optimal sinusoidal actuation function identified through a parametric grid search. The merit and effectiveness of the APT approach are also showcased through comprehensive comparison with conventional DRL training strategies in numerical experiments of controlling nonlinear dynamics.
IRFeb 28, 2020
Jointly Learning to Recommend and AdvertiseXiangyu Zhao, Xudong Zheng, Xiwang Yang et al.
Online recommendation and advertising are two major income channels for online recommendation platforms (e.g. e-commerce and news feed site). However, most platforms optimize recommending and advertising strategies by different teams separately via different techniques, which may lead to suboptimal overall performances. To this end, in this paper, we propose a novel two-level reinforcement learning framework to jointly optimize the recommending and advertising strategies, where the first level generates a list of recommendations to optimize user experience in the long run; then the second level inserts ads into the recommendation list that can balance the immediate advertising revenue from advertisers and the negative influence of ads on long-term user experience. To be specific, the first level tackles high combinatorial action space problem that selects a subset items from the large item space; while the second level determines three internally related tasks, i.e., (i) whether to insert an ad, and if yes, (ii) the optimal ad and (iii) the optimal location to insert. The experimental results based on real-world data demonstrate the effectiveness of the proposed framework. We have released the implementation code to ease reproductivity.
IRFeb 26, 2020
AutoEmb: Automated Embedding Dimensionality Search in Streaming RecommendationsXiangyu Zhao, Chong Wang, Ming Chen et al.
Deep learning based recommender systems (DLRSs) often have embedding layers, which are utilized to lessen the dimensionality of categorical variables (e.g. user/item identifiers) and meaningfully transform them in the low-dimensional space. The majority of existing DLRSs empirically pre-define a fixed and unified dimension for all user/item embeddings. It is evident from recent researches that different embedding sizes are highly desired for different users/items according to their popularity. However, manually selecting embedding sizes in recommender systems can be very challenging due to the large number of users/items and the dynamic nature of their popularity. Thus, in this paper, we propose an AutoML based end-to-end framework (AutoEmb), which can enable various embedding dimensions according to the popularity in an automated and dynamic manner. To be specific, we first enhance a typical DLRS to allow various embedding dimensions; then we propose an end-to-end differentiable framework that can automatically select different embedding dimensions according to user/item popularity; finally we propose an AutoML based optimization algorithm in a streaming recommendation setting. The experimental results based on widely used benchmark datasets demonstrate the effectiveness of the AutoEmb framework.