Yong Gu

CL
h-index5
3papers
20citations
Novelty35%
AI Score28

3 Papers

LGJul 8, 2024Code
Generalizing soft actor-critic algorithms to discrete action spaces

Le Zhang, Yong Gu, Xin Zhao et al.

ATARI is a suite of video games used by reinforcement learning (RL) researchers to test the effectiveness of the learning algorithm. Receiving only the raw pixels and the game score, the agent learns to develop sophisticated strategies, even to the comparable level of a professional human games tester. Ideally, we also want an agent requiring very few interactions with the environment. Previous competitive model-free algorithms for the task use the valued-based Rainbow algorithm without any policy head. In this paper, we change it by proposing a practical discrete variant of the soft actor-critic (SAC) algorithm. The new variant enables off-policy learning using policy heads for discrete domains. By incorporating it into the advanced Rainbow variant, i.e., the ``bigger, better, faster'' (BBF), the resulting SAC-BBF improves the previous state-of-the-art interquartile mean (IQM) from 1.045 to 1.088, and it achieves these results using only replay ratio (RR) 2. By using lower RR 2, the training time of SAC-BBF is strictly one-third of the time required for BBF to achieve an IQM of 1.045 using RR 8. As a value of IQM greater than one indicates super-human performance, SAC-BBF is also the only model-free algorithm with a super-human level using only RR 2. The code is publicly available on GitHub at https://github.com/lezhang-thu/bigger-better-faster-SAC.

CLApr 15, 2025
Streamlining Biomedical Research with Specialized LLMs

Linqing Chen, Weilei Wang, Yubin Xia et al.

In this paper, we propose a novel system that integrates state-of-the-art, domain-specific large language models with advanced information retrieval techniques to deliver comprehensive and context-aware responses. Our approach facilitates seamless interaction among diverse components, enabling cross-validation of outputs to produce accurate, high-quality responses enriched with relevant data, images, tables, and other modalities. We demonstrate the system's capability to enhance response precision by leveraging a robust question-answering model, significantly improving the quality of dialogue generation. The system provides an accessible platform for real-time, high-fidelity interactions, allowing users to benefit from efficient human-computer interaction, precise retrieval, and simultaneous access to a wide range of literature and data. This dramatically improves the research efficiency of professionals in the biomedical and pharmaceutical domains and facilitates faster, more informed decision-making throughout the R\&D process. Furthermore, the system proposed in this paper is available at https://synapse-chat.patsnap.com.

ROMay 20, 2020
Collision-free Trajectory Planning for Autonomous Surface Vehicle

Licheng Wen, Jiaqing Yan, Xuemeng Yang et al.

In this paper, we propose an efficient and accurate method for autonomous surface vehicles to generate a smooth and collision-free trajectory considering its dynamics constraints. We decouple the trajectory planning problem as a front-end feasible path searching and a back-end kinodynamic trajectory optimization. Firstly, we model the type of two-thrusts under-actuated surface vessel. Then we adopt a sampling-based path searching to find an asymptotic optimal path through the obstacle-surrounding environment and extract several waypoints from it. We apply a numerical optimization method in the back-end to generate the trajectory. From the perspective of security in the field voyage, we propose the sailing corridor method to guarantee the trajectory away from obstacles. Moreover, considering limited fuel ASV carrying, we design a numerical objective function which can optimize a fuel-saving trajectory. Finally, we validate and compare the proposed method in simulation environments and the results fit our expected trajectory.