AI · Aug 14, 2025
STEP: Stepwise Curriculum Learning for Context-Knowledge Fusion in Conversational Recommendation
Zhenye Yang, Jinpeng Chen, Huan Li et al.
Conversational recommender systems (CRSs) aim to proactively capture user preferences through natural language dialogue and recommend high-quality items. To achieve this, a CRS gathers user preferences via a dialogue module and builds user profiles through a recommendation module to generate appropriate recommendations. However, existing CRSs face challenges in capturing the deep semantics of user preferences and dialogue context. In particular, the efficient integration of external knowledge graph (KG) information into dialogue generation and recommendation remains a pressing issue. Traditional approaches typically combine KG information directly with dialogue content, a strategy that often struggles with complex semantic relationships and yields recommendations that may not align with user expectations. To address these challenges, we introduce STEP, a conversational recommender centered on pre-trained language models that combines curriculum-guided context-knowledge fusion with lightweight task-specific prompt tuning. At its heart, an F-Former progressively aligns the dialogue context with knowledge-graph entities through a three-stage curriculum, thus resolving fine-grained semantic mismatches. The fused representation is then injected into the frozen language model via two minimal yet adaptive prefix prompts: a conversation prefix that steers response generation toward user intent and a recommendation prefix that biases item ranking toward knowledge-consistent candidates. This dual-prompt scheme allows the model to share cross-task semantics while respecting the distinct objectives of dialogue and recommendation. Experimental results show that STEP outperforms mainstream methods in recommendation precision and dialogue quality on two public datasets.
RO · Dec 4, 2023
Integrated Drill Boom Hole-Seeking Control via Reinforcement Learning
Haoqi Yan, Haoyuan Xu, Hongbo Gao et al.
Intelligent drill boom hole-seeking is a promising technology for enhancing drilling efficiency, mitigating potential safety hazards, and relieving human operators. Most existing intelligent drill boom control methods rely on a hierarchical control framework based on inverse kinematics. However, these methods are generally time-consuming due to the computational complexity of inverse kinematics and the inefficiency of the sequential execution of multiple joints. To tackle these challenges, this study proposes an integrated drill boom control method based on Reinforcement Learning (RL). We develop an integrated drill boom control framework that utilizes a parameterized policy to directly generate control inputs for all joints at each time step, taking advantage of joint posture and target hole information. By formulating the hole-seeking task as a Markov decision process, contemporary mainstream RL algorithms can be directly employed to learn a hole-seeking policy, thus eliminating the need for inverse kinematics solutions and promoting cooperative multi-joint control. To enhance the drilling accuracy throughout the entire drilling process, we devise a state representation that combines Denavit-Hartenberg joint information and preview hole-seeking discrepancy data. Simulation results show that the proposed method significantly outperforms traditional methods in terms of hole-seeking accuracy and time efficiency.
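To make the Markov decision process formulation concrete, here is a minimal sketch under strong simplifying assumptions. It is not the authors' simulator: the two-joint planar boom, the link lengths, the step limit, the tolerance, and the class name `ToyHoleSeekingEnv` are all hypothetical. It only illustrates the idea stated in the abstract: the state combines joint information with the remaining discrepancy to the target hole, and a single action commands every joint in the same time step, so no inverse kinematics solver is called.

```python
import math
import random


class ToyHoleSeekingEnv:
    """Hypothetical 2-joint planar boom used only to illustrate the MDP."""

    LINK_LENGTHS = (1.0, 0.8)  # assumed link lengths (arbitrary units)
    MAX_STEP = 0.05            # assumed per-step joint limit, in radians
    TOLERANCE = 0.02           # assumed distance at which the hole counts as reached

    def __init__(self, target, seed=0):
        self.target = target            # (x, y) position of the target hole
        self.rng = random.Random(seed)
        self.angles = [0.0, 0.0]

    def tip(self):
        """Forward kinematics of the boom tip."""
        a1, a2 = self.angles
        l1, l2 = self.LINK_LENGTHS
        x = l1 * math.cos(a1) + l2 * math.cos(a1 + a2)
        y = l1 * math.sin(a1) + l2 * math.sin(a1 + a2)
        return x, y

    def discrepancy(self):
        """Vector from the boom tip to the target hole."""
        x, y = self.tip()
        return self.target[0] - x, self.target[1] - y

    def state(self):
        """State = joint angles plus the hole-seeking discrepancy."""
        ex, ey = self.discrepancy()
        return (self.angles[0], self.angles[1], ex, ey)

    def reset(self):
        """Start an episode from a random joint posture."""
        self.angles = [self.rng.uniform(-0.5, 0.5) for _ in range(2)]
        return self.state()

    def step(self, action):
        """Apply one clipped angle increment to every joint at once."""
        for i, delta in enumerate(action):
            delta = max(-self.MAX_STEP, min(self.MAX_STEP, delta))
            self.angles[i] += delta
        distance = math.hypot(*self.discrepancy())
        reward = -distance                 # closer to the hole is better
        done = distance < self.TOLERANCE
        return self.state(), reward, done
```

A learned policy would map `state()` to an action; any standard RL algorithm could be trained against an interface of this shape. The negative-distance reward is an assumption chosen for simplicity, not the reward described in the paper.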
RO · Jul 23, 2020
Receding Horizon Control Based Online Motion Planning with Partially Infeasible LTL Specifications
Mingyu Cai, Hao Peng, Zhijun Li et al.
This work considers online optimal motion planning of an autonomous agent subject to linear temporal logic (LTL) constraints. The environment is dynamic in the sense that it contains mobile obstacles and time-varying areas of interest (i.e., time-varying reward and workspace properties) to be visited by the agent. Since user-specified tasks may not be fully realizable (i.e., they may be partially infeasible), this work considers hard and soft LTL constraints, where hard constraints enforce safety requirements (e.g., avoiding obstacles) while soft constraints represent tasks that can be relaxed so that they do not strictly follow user specifications. The motion planning of the agent is to generate policies that, in decreasing order of priority, 1) formally guarantee the satisfaction of safety constraints; 2) mostly satisfy soft constraints (i.e., minimize the violation cost if desired tasks are partially infeasible); and 3) optimize reward collection (i.e., visiting dynamic areas of greater interest). To achieve these objectives, a relaxed product automaton, which allows the agent to not strictly follow the desired LTL constraints, is constructed. A utility function is developed to quantify the differences between the revised and the desired motion plan, and the accumulated rewards are designed to bias the motion plan toward areas of greater interest. Receding horizon control is synthesized with an LTL formula to maximize the accumulated utilities over a finite horizon, while ensuring that safety constraints are fully satisfied and soft constraints are mostly satisfied. Simulation and experimental results are provided to demonstrate the effectiveness of the developed motion strategy.
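The priority ordering above (never violate safety, violate soft tasks as little as possible, then collect reward) can be illustrated with a deliberately simplified receding horizon planner on a grid. This sketch does not build the relaxed product automaton or handle LTL formulas. Obstacles stand in for hard constraints, a per-cell penalty stands in for soft-constraint violation cost, and the function name `plan_first_move`, the `weight` parameter, and the brute-force search are all assumptions made for illustration.

```python
from itertools import product

# Candidate moves; insertion order also serves as the tie-breaking order.
MOVES = {
    "stay": (0, 0),
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


def plan_first_move(pos, obstacles, soft_cost, reward, size, horizon=3, weight=10.0):
    """Return the first move of the best feasible plan over a finite horizon.

    pos        current (x, y) cell, assumed to be safe
    obstacles  set of cells that must never be entered (hard constraint)
    soft_cost  dict cell -> violation cost (soft constraint, may be paid)
    reward     dict cell -> reward collected on the first visit
    size       side length of the square grid
    weight     assumed trade-off between violation cost and reward
    """
    best_utility, best_move = None, "stay"
    for plan in product(MOVES, repeat=horizon):
        cell, utility, feasible = pos, 0.0, True
        collected = set()
        for name in plan:
            dx, dy = MOVES[name]
            cell = (cell[0] + dx, cell[1] + dy)
            inside = 0 <= cell[0] < size and 0 <= cell[1] < size
            if not inside or cell in obstacles:
                feasible = False      # hard constraint: discard the whole plan
                break
            if cell not in collected:
                utility += reward.get(cell, 0.0)
                collected.add(cell)
            utility -= weight * soft_cost.get(cell, 0.0)  # soft violation is penalized
        if feasible and (best_utility is None or utility > best_utility):
            best_utility, best_move = utility, plan[0]
    return best_move
```

In a receding horizon loop, the agent would execute only the returned move, observe the updated obstacles and rewards, and call the planner again. Exhaustive enumeration grows exponentially with the horizon, so this is only workable for very short horizons.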
HC · Oct 18, 2018
The Effects of Using Taxi-Hailing Application on Driving Performance
Xiexing Feng, Libo Cao, Yunxian Zhang et al.
Driver distraction has become a major threat to road safety, and the globally booming taxi-hailing application introduces a new source of distraction to drivers. Although various in-vehicle information systems (IVIS) have been studied extensively, no study has objectively measured the extent to which interacting with a taxi-hailing application while driving affects drivers' behavior. To fill this gap, a simulator-based study was conducted to comprehensively compare the effects that different output modalities (visual, audio, combined visual-audio) and input modalities (baseline, manual, speech) imposed on driving performance. The results show that visual output had more negative effects on driving performance than audio output. In the combined output, the visual component dominated the effects on longitudinal control and hazard detection; the audio component only exacerbated the negative effects of the visual component on lateral control. The speech input modality was overall less detrimental to driving performance than the manual input modality, as reflected especially in drivers' quicker reactions to hazard events. The visual-manual interaction modality most severely impaired hazard detection ability, while also leading to strong compensatory behaviors. The audio-speech and visual-speech modalities were associated with smoother lateral control and faster responses to hazard events, respectively, compared to the other modalities. These results could be applied to improve the design of not only taxi-hailing applications, but also other input-output balanced IVIS.