Yanrong Hu

CV
h-index: 17
3 papers
33 citations
Novelty: 50%
AI Score: 42

3 Papers

RO, Sep 3, 2022
A Novel Knowledge-Based Genetic Algorithm for Robot Path Planning in Complex Environments

Yanrong Hu, Simon X. Yang

In this paper, a novel knowledge-based genetic algorithm for path planning of a mobile robot in unstructured complex environments is proposed, where five problem-specific operators are developed for efficient robot path planning. The proposed genetic algorithm incorporates the domain knowledge of robot path planning into its specialized operators, some of which also combine a local search technique. A unique and simple representation of the robot path is proposed and a simple but effective path evaluation method is developed, where the collisions can be accurately detected and the quality of a robot path is well reflected. The proposed algorithm is capable of finding a near-optimal robot path in both static and dynamic complex environments. The effectiveness and efficiency of the proposed algorithm are demonstrated by simulation studies. The irreplaceable role of the specialized genetic operators in the proposed genetic algorithm for solving the robot path planning problem is demonstrated through a comparison study.
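
The abstract does not spell out the path representation or the evaluation function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a path is a list of 2D waypoints, obstacles are circles given as (center, radius), and the cost is path length plus a fixed penalty per colliding segment. The names `segment_point_distance` and `path_cost` and the `penalty` value are hypothetical.

```python
import math

def segment_point_distance(a, b, p):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def path_cost(path, obstacles, penalty=100.0):
    """Return (cost, collisions) for a waypoint path.

    cost = total path length + penalty * number of (segment, obstacle) hits.
    The penalty form is an assumption made for illustration.
    """
    length = 0.0
    collisions = 0
    for a, b in zip(path, path[1:]):
        length += math.hypot(b[0] - a[0], b[1] - a[1])
        for center, radius in obstacles:
            if segment_point_distance(a, b, center) < radius:
                collisions += 1
    return length + penalty * collisions, collisions

# One circular obstacle sitting on the straight line from start to goal.
obstacles = [((5.0, 0.0), 1.0)]
straight = [(0.0, 0.0), (10.0, 0.0)]
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

straight_cost, straight_hits = path_cost(straight, obstacles)
detour_cost, detour_hits = path_cost(detour, obstacles)
```

Under this cost, the collision-free detour scores better than the shorter but colliding straight path, which is the property a genetic algorithm's selection step would rely on.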

IR, Mar 27, Code
Towards Transfer-Efficient Multi-modal Sequential Recommendation with State Space Duality

Hao Fan, Qingyang Liu, Hongjiu Liu et al.

Sequential Recommendation (SR) models infer user preferences from interaction histories. While transferable Multi-modal SR models outperform traditional ID-based approaches, existing methods struggle with slow fine-tuning convergence due to complex optimization requirements and negative transfer effects. We propose MMM4Rec (Multi-Modal Mamba for Sequential Recommendation), a novel Multi-modal SR framework that incorporates a dedicated algebraic constraint mechanism for efficient transfer learning. By combining State Space Duality (SSD)'s temporal decay properties with a globally-aware temporal modeling design, our model dynamically prioritizes key modality information, overcoming limitations of Transformer-based approaches. The framework implements a constrained two-stage process: (1) sequence-level cross-modal alignment via shared projection matrices, followed by (2) temporal fusion using our newly designed Cross-SSD module and dual-channel Fourier adaptive filtering. This architecture maintains semantic consistency while suppressing noise propagation. MMM4Rec achieves rapid fine-tuning convergence with simple cross-entropy loss, significantly improving Multi-modal recommendation accuracy while maintaining strong transferability. Extensive experiments demonstrate MMM4Rec's state-of-the-art performance, achieving strong multi-modal retrieval capability and exhibiting 10x faster average convergence speed when transferring to large-scale downstream datasets. The implementation is available at https://github.com/AlwaysFHao/MMM4Rec.
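
The "temporal decay" behind state space duality can be illustrated with a scalar recurrence. The sketch below is not the Cross-SSD module from the paper. It only shows, under simplifying assumptions (scalar state, no input or output projections), that a decaying recurrence `h_t = a_t * h_{t-1} + x_t` gives the same result as multiplying the inputs by a lower-triangular matrix of cumulative decay products. The function names are hypothetical.

```python
def recurrent_scan(x, a):
    """Recurrent form: h_t = a_t * h_{t-1} + x_t, starting from h = 0."""
    h = 0.0
    out = []
    for x_t, a_t in zip(x, a):
        h = a_t * h + x_t
        out.append(h)
    return out

def decay_matrix(a):
    """Lower-triangular matrix with L[i][j] = a[j+1] * ... * a[i] for j < i."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        w = 1.0
        for j in range(i - 1, -1, -1):
            w *= a[j + 1]  # extend the decay product one step further back
            L[i][j] = w
    return L

def matrix_form(x, a):
    """Dual form: each output is a decay-weighted sum over past inputs."""
    L = decay_matrix(a)
    return [sum(L[i][j] * x[j] for j in range(i + 1)) for i in range(len(x))]

x = [1.0, 2.0, 3.0]
a = [0.5, 0.5, 0.5]  # constant decay, chosen only for illustration
by_recurrence = recurrent_scan(x, a)
by_matrix = matrix_form(x, a)
```

Both forms weight older inputs by a product of decay factors, so recent interactions contribute more to the current state than distant ones.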

CV, Nov 30, 2024
Continuous Concepts Removal in Text-to-image Diffusion Models

Tingxu Han, Weisong Sun, Yanrong Hu et al.

Text-to-image diffusion models have shown an impressive ability to generate high-quality images from input textual descriptions. However, concerns have been raised about the potential for these models to create content that infringes on copyrights or depicts disturbing subject matter. Removing specific concepts from these models is a promising potential solution to this problem. However, existing methods for concept removal do not work well in practical but challenging scenarios where concepts need to be continuously removed. Specifically, these methods lead to poor alignment between the text prompts and the generated image after the continuous removal process. To address this issue, we propose a novel approach called CCRT that includes a designed knowledge distillation paradigm. It constrains the text-image alignment behavior during the continuous concept removal process by using a set of text prompts generated through our genetic algorithm, which employs a designed fuzzing strategy. We conduct extensive experiments involving the removal of various concepts. The results evaluated through both algorithmic metrics and human studies demonstrate that our CCRT can effectively remove the targeted concepts in a continuous manner while maintaining the high generation quality (e.g., text-image alignment) of the model.