CL, AI | Jul 21, 2025

A Novel Self-Evolution Framework for Large Language Models

arXiv:2507.15281v1 | h-index: 3

Originality: Incremental advance

AI Analysis

This work aims to improve both the domain cognition and the user alignment of LLMs. It is aimed at AI researchers and practitioners and represents an incremental improvement over existing post-training methods.

The paper addresses a limitation of existing post-training strategies for Large Language Models (LLMs) by proposing a Dual-Phase Self-Evolution (DPSE) framework that jointly optimizes user preference adaptation and domain-specific competence. The authors report that DPSE consistently outperforms baselines such as Supervised Fine-Tuning and Preference Optimization on general NLP benchmarks and long-term dialogue tasks.

The capabilities of Large Language Models (LLMs) are constrained by their pre-training, so researchers often refine them through post-training. Existing post-training strategies, such as memory-based retrieval or preference optimization, improve user alignment yet fail to enhance the model's domain cognition. To bridge this gap, we propose a novel Dual-Phase Self-Evolution (DPSE) framework that jointly optimizes user preference adaptation and domain-specific competence. DPSE introduces a Censor module to extract multi-dimensional interaction signals and estimate satisfaction scores, which guide structured data expansion via topic-aware and preference-driven strategies. These expanded datasets support a two-stage fine-tuning pipeline: supervised domain grounding followed by frequency-aware preference optimization. Experiments across general NLP benchmarks and long-term dialogue tasks demonstrate that DPSE consistently outperforms Supervised Fine-Tuning, Preference Optimization, and Memory-Augmented baselines. Ablation studies validate the contribution of each module. In this way, our framework provides an autonomous path toward continual self-evolution of LLMs.
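
The abstract does not say which interaction signals the Censor module extracts, how it computes satisfaction scores, or how the expanded datasets are built. As a rough illustration only, the Python sketch below shows one way a satisfaction score could route logged interactions into a supervised set and a set of preference pairs. The signal names, weights, threshold, and the helpers `satisfaction_score` and `route_interactions` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Interaction:
    prompt: str
    response: str
    topic: str
    signals: Dict[str, float]  # hypothetical signals, each assumed to lie in [0, 1]


# Hypothetical signal weights; the paper's actual scoring is not given in the abstract.
WEIGHTS = {"explicit_feedback": 0.5, "follow_up_engagement": 0.3, "no_rephrase": 0.2}


def satisfaction_score(signals: Dict[str, float]) -> float:
    """Weighted average of the available signals (a missing signal counts as 0)."""
    total = sum(WEIGHTS.values())
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items()) / total


def route_interactions(
    interactions: List[Interaction], threshold: float = 0.5
) -> Tuple[List[dict], List[dict]]:
    """Split logged interactions into supervised examples and preference pairs."""
    sft_examples: List[dict] = []
    by_prompt: Dict[str, List[Tuple[float, str]]] = {}
    for item in interactions:
        score = satisfaction_score(item.signals)
        if score >= threshold:
            # High-satisfaction responses become supervised (domain grounding) data.
            sft_examples.append(
                {"topic": item.topic, "prompt": item.prompt, "response": item.response}
            )
        by_prompt.setdefault(item.prompt, []).append((score, item.response))

    preference_pairs: List[dict] = []
    for prompt, scored in by_prompt.items():
        if len(scored) < 2:
            continue
        scored.sort(key=lambda pair: pair[0])
        (low_score, rejected), (high_score, chosen) = scored[0], scored[-1]
        if high_score > low_score:
            # Best versus worst response to the same prompt forms one preference pair.
            preference_pairs.append(
                {"prompt": prompt, "chosen": chosen, "rejected": rejected}
            )
    return sft_examples, preference_pairs
```

In this sketch, the supervised examples would feed a first fine-tuning stage and the preference pairs a second one, mirroring the two-stage order the abstract describes. The frequency-aware weighting mentioned there is not modeled.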
