Yichuan Chen

CL
h-index: 1
3 papers
1 citation
Novelty: 52%
AI Score: 42

3 Papers

CL · Jun 20, 2025 · Code
From Thinking to Output: Chain-of-Thought and Text Generation Characteristics in Reasoning Language Models

Junhao Liu, Zhenhao Xu, Yuxin Fang et al.

Recently, there have been notable advancements in large language models (LLMs), demonstrating their growing abilities in complex reasoning. However, existing research largely overlooks a thorough and systematic comparison of these models' reasoning processes and outputs, particularly regarding their self-reflection pattern (also termed the "Aha moment") and the interconnections across diverse domains. This paper proposes a novel framework for analyzing the reasoning characteristics of four cutting-edge large reasoning models (GPT-o1, DeepSeek-R1, Kimi-k1.5, and Grok-3) using keyword statistics and the LLM-as-a-judge paradigm. Our approach connects their internal thinking processes with their final outputs. The dataset is diverse, consisting of real-world scenario-based questions covering logical deduction, causal inference, and multi-step problem-solving. Additionally, a set of metrics is put forward to assess both the coherence of reasoning and the accuracy of the outputs. The results uncover various patterns in how these models balance exploration and exploitation, deal with problems, and reach conclusions during the reasoning process. Through quantitative and qualitative comparisons, disparities among these models are identified in aspects such as the depth of reasoning, the reliance on intermediate steps, and the degree to which their thinking processes and output patterns resemble those of GPT-o1. This work offers valuable insights into the trade-off between computational efficiency and reasoning robustness and provides recommendations for enhancing model design and evaluation in practical applications. We publicly release our project at: https://github.com/ChangWenhan/FromThinking2Output

CR · May 12
Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis

Zhenhao Xu, Wenhan Chang, Yichuan Chen et al.

Large Reasoning Models (LRMs) improve performance on complex tasks, but they also make safety control harder at deployment time. In black-box settings, defenders cannot modify model weights and must instead intervene at inference time. This setting creates three practical challenges: harmful intent may be hidden by educational or role-play framing, deep safety analysis can introduce non-trivial latency, and long adversarial contexts can dilute the local cues that simpler filters rely on. These challenges can expose an apparent thinking-output gap, where the model appears cautious during reasoning but still produces an unsafe final answer. To address this problem, we propose Safety Context Injection (SCI), an inference-time framework that separates safety assessment from task generation and prepends a structured external risk report as injected safety context for the protected model. The framework is instantiated in two complementary variants: Static Model Filtering (SMF), a lightweight one-pass guard for fast deployment, and Dynamic Agents Filtering (DAF), an agentic-loop-based analyzer that iteratively gathers and synthesizes evidence for ambiguous or long-context attacks. Across AdvBench and GPTFuzz, spanning base and reasoning models under five jailbreak families, both variants reduce attack success rate and toxicity in the evaluated settings. SMF offers an efficient low-latency option, while DAF is more effective when harmful intent is semantically disguised or dispersed across long contexts.

CV · Oct 21, 2021
Reinforcement Learning Based Optimal Camera Placement for Depth Observation of Indoor Scenes

Yichuan Chen, Manabu Tsukada, Hiroshi Esaki

Finding the most task-friendly camera setting, known as the optimal camera placement (OCP) problem, is of great importance in tasks that use multiple cameras. However, few existing OCP solutions specialize in depth observation of indoor scenes, and most versatile solutions work offline. To address this problem, this paper proposes an online OCP solution for depth observation of indoor scenes based on reinforcement learning. The proposed solution comprises a simulation environment that implements scene observation and reward estimation using shadow maps, and an agent network containing a soft actor-critic (SAC)-based reinforcement learning backbone and a feature extractor that extracts features from the observed point cloud layer by layer. Comparative experiments with two state-of-the-art optimization-based offline methods are conducted. The experimental results indicate that the proposed system outperforms the baselines in seven out of ten test scenes, obtaining lower depth observation error. The total error across all test scenes is also less than 90% of that of the baselines. Therefore, the proposed system is more competent for depth camera placement in scenarios where there is no prior knowledge of the scenes or where a lower depth observation error is the main objective.