HC | Apr 11
SemiConLens: Visual Analytics for 2D Semiconductor Discovery
Kavinda Athapaththu, Shiwei Chen, Yuan Fang et al.
The past few years have witnessed vibrant efforts in discovering new two-dimensional (2D) semiconductor materials from both academia and industry, due to their promising potential in resolving the severe performance deterioration of traditional semiconductors resulting from condensed silicon thickness. However, existing methods (e.g., Density Functional Theory (DFT) or machine-learning-based approaches) suffer from various challenges such as small datasets, and reliability and trustworthiness issues. To bridge this gap, we propose SemiConLens, a visual analytics approach that combines human expertise with the power of ML to enable effective and reliable 2D semiconductor discovery. Specifically, we first develop a new Correlation Aware Multivariate Imputation (CAMI) method and use ML models such as autoencoders, which can better learn from limited data and reveal uncertainty, to address the challenge of sparse data in semiconductivity prediction. Built upon this, our visualization module, consisting of three visualization views with linked interactions, allows material researchers to interactively filter, discover and compare 2D semiconductor candidates. A novel circular glyph design and a new cluster-aware layout optimization approach are proposed to effectively display all the user-configurable key attributes and possible prediction uncertainties of each semiconductor candidate, ensuring reliable and trustworthy 2D semiconductor discovery. We assess SemiConLens through quantitative evaluations, expert interviews, and use cases. The results demonstrate SemiConLens's capability to help material researchers conduct effective discovery of desirable 2D semiconductors.
AI | Apr 7 | Code
Reason Analogically via Cross-domain Prior Knowledge: An Empirical Study of Cross-domain Knowledge Transfer for In-Context Learning
Le Liu, Zhiming Li, Jianzhi Yan et al.
Despite its success, existing in-context learning (ICL) relies on in-domain expert demonstrations, limiting its applicability when expert annotations are scarce. We posit that different domains may share underlying reasoning structures, enabling source-domain demonstrations to improve target-domain inference despite semantic mismatch. To test this hypothesis, we conduct a comprehensive empirical study of different retrieval methods to validate the feasibility of achieving cross-domain knowledge transfer under the in-context learning setting. Our results demonstrate conditional positive transfer in cross-domain ICL. We identify a clear example absorption threshold: beyond it, positive transfer becomes more likely, and additional demonstrations yield larger gains. Further analysis suggests that these gains stem from reasoning structure repair by retrieved cross-domain examples, rather than semantic cues. Overall, our study validates the feasibility of leveraging cross-domain knowledge transfer to improve cross-domain ICL performance, motivating the community to explore designing more effective retrieval approaches for this novel direction. Our implementation is available at https://github.com/littlelaska/ICL-TF4LR.
AI | Apr 7 | Code
Towards Effective In-context Cross-domain Knowledge Transfer via Domain-invariant-neurons-based Retrieval
Jianzhi Yan, Zhiming Li, Le Liu et al.
Large language models (LLMs) have made notable progress in logical reasoning, yet still fall short of human-level performance. Current boosting strategies rely on expert-crafted in-domain demonstrations, limiting their applicability in expertise-scarce domains, such as specialized mathematical reasoning, formal logic, or legal analysis. In this work, we demonstrate the feasibility of leveraging cross-domain demonstration examples to boost LLMs' reasoning performance. Despite substantial domain differences, many reusable implicit logical structures are shared across domains. To effectively retrieve cross-domain examples for unseen domains under investigation, we further propose a retrieval method called domain-invariant neurons-based retrieval (DIN-Retrieval). Concisely, DIN-Retrieval first summarizes a hidden representation that is universal across different domains. Then, during the inference stage, we use the DIN vector to retrieve structurally compatible cross-domain demonstrations for in-context learning. Experimental results in multiple settings for the transfer of mathematical and logical reasoning demonstrate that our method achieves an average improvement of 1.8 over the state-of-the-art methods. Our implementation is available at https://github.com/Leon221220/DIN-Retrieval.
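The retrieval step described above can be illustrated with a minimal sketch. It assumes every candidate demonstration and the query already have a vector representation; the abstract does not say how the DIN vector is derived or which similarity measure is used, so `retrieve_demonstrations`, the toy vectors, and the choice of cosine similarity are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_demonstrations(
    query_vec: Sequence[float],
    pool: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[str]:
    """Return the names of the k pool entries most similar to the query.

    Hypothetical helper: the vectors stand in for whatever representation
    a retrieval method would compute for each demonstration.
    """
    ranked = sorted(
        pool,
        key=lambda name: cosine(query_vec, pool[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors (made up for illustration only).
pool = {
    "math_proof": [0.9, 0.1, 0.0],
    "logic_puzzle": [0.8, 0.2, 0.1],
    "recipe": [0.0, 0.1, 0.9],
}
top = retrieve_demonstrations([1.0, 0.0, 0.0], pool, k=2)
print(top)
```

The retrieved names would then be used to select the demonstrations placed in the prompt.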
CL | Dec 24, 2024 | Code
Distilling Fine-grained Sentiment Understanding from Large Language Models
Yice Zhang, Guangyu Xie, Hongling Xu et al.
Fine-grained sentiment analysis (FSA) aims to extract and summarize user opinions from vast opinionated text. Recent studies demonstrate that large language models (LLMs) possess exceptional sentiment understanding capabilities. However, directly deploying LLMs for FSA applications incurs high inference costs. Therefore, this paper investigates the distillation of fine-grained sentiment understanding from LLMs into small language models (SLMs). We prompt LLMs to examine and interpret the sentiments of given reviews and then utilize the generated content to pretrain SLMs. Additionally, we develop a comprehensive FSA benchmark to evaluate both SLMs and LLMs. Extensive experiments on this benchmark reveal that: (1) distillation significantly enhances the performance of SLMs in FSA tasks, achieving a 6.00% improvement in F1-score, and the distilled model can outperform Llama-2-7b with only 220M parameters; (2) distillation equips SLMs with excellent zero-shot sentiment classification capabilities, enabling them to match or even exceed their teacher models. These results suggest that distillation from LLMs is a highly promising direction for FSA. We will release our code, data, and pretrained model weights at https://github.com/HITSZ-HLT/FSA-Distillation.
CL | Sep 28, 2025 | Code
Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales
Jianzhi Yan, Le Liu, Youcheng Pan et al.
Chain-of-thought (CoT) distillation aims to enhance small language models' (SLMs) reasoning by transferring multi-step reasoning capability from larger teacher models. However, existing work underestimates rationale quality, focusing primarily on data quantity, which may transfer noisy or incorrect information to the student model. To address these issues, we propose Model-Oriented Rationale Selection Distillation (MoRSD), which can discern and select high-quality rationales for distillation to further improve performance. We further propose a Rationale Difficulty (RD) metric to measure the ability of the student model to generate the correct answer under a given rationale. Compared to the baseline, we achieve a 4.6% average improvement on seven datasets over three tasks, using fewer rationales by controlling their accuracy, diversity, and difficulty. Our results reveal that a small portion of high-quality rationales can enhance the reasoning ability of student models more than the entire dataset. Our method promises to be a possible solution for efficient CoT distillation. Our code will be released at https://github.com/Leon221220/MoRSD.
CL | Sep 26, 2025 | Code
From Long to Lean: Performance-aware and Adaptive Chain-of-Thought Compression via Multi-round Refinement
Jianzhi Yan, Le Liu, Youcheng Pan et al.
Chain-of-Thought (CoT) reasoning improves performance on complex tasks but introduces significant inference latency due to verbosity. We propose Multiround Adaptive Chain-of-Thought Compression (MACC), a framework that leverages the token elasticity phenomenon (where overly small token budgets can paradoxically increase output length) to progressively compress CoTs via multiround refinement. This adaptive strategy allows MACC to determine the optimal compression depth for each input. Our method achieves an average accuracy improvement of 5.6 percent over state-of-the-art baselines, while also reducing CoT length by an average of 47 tokens and significantly lowering latency. Furthermore, we show that test-time performance (accuracy and token length) can be reliably predicted using interpretable features like perplexity and compression rate on the training set. Evaluated across different models, our method enables efficient model selection and forecasting without repeated fine-tuning, demonstrating that CoT compression is both effective and predictable. Our code will be released at https://github.com/Leon221220/MACC.
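The idea of compressing over several rounds and stopping at an input-specific depth can be sketched as a simple loop. This is only an illustration under stated assumptions: `multiround_compress` and `drop_one_filler` are hypothetical names, the toy compressor merely deletes filler words, and the stopping rule (halt once a round no longer shortens the trace) is a simplification that omits the accuracy check a performance-aware method would need. It is not the MACC implementation.

```python
from typing import Callable, Tuple


def multiround_compress(
    cot: str,
    compress_once: Callable[[str], str],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Apply compress_once repeatedly; stop when a round stops helping.

    Returns the shortest trace found and the number of rounds accepted.
    """
    best = cot
    rounds = 0
    for _ in range(max_rounds):
        candidate = compress_once(best)
        # If the new trace is not shorter (or has grown, as the token
        # elasticity effect warns can happen), keep the previous one.
        if len(candidate.split()) >= len(best.split()):
            break
        best = candidate
        rounds += 1
    return best, rounds


# Toy stand-in for a model-based compressor (assumption for illustration).
FILLERS = ("basically", "obviously", "clearly")


def drop_one_filler(text: str) -> str:
    """Remove the first filler word, or return the text unchanged."""
    words = text.split()
    for i, word in enumerate(words):
        if word.lower().strip(",.") in FILLERS:
            return " ".join(words[:i] + words[i + 1:])
    return text


trace = "Clearly 2 + 2 is 4, so basically the answer is 4."
compressed, used = multiround_compress(trace, drop_one_filler)
print(compressed, used)
```

Because the loop stops per input, traces with little redundancy exit after few rounds while verbose ones are compressed further.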
CL | Jun 26, 2024 | Code
Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction
Yice Zhang, Jie Zeng, Weiming Hu et al.
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review, which is the most representative and challenging task in aspect-based sentiment analysis. A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods. To tackle this issue, we propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels, aiming to filter out mismatches and thereby enhance the effectiveness of self-training. We highlight two critical aspects to ensure the scorer's effectiveness and reliability: the quality of the training dataset and its model architecture. To this end, we create a human-annotated comparison dataset and train a generative model on it using ranking-based objectives. Extensive experiments on public ASQP datasets reveal that using our scorer can greatly and consistently improve the effectiveness of self-training. Moreover, we explore the possibility of replacing humans with large language models for comparison dataset annotation, and experiments demonstrate its feasibility. We release our code and data at https://github.com/HITSZ-HLT/ST-w-Scorer-ABSA.
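The filtering role of the scorer can be shown with a small sketch. The scorer in the paper is a trained generative model; here `toy_scorer` is a hypothetical stand-in that only checks whether the aspect and opinion terms appear in the review, and `filter_pseudo_labels` and the threshold value are likewise assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# (aspect term, aspect category, opinion term, sentiment polarity)
Quad = Tuple[str, str, str, str]
Sample = Tuple[str, List[Quad]]


def filter_pseudo_labels(
    samples: List[Sample],
    scorer: Callable[[str, List[Quad]], float],
    threshold: float = 0.5,
) -> List[Sample]:
    """Keep only pseudo-labeled samples whose match score reaches the threshold."""
    return [(review, quads) for review, quads in samples
            if scorer(review, quads) >= threshold]


def toy_scorer(review: str, quads: List[Quad]) -> float:
    """Fraction of quads whose aspect and opinion terms occur in the review.

    A crude placeholder for a learned review/pseudo-label match score.
    """
    if not quads:
        return 0.0
    hits = sum(1 for aspect, _, opinion, _ in quads
               if aspect in review and opinion in review)
    return hits / len(quads)


samples = [
    ("The pasta was delicious",
     [("pasta", "food quality", "delicious", "positive")]),
    ("The service was slow",
     [("pizza", "food quality", "tasty", "positive")]),  # mismatched label
]
kept = filter_pseudo_labels(samples, toy_scorer, threshold=0.5)
print(kept)
```

Only the retained samples would be added to the training set in the next self-training round.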
HC | Mar 22
When the Chain Breaks: Interactive Diagnosis of LLM Chain-of-Thought Reasoning Errors
Shiwei Chen, Niruthikka Sritharan, Xiaolin Wen et al.
Current Large Language Models (LLMs), especially Large Reasoning Models, can generate Chain-of-Thought (CoT) reasoning traces to illustrate how they produce final outputs, thereby facilitating trust calibration for users. However, these CoT reasoning traces are usually lengthy and tedious, and can contain various issues, such as logical and factual errors, which make it difficult for users to interpret the reasoning traces efficiently and accurately. To address these challenges, we develop an error detection pipeline that combines external fact-checking with symbolic formal logical validation to identify errors at the step level. Building on this pipeline, we propose ReasonDiag, an interactive visualization system for diagnosing CoT reasoning traces. ReasonDiag provides 1) an integrated arc diagram to show reasoning-step distributions and error-propagation patterns, and 2) a hierarchical node-link diagram to visualize high-level reasoning flows and premise dependencies. We evaluate ReasonDiag through a technical evaluation for the error detection pipeline, two case studies, and user interviews with 16 participants. The results indicate that ReasonDiag helps users effectively understand CoT reasoning traces, identify erroneous steps, and determine their root causes.
CV | Apr 28, 2024
SafePaint: Anti-forensic Image Inpainting with Domain Adaptation
Dunyun Chen, Xin Liao, Xiaoshuai Wu et al.
Existing image inpainting methods have achieved remarkable accomplishments in generating visually appealing results, often accompanied by a trend toward creating more intricate structural textures. However, while these models excel at creating more realistic image content, they often leave noticeable traces of tampering, posing a significant threat to security. In this work, we take anti-forensic capabilities into consideration and propose, for the first time, an end-to-end training framework for anti-forensic image inpainting named SafePaint. Specifically, we formulate image inpainting as two major tasks: semantically plausible content completion and region-wise optimization. The former is similar to current inpainting methods that aim to restore the missing regions of corrupted images. The latter, through domain adaptation, endeavors to reconcile the discrepancies between the inpainted region and the unaltered area to achieve anti-forensic goals. Through comprehensive theoretical analysis, we validate the effectiveness of domain adaptation for anti-forensic performance. Furthermore, we design a region-wise separated attention (RWSA) module, which not only aligns with our objective of anti-forensics but also enhances the performance of the model. Extensive qualitative and quantitative evaluations show our approach achieves comparable results to existing image inpainting methods while offering anti-forensic capabilities not available in other methods.
LG | Aug 25, 2025
GEPO: Group Expectation Policy Optimization for Stable Heterogeneous Reinforcement Learning
Han Zhang, Ruibin Zheng, Zexuan Yi et al.
As single-center computing approaches power constraints, decentralized training becomes essential. However, traditional Reinforcement Learning (RL) methods, crucial for enhancing large model post-training, cannot adapt to decentralized distributed training due to the tight coupling between parameter learning and rollout sampling. To address this, we propose HeteroRL, a heterogeneous RL architecture that decouples these processes, enabling stable training across geographically distributed nodes connected via the Internet. The core component is Group Expectation Policy Optimization (GEPO), an asynchronous RL algorithm robust to latency caused by network delays or heterogeneity in computational resources. Our study reveals that high latency significantly increases KL divergence, leading to higher variance of importance weights and training instability. GEPO mitigates this issue by using group expectation weighting to exponentially reduce the variance of importance weights, with theoretical guarantees. Experiments show GEPO achieves superior stability (only a 3% performance drop from online to 1800s latency) and reduces the best-to-last gap by 85% versus GSPO (1.8 vs. 12.0) while attaining the highest scores, highlighting its effectiveness in decentralized, resource-heterogeneous environments.
CV | Jun 16, 2025
Learning Event Completeness for Weakly Supervised Video Anomaly Detection
Yu Wang, Shiwei Chen
Weakly supervised video anomaly detection (WS-VAD) is tasked with pinpointing temporal intervals containing anomalous events within untrimmed videos, utilizing only video-level annotations. However, a significant challenge arises due to the absence of dense frame-level annotations, often leading to incomplete localization in existing WS-VAD methods. To address this issue, we present LEC-VAD (Learning Event Completeness for Weakly Supervised Video Anomaly Detection), which features a dual structure designed to encode both category-aware and category-agnostic semantics between vision and language. Within LEC-VAD, we devise semantic regularities that leverage an anomaly-aware Gaussian mixture to learn precise event boundaries, thereby yielding more complete event instances. In addition, we develop a novel memory bank-based prototype learning mechanism to enrich concise text descriptions associated with anomaly-event categories. This innovation bolsters the text's expressiveness, which is crucial for advancing WS-VAD. LEC-VAD demonstrates remarkable advancements over the current state-of-the-art methods on two benchmark datasets, XD-Violence and UCF-Crime.