Xia Wu

CV
h-index: 17
13 papers
122 citations
Novelty: 51%
AI Score: 51

13 Papers

AI Mar 28
daVinci-LLM:Towards the Science of Pretraining

Yiwei Qin, Yixiu Liu, Tiantian Mi et al.

The foundational pretraining phase determines a model's capability ceiling, as post-training struggles to overcome capability foundations established during pretraining, yet it remains critically under-explored. This stems from a structural paradox: organizations with computational resources operate under commercial pressures that inhibit transparent disclosure, while academic institutions possess research freedom but lack pretraining-scale computational resources. daVinci-LLM occupies this unexplored intersection, combining industrial-scale resources with full research freedom to advance the science of pretraining. We adopt a fully-open paradigm that treats openness as scientific methodology, releasing complete data processing pipelines, full training processes, and systematic exploration results. Recognizing that the field lacks systematic methodology for data processing, we employ the Data Darwinism framework, a principled L0-L9 taxonomy from filtering to synthesis. We train a 3B-parameter model from random initialization across 8T tokens using a two-stage adaptive curriculum that progressively shifts from foundational capabilities to reasoning-intensive enhancement. Through 200+ controlled ablations, we establish that: processing depth systematically enhances capabilities, establishing it as a critical dimension alongside volume scaling; different domains exhibit distinct saturation dynamics, necessitating adaptive strategies from proportion adjustments to format shifts; compositional balance enables targeted intensification while preventing performance collapse; and evaluation protocol choices shape our understanding of pretraining progress. By releasing the complete exploration process, we enable the community to build upon our findings and systematic methodologies to form accumulative scientific knowledge in pretraining.

IT May 12
Optimal Codes with Positive Griesmer Defects, Related Optimal and Almost Optimal LRC Codes

Yurui Wang, Hao Chen, Xia Wu

Solomon and Stiffler constructed infinitely many families of linear codes meeting the Griesmer bound in 1965. It has been well known since the 1990s that certain Griesmer codes (codes with zero Griesmer defect) are equivalent to Solomon-Stiffler codes or Belov codes. Griesmer codes constructed in some recent papers published in IEEE Trans. Inf. Theory are actually Solomon-Stiffler codes or the affine Solomon-Stiffler codes proposed in our previous paper. It is therefore more challenging to construct optimal codes with positive Griesmer defects. In this paper, we construct several infinite families of optimal codes with positive Griesmer defects. These codes are therefore certainly not equivalent to Solomon-Stiffler codes or Belov codes. Weight distributions and subcode support weight distributions of these optimal codes are determined. In addition, some of the constructed optimal linear codes are optimal locally recoverable codes (LRCs) meeting the Cadambe-Mazumdar (CM) bound, and some of the others are very close to the CM bound. The locality of these optimal or almost optimal LRCs is two.
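For readers unfamiliar with the terminology, the Griesmer bound says that an [n, k, d] linear code over GF(q) satisfies n >= sum over i = 0..k-1 of ceil(d / q^i). The Griesmer defect is commonly taken to be the gap between n and that sum. The following is a minimal Python sketch of that arithmetic only; the function names are illustrative and do not come from the paper.

```python
def griesmer_bound(k: int, d: int, q: int = 2) -> int:
    """Smallest length n permitted by the Griesmer bound for an [n, k, d] code over GF(q)."""
    if k < 1 or d < 1 or q < 2:
        raise ValueError("need k >= 1, d >= 1, q >= 2")
    total = 0
    power = 1  # holds q**i
    for _ in range(k):
        total += -(-d // power)  # integer ceiling of d / q**i
        power *= q
    return total


def griesmer_defect(n: int, k: int, d: int, q: int = 2) -> int:
    """Gap between the actual length n and the Griesmer bound (0 for a Griesmer code)."""
    return n - griesmer_bound(k, d, q)


# Binary simplex code [7, 3, 4]: meets the bound, so the defect is 0.
print(griesmer_defect(7, 3, 4))
# Extended binary Golay code [24, 12, 8]: the bound gives 23, so the defect is 1.
print(griesmer_defect(24, 12, 8))
```

A code with defect 0 is a Griesmer code in the sense used above; the families in the abstract are those with defect greater than 0 that are nevertheless length-optimal.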

AI Sep 22, 2025
LIMI: Less is More for Agency

Yang Xiao, Mohan Jiang, Jie Sun et al.

We define Agency as the emergent capacity of AI systems to function as autonomous agents actively discovering problems, formulating hypotheses, and executing solutions through self-directed engagement with environments and tools. This fundamental capability marks the dawn of the Age of AI Agency, driven by a critical industry shift: the urgent need for AI systems that don't just think, but work. While current AI excels at reasoning and generating responses, industries demand autonomous agents that can execute tasks, operate tools, and drive real-world outcomes. As agentic intelligence becomes the defining characteristic separating cognitive systems from productive workers, efficiently cultivating machine autonomy becomes paramount. Current approaches assume that more data yields better agency, following traditional scaling laws from language modeling. We fundamentally challenge this paradigm. LIMI (Less Is More for Intelligent Agency) demonstrates that agency follows radically different development principles. Through strategic focus on collaborative software development and scientific research workflows, we show that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Using only 78 carefully designed training samples, LIMI achieves 73.5% on comprehensive agency benchmarks, dramatically outperforming state-of-the-art models: Kimi-K2-Instruct (24.1%), DeepSeek-V3.1 (11.9%), Qwen3-235B-A22B-Instruct (27.5%), and GLM-4.5 (45.1%). Most strikingly, LIMI demonstrates 53.7% improvement over models trained on 10,000 samples, achieving superior agentic intelligence with 128 times fewer samples. Our findings establish the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.

CV Mar 1, 2024
Embedded Multi-label Feature Selection via Orthogonal Regression

Xueyuan Xu, Fulin Wei, Tianyuan Jia et al.

In the last decade, embedded multi-label feature selection methods, incorporating the search for feature subsets into model optimization, have attracted considerable attention in accurately evaluating the importance of features in multi-label classification tasks. Nevertheless, the state-of-the-art embedded multi-label feature selection algorithms based on least square regression usually cannot preserve sufficient discriminative information in multi-label data. To tackle the aforementioned challenge, a novel embedded multi-label feature selection method, termed global redundancy and relevance optimization in orthogonal regression (GRROOR), is proposed to facilitate the multi-label feature selection. The method employs orthogonal regression with feature weighting to retain sufficient statistical and structural information related to local label correlations of the multi-label data in the feature learning process. Additionally, both global feature redundancy and global label relevancy information have been considered in the orthogonal regression model, which could contribute to the search for discriminative and non-redundant feature subsets in the multi-label data. The cost function of GRROOR is an unbalanced orthogonal Procrustes problem on the Stiefel manifold. A simple yet effective scheme is utilized to obtain an optimal solution. Extensive experimental results on ten multi-label data sets demonstrate the effectiveness of GRROOR.

AI Aug 4, 2025
Neuromorphic Computing with Multi-Frequency Oscillations: A Bio-Inspired Approach to Artificial Intelligence

Boheng Liu, Ziyu Li, Qing Li et al.

Despite remarkable capabilities, artificial neural networks exhibit limited flexible, generalizable intelligence. This limitation stems from their fundamental divergence from biological cognition that overlooks both neural regions' functional specialization and the temporal dynamics critical for coordinating these specialized systems. We propose a tripartite brain-inspired architecture comprising functionally specialized perceptual, auxiliary, and executive systems. Moreover, the integration of temporal dynamics through the simulation of multi-frequency neural oscillation and synaptic dynamic adaptation mechanisms enhances the architecture, thereby enabling more flexible and efficient artificial cognition. Initial evaluations demonstrate superior performance compared to state-of-the-art temporal processing approaches, with 2.18% accuracy improvements while reducing required computation iterations by 48.44%, and achieving higher correlation with human confidence patterns. Though currently demonstrated on visual processing tasks, this architecture establishes a theoretical foundation for brain-like intelligence across cognitive domains, potentially bridging the gap between artificial and biological intelligence.

CV Apr 23, 2025
Facial Foundational Model Advances Early Warning of Coronary Artery Disease from Live Videos with DigitalShadow

Juexiao Zhou, Zhongyi Han, Mankun Xin et al.

Global population aging presents increasing challenges to healthcare systems, with coronary artery disease (CAD) responsible for approximately 17.8 million deaths annually, making it a leading cause of global mortality. As CAD is largely preventable, early detection and proactive management are essential. In this work, we introduce DigitalShadow, an advanced early warning system for CAD, powered by a fine-tuned facial foundation model. The system is pre-trained on 21 million facial images and subsequently fine-tuned into LiveCAD, a specialized CAD risk assessment model trained on 7,004 facial images from 1,751 subjects across four hospitals in China. DigitalShadow functions passively and contactlessly, extracting facial features from live video streams without requiring active user engagement. Integrated with a personalized database, it generates natural language risk reports and individualized health recommendations. With privacy as a core design principle, DigitalShadow supports local deployment to ensure secure handling of user data.

CV Mar 10, 2025
Brain Inspired Adaptive Memory Dual-Net for Few-Shot Image Classification

Kexin Di, Xiuxing Li, Yuyang Han et al.

Few-shot image classification has become a popular research topic because of its wide application in real-world scenarios; however, the problem of supervision collapse induced by single image-level annotation remains a major challenge. Existing methods aim to tackle this problem by locating and aligning relevant local features. However, the high intra-class variability in real-world images poses significant challenges in locating semantically relevant local regions under few-shot settings. Drawing inspiration from the human complementary learning system, which excels at rapidly capturing and integrating semantic features from limited examples, we propose the generalization-optimized Systems Consolidation Adaptive Memory Dual-Network, SCAM-Net. This approach simulates the systems consolidation of the complementary learning system with an adaptive memory module, which successfully addresses the difficulty of identifying meaningful features in few-shot scenarios. Specifically, we construct a Hippocampus-Neocortex dual-network that consolidates a structured representation of each category; the structured representation is then stored and adaptively regulated, following the generalization optimization principle, in a long-term memory inside the Neocortex. Extensive experiments on benchmark datasets show that the proposed model achieves state-of-the-art performance.

CV Mar 28, 2020
Multiform Fonts-to-Fonts Translation via Style and Content Disentangled Representations of Chinese Character

Fenxi Xiao, Jie Zhang, Bo Huang et al.

This paper treats the generation of personalized fonts as an image style transfer problem. Its main purpose is to design a network framework that can extract and recombine the content and style of characters. Such a framework can be used to synthesize an entire font set from only a small number of characters. The paper combines various deep networks, such as a Convolutional Neural Network, a Multi-layer Perceptron, and a Residual Network, to find the optimal model for extracting the features of font characters. The results show that the generated characters are very close to real characters, as measured by the Structural Similarity index and Peak Signal-to-Noise Ratio evaluation criteria.
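As an illustration of one of the two criteria named above, Peak Signal-to-Noise Ratio is defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared pixel error between a reference and a generated image. The sketch below is a plain-Python version over flattened pixel lists. It assumes 8-bit grayscale values (MAX = 255) and is not the paper's evaluation code; the function name is illustrative.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 glyph patches, flattened row by row.
real_glyph = [0, 255, 255, 0]
fake_glyph = [3, 250, 255, 1]
print(round(psnr(real_glyph, fake_glyph), 2))
```

Higher values mean the generated glyph is closer to the reference. PSNR compares pixels independently, which is presumably why the abstract pairs it with the Structural Similarity index.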

CV Mar 27, 2020
Automatic Generation of Chinese Handwriting via Fonts Style Representation Learning

Fenxi Xiao, Bo Huang, Xia Wu

In this paper, we propose an end-to-end deep Chinese font generation system. This system can generate new font styles by interpolating latent style-related embedding variables, which achieves a smooth transition between different styles. Our method is simpler and more effective than other methods, which will help to improve font design efficiency.
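The interpolation idea can be sketched as follows, assuming plain linear interpolation between two style embedding vectors. The abstract does not say which interpolation scheme the authors use, so both that choice and the function name are assumptions made here for illustration. Each intermediate vector would then be passed to a font decoder, which is not shown.

```python
def interpolate_styles(z_a, z_b, steps):
    """Return `steps` embeddings moving linearly from z_a to z_b, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_a) != len(z_b):
        raise ValueError("embeddings must have the same dimension")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # mixing weight, from 0.0 to 1.0
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


# Hypothetical 2-dimensional style embeddings for two fonts.
for z in interpolate_styles([0.0, 2.0], [1.0, 0.0], 3):
    print(z)
```

With three steps the middle vector is the midpoint of the two styles. Using more steps gives a finer transition between them.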

SP Mar 16, 2020
Inverse design of multilayer nanoparticles using artificial neural networks and genetic algorithm

Cankun Qiu, Zhi Luo, Xia Wu et al.

The light scattering of multilayer nanoparticles can be computed from Maxwell's equations. However, the inverse design of multilayer nanoparticles is difficult to solve with the traditional trial-and-error method. Here, we present a method for forward simulation and inverse design of multilayer nanoparticles. We combine the global search ability of a genetic algorithm with the local search ability of a neural network. First, the genetic algorithm is used to find a suitable solution, and then the neural network is used to fine-tune it. Because the relationship between physical structures and optical responses is non-unique, we first train a forward neural network and then apply it to the inverse design of multilayer nanoparticles. Beyond this application, the method can easily be extended to predict and find the best design parameters for other optical structures.

IV Mar 16, 2020
u-net CNN based fourier ptychography

Yican Chen, Zhi Luo, Xia Wu et al.

Fourier ptychography is a recently explored imaging method that overcomes the diffraction limit of conventional cameras, with applications in microscopy, and yields high-resolution images. To stitch together low-resolution images taken under different illumination angles of a coherent light source, an iterative phase retrieval algorithm is adopted. However, the reconstruction procedure is slow, requires substantial overlap in the Fourier domain between consecutively recorded low-resolution images, and degrades under system aberrations such as noise or a random update sequence. In this paper, we propose a new retrieval algorithm based on convolutional neural networks. Once well trained, our model can perform high-quality reconstruction rapidly on a graphics processing unit. The experiments demonstrate that our model achieves better reconstruction results and is more robust under system aberrations.

RO Feb 29, 2020
Robotic Cane as a Soft SuperLimb for Elderly Sit-to-Stand Assistance

Xia Wu, Haiyuan Liu, Ziqi Liu et al.

Many researchers have identified robotics as a potential solution to the aging population faced by many developed and developing countries. If so, how should we address the cognitive acceptance and ambient control of elderly assistive robots through design? In this paper, we propose an explorative design of an ambient SuperLimb (Supernumerary Robotic Limb) system that involves a pneumatically-driven robotic cane for at-home motion assistance, an inflatable vest for compliant human-robot interaction, and a depth sensor for ambient intention detection. The proposed system aims to provide active assistance during the sit-to-stand transition for at-home usage by the elderly at the bedside, in the chair, and on the toilet. We propose a modified biomechanical model with a linear cane robot for closed-loop control implementation. We validated the design feasibility of the proposed ambient SuperLimb system, including the biomechanical model. Our results showed advantages in reducing lower limb effort and elderly fall risks, yet the detection accuracy using depth sensing and adjustments to the model still require further research. Nevertheless, we summarize empirical guidelines to support the ambient design of elderly-assistive SuperLimb systems for lower limb functional augmentation.

LG Oct 9, 2019
Supervised feature selection with orthogonal regression and feature weighting

Xia Wu, Xueyuan Xu, Jianhong Liu et al.

Effective features can improve the performance of a model and can thus help us understand the characteristics and underlying structure of complex data. Previous feature selection methods usually cannot preserve much local structure information. To address this shortcoming, we propose a novel supervised orthogonal least square regression model with feature weighting for feature selection. The optimization problem of the objective function can be solved by employing generalized power iteration (GPI) and augmented Lagrangian multiplier (ALM) methods. Experimental results show that the proposed method can more effectively reduce the feature dimensionality and obtain better classification results than traditional feature selection methods. The convergence of our iterative method is proved as well. Consequently, the effectiveness and superiority of the proposed method are verified both theoretically and experimentally.