MAFeb 6
Lemon Agent Technical ReportHaipeng Jiang, Kailong Ren, Zimo Yin et al.
Recent advanced LLM-powered agent systems have exhibited their remarkable capabilities in tackling complex, long-horizon tasks. Nevertheless, they still suffer from inherent limitations in resource efficiency, context management, and multimodal perception. Based on these observations, Lemon Agent is introduced, a multi-agent orchestrator-worker system built on a newly proposed AgentCortex framework, which formalizes the classic Planner-Executor-Memory paradigm through an adaptive task execution mechanism. Our system integrates a hierarchical self-adaptive scheduling mechanism that operates at both the overall orchestrator layer and workers layer. This mechanism can dynamically adjust computational intensity based on task complexity. It enables orchestrator to allocate one or more workers for parallel subtask execution, while workers can further improve operational efficiency by invoking tools concurrently. By virtue of this two-tier architecture, the system achieves synergistic balance between global task coordination and local task execution, thereby optimizing resource utilization and task processing efficiency in complex scenarios. To reduce context redundancy and increase information density during parallel steps, we adopt a three-tier progressive context management strategy. To make fuller use of historical information, we propose a self-evolving memory system, which can extract multi-dimensional valid information from all historical experiences to assist in completing similar tasks. Furthermore, we provide an enhanced MCP toolset. Empirical evaluations on authoritative benchmarks demonstrate that our Lemon Agent can achieve a state-of-the-art 91.36% overall accuracy on GAIA and secures the top position on the xbench-DeepSearch leaderboard with a score of 77+.
LGNov 13, 2023
Probabilistic Physics-integrated Neural Differentiable Modeling for Isothermal Chemical Vapor Infiltration ProcessDeepak Akhare, Zeping Chen, Richard Gulotty et al.
Chemical vapor infiltration (CVI) is a widely adopted manufacturing technique used in producing carbon-carbon and carbon-silicon carbide composites. These materials are especially valued in the aerospace and automotive industries for their robust strength and lightweight characteristics. The densification process during CVI critically influences the final performance, quality, and consistency of these composite materials. Experimentally optimizing the CVI processes is challenging due to long experimental time and large optimization space. To address these challenges, this work takes a modeling-centric approach. Due to the complexities and limited experimental data of the isothermal CVI densification process, we have developed a data-driven predictive model using the physics-integrated neural differentiable (PiNDiff) modeling framework. An uncertainty quantification feature has been embedded within the PiNDiff method, bolstering the model's reliability and robustness. Through comprehensive numerical experiments involving both synthetic and real-world manufacturing data, the proposed method showcases its capability in modeling densification during the CVI process. This research highlights the potential of the PiNDiff framework as an instrumental tool for advancing our understanding, simulation, and optimization of the CVI manufacturing process, particularly when faced with sparse data and an incomplete description of the underlying physics.
LGOct 11, 2025Code
SGM: A Statistical Godel Machine for Risk-Controlled Recursive Self-ModificationXuening Wu, Shenqin Yin, Yanlan Kang et al.
Recursive self-modification is increasingly central in AutoML, neural architecture search, and adaptive optimization, yet no existing framework ensures that such changes are made safely. Godel machines offer a principled safeguard by requiring formal proofs of improvement before rewriting code; however, such proofs are unattainable in stochastic, high-dimensional settings. We introduce the Statistical Godel Machine (SGM), the first statistical safety layer for recursive edits. SGM replaces proof-based requirements with statistical confidence tests (e-values, Hoeffding bounds), admitting a modification only when superiority is certified at a chosen confidence level, while allocating a global error budget to bound cumulative risk across rounds.We also propose Confirm-Triggered Harmonic Spending (CTHS), which indexes spending by confirmation events rather than rounds, concentrating the error budget on promising edits while preserving familywise validity.Experiments across supervised learning, reinforcement learning, and black-box optimization validate this role: SGM certifies genuine gains on CIFAR-100, rejects spurious improvement on ImageNet-100, and demonstrates robustness on RL and optimization benchmarks.Together, these results position SGM as foundational infrastructure for continual, risk-aware self-modification in learning systems.Code is available at: https://github.com/gravitywavelet/sgm-anon.
HCMay 7
Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Systems PerspectiveXuening Wu, Yanlan Kang, Qianya Xu et al.
Large language models (LLMs) are reshaping how knowledge is produced, with increasing reliance on AI systems for generation, summarization, and reasoning. While prior work has studied cognitive offloading in humans and model collapse in recursive training, these effects are typically considered in isolation. We propose a unified perspective: humans and language models form a coupled dynamical system linked by a feedback loop of usage, generation, and retraining. We introduce a minimal model with three variables -- human cognition, data quality, and model capability -- and show that this feedback can give rise to distinct dynamical regimes. Our analysis identifies three regimes: co-evolutionary enhancement, fragile equilibrium, and degenerative convergence. Through a simple simulation, we demonstrate that increasing reliance on AI can induce a transition toward a low-diversity, suboptimal equilibrium. From an information-theoretic perspective, this transition corresponds to an emergent information bottleneck in the human-AI loop, where entropy reduction reflects loss of diversity and support under closed-loop feedback rather than beneficial compression. These results suggest that the trajectory of AI systems is shaped not only by model design, but by the dynamics of human-AI co-evolution.
LGSep 1, 2025
Multi-Modal Machine Learning Framework for Predicting Early Recurrence of Brain Tumors Using MRI and Clinical BiomarkersCheng Cheng, Zeping Chen, Rui Xie et al.
Accurately predicting early recurrence in brain tumor patients following surgical resection remains a clinical challenge. This study proposes a multi-modal machine learning framework that integrates structural MRI features with clinical biomarkers to improve postoperative recurrence prediction. We employ four machine learning algorithms -- Gradient Boosting Machine (GBM), Random Survival Forest (RSF), CoxBoost, and XGBoost -- and validate model performance using concordance index (C-index), time-dependent AUC, calibration curves, and decision curve analysis. Our model demonstrates promising performance, offering a potential tool for risk stratification and personalized follow-up planning.
LGSep 1, 2025
A Multimodal Deep Learning Framework for Early Diagnosis of Liver Cancer via Optimized BiLSTM-AM-VMD ArchitectureCheng Cheng, Zeping Chen, Xavier Wang
This paper proposes a novel multimodal deep learning framework integrating bidirectional LSTM, multi-head attention mechanism, and variational mode decomposition (BiLSTM-AM-VMD) for early liver cancer diagnosis. Using heterogeneous data that include clinical characteristics, biochemical markers, and imaging-derived variables, our approach improves both prediction accuracy and interpretability. Experimental results on real-world datasets demonstrate superior performance over traditional machine learning and baseline deep learning models.
LGApr 19, 2025
Predicting Stress and Damage in Carbon Fiber-Reinforced Composites Deformation Process using Composite U-Net Surrogate ModelZeping Chen, Marwa Yacouti, Maryam Shakiba et al.
Carbon fiber-reinforced composites (CFRC) are pivotal in advanced engineering applications due to their exceptional mechanical properties. A deep understanding of CFRC behavior under mechanical loading is essential for optimizing performance in demanding applications such as aerospace structures. While traditional Finite Element Method (FEM) simulations, including advanced techniques like Interface-enriched Generalized FEM (IGFEM), offer valuable insights, they can struggle with computational efficiency. Existing data-driven surrogate models partially address these challenges by predicting propagated damage or stress-strain behavior but fail to comprehensively capture the evolution of stress and damage throughout the entire deformation history, including crack initiation and propagation. This study proposes a novel auto-regressive composite U-Net deep learning model to simultaneously predict stress and damage fields during CFRC deformation. By leveraging the U-Net architecture's ability to capture spatial features and integrate macro- and micro-scale phenomena, the proposed model overcomes key limitations of prior approaches. The model achieves high accuracy in predicting evolution of stress and damage distribution within the microstructure of a CFRC under unidirectional strain, offering a speed-up of over 60 times compared to IGFEM.