MA · Mar 29

Sci-Mind: Cognitively-Inspired Adversarial Debate for Autonomous Mathematical Modeling

Ruiying Sun, Wenjing Wang, Qinhan Chen, Yanhui Song, Huangwei Chen, Haotong Luan, Junhao Jia

arXiv:2603.27584 · 65.7 · h-index: 3

Predicted impact: top 33% in MA · last 90 days · Originality: Incremental advance

AI Analysis

For researchers and practitioners using LLMs for scientific modeling, Sci-Mind targets a common failure mode: models that look plausible but are fundamentally flawed. It counters this by adding domain grounding and adversarial verification.

Sci-Mind introduces a cognitively-inspired adversarial debate framework for autonomous mathematical modeling, integrating experiential memory recall and a Theorist-Pragmatist dialectic to improve model validity. On MM-Bench and EngiBench, the authors report that it significantly outperforms leading agents in modeling rigor and code executability.

Real-world mathematical modeling is inherently an experiential and collaborative endeavor. Domain experts rarely solve complex problems from scratch; instead, they draw upon analogies from historical cases and subject their hypotheses to rigorous peer scrutiny. However, autonomous agents powered by Large Language Models predominantly rely on isolated reasoning paradigms, frequently generating plausible but fundamentally flawed models due to a lack of domain grounding and adversarial verification. To address these limitations, we propose Sci-Mind, a novel framework that mirrors the human scientific discovery process. Sci-Mind integrates Experiential Memory Recall to retrieve executable code snippets and modeling paradigm descriptors, grounding abstract reasoning in historical solutions. Subsequently, it employs an Adversarial Cognitive Dialectic where a Theorist optimizing mathematical coherence and a Pragmatist enforcing data feasibility debate through competing objectives to prune elegant but infeasible formulations. A Self-Validating Execution Strategy further ensures blueprint consistency through formal predicates before code generation, achieving fully autonomous execution. Extensive experiments on the MM-Bench and EngiBench benchmarks demonstrate that Sci-Mind significantly outperforms leading autonomous agents in both modeling rigorousness and code executability.
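The three stages named in the abstract (memory recall, Theorist-Pragmatist debate, predicate validation before code generation) can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`Case`, `Candidate`, `recall`, `debate`, `validate`), the token-overlap similarity, the numeric coherence score, and the data-availability feasibility check are assumptions chosen only to make the control flow concrete. In the paper the two debaters are LLM agents, not fixed scoring rules.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Case:
    descriptor: str  # modeling paradigm descriptor of a past solution
    snippet: str     # placeholder for an executable code snippet


@dataclass
class Candidate:
    name: str
    coherence: float     # hypothetical stand-in for the Theorist's objective
    required_data: set   # data fields this formulation would need


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (an assumed retrieval metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def recall(problem: str, memory: list, k: int = 1) -> list:
    """Return the k stored cases most similar to the problem statement."""
    ranked = sorted(memory, key=lambda c: jaccard(problem, c.descriptor), reverse=True)
    return ranked[:k]


def debate(candidates: list, available: set) -> Optional[Candidate]:
    """Pragmatist prunes formulations whose data is unavailable;
    Theorist then picks the most coherent survivor."""
    feasible = [c for c in candidates if c.required_data <= available]
    return max(feasible, key=lambda c: c.coherence, default=None)


def validate(blueprint: dict, predicates: list) -> list:
    """Return the names of predicates the blueprint fails (empty means consistent)."""
    return [name for name, pred in predicates if not pred(blueprint)]


# Hypothetical toy data, for illustration only.
memory = [
    Case("epidemic spread differential equations", "def sir_step(s, i, r): ..."),
    Case("route scheduling integer programming", "def solve_routes(graph): ..."),
]
cases = recall("forecast epidemic spread in a city", memory)

available = {"case_counts", "population"}
candidates = [
    Candidate("agent_based_contact_network", 0.9, {"case_counts", "contact_traces"}),
    Candidate("sir_ode", 0.7, {"case_counts", "population"}),
]
chosen = debate(candidates, available)

blueprint = {"model": chosen.name, "variables": ["S", "I", "R"], "data": sorted(available)}
predicates = [
    ("has_variables", lambda b: len(b["variables"]) > 0),
    ("data_covers_requirements", lambda b: chosen.required_data <= set(b["data"])),
]
failed = validate(blueprint, predicates)
```

In this toy run the agent-based formulation has the higher coherence score but is pruned because `contact_traces` is not available, which mirrors the abstract's "elegant but infeasible" case. Code generation would proceed only when `failed` is empty.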
