CL · AI · Nov 1, 2025

Data Analysis and Performance Evaluation of Simulation Deduction Based on LLMs

arXiv:2511.10651v1 · 1 citation · h-index: 1 · Proceedings of the 2025 International Symposium on Artificial Intelligence and Computational Social Sciences

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient and accurate data analysis in military simulation, though it appears incremental as it builds on existing LLM capabilities with structured prompting and tools.

The paper tackles the problem of generating high-quality analysis reports for simulation deduction in warfare. It proposes a method that decomposes the task into sub-tasks, conducts multi-round interactions with LLMs, and invokes custom tools, producing reports with higher quality scores than a baseline.

Data analysis and performance evaluation of simulation deduction play a pivotal role in modern warfare, enabling military personnel to gain valuable insights into the potential effectiveness of different strategies, tactics, and operational plans. The traditional manual analysis approach is time-consuming and prone to human error. To enhance efficiency and accuracy, large language models (LLMs) with strong analytical and inference capabilities can be employed. However, a high-quality, well-structured analysis report cannot be obtained from a single instruction to the LLM. To tackle this issue, we propose a method that first decomposes the complex task into several sub-tasks and designs effective system prompts and user prompts for each sub-task. Multi-round interactions with the LLM, incorporating self-check and reflection, are then conducted to enable structured data extraction as well as multi-step analysis and evaluation. Furthermore, custom tools are defined and invoked to generate figures and compute metrics. We also design multiple report templates, each tailored to a specific application and input data type, ensuring adaptability across a variety of scenarios. Extensive evaluation results demonstrate that the reports generated by our method are of higher quality and therefore obtain higher scores than those of the baseline method.
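
The workflow described in the abstract (task decomposition, per-sub-task prompts, multi-round self-check, custom tools, report templates) might be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the names `SubTask`, `run_subtask`, `build_report`, and `mean_metric`, the reviewer prompt, and the "reply OK" convention are all assumptions made for this example, and the LLM is passed in as a plain function so that no particular vendor API is implied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interface: any function mapping (system_prompt, user_prompt) to reply text.
LLM = Callable[[str, str], str]

# Assumed reviewer prompt for the self-check round; not taken from the paper.
REVIEW_PROMPT = "You are a strict reviewer. Reply OK or list the problems."


@dataclass
class SubTask:
    name: str
    system_prompt: str
    user_template: str  # filled with {data} and {context}


def mean_metric(values: List[float]) -> float:
    """Example custom tool: compute a metric in code instead of asking the LLM."""
    return sum(values) / len(values) if values else 0.0


def run_subtask(llm: LLM, task: SubTask, data: str, context: str,
                max_rounds: int = 3) -> str:
    """Draft an answer, then alternate self-check and revision up to max_rounds drafts."""
    user_prompt = task.user_template.format(data=data, context=context)
    draft = llm(task.system_prompt, user_prompt)
    for _ in range(max_rounds - 1):
        verdict = llm(REVIEW_PROMPT, draft)
        if verdict.strip().upper().startswith("OK"):
            break  # the self-check passed, keep the current draft
        # Reflection step: feed the reviewer feedback back into the same sub-task.
        draft = llm(
            task.system_prompt,
            f"{user_prompt}\n\nPrevious draft:\n{draft}\n\n"
            f"Reviewer feedback:\n{verdict}\n\nRevise the draft.",
        )
    return draft


def build_report(llm: LLM, tasks: List[SubTask], data: str, template: str) -> str:
    """Run sub-tasks in order, pass earlier results forward, and fill the template."""
    context = ""
    sections: Dict[str, str] = {}
    for task in tasks:
        sections[task.name] = run_subtask(llm, task, data, context)
        context += f"\n[{task.name}]\n{sections[task.name]}"
    return template.format(**sections)
```

In this sketch, a report template is just a format string with one placeholder per sub-task name (for example `"Extraction:\n{extract}\n\nEvaluation:\n{evaluate}"`), and tool outputs such as `mean_metric(...)` would be computed in code and embedded in the `data` string rather than left for the model to calculate.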

