AI · Sep 22, 2025

An N-Plus-1 GPT Agency for Critical Solution of Mechanical Engineering Analysis Problems

arXiv:2509.18229v1
h-index: 37
Originality: Incremental advance
AI Analysis

The paper addresses the unreliability of AI-generated solutions in mechanical engineering education and practice. The advance is incremental because it builds on existing multi-agent concepts.

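The paper tackles the unreliability of GPT in solving mechanical engineering problems by introducing an "N-Plus-1" GPT Agency: N independent Agent Solve instances each propose a solution, and one Agent Compare instance summarizes and compares the proposals and recommends a solution. The authors argue from Condorcet's Jury Theorem that, when the per-Solve success probability exceeds 1/2 and N is sufficiently large, the predominant proposed solution is correct with high probability.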

Generative AI, and specifically GPT, can produce a remarkable solution to a mechanical engineering analysis problem - but also, on occasion, a flawed solution. For example, an elementary mechanics problem is solved flawlessly in one GPT instance and incorrectly in a subsequent GPT instance, with a success probability of only 85%. This unreliability renders "out-of-the-box" GPT unsuitable for deployment in education or engineering practice. We introduce an "N-Plus-1" GPT Agency for Initial (Low-Cost) Analysis of mechanical engineering Problem Statements. Agency first launches N instantiations of Agent Solve to yield N independent Proposed Problem Solution Realizations; Agency then invokes Agent Compare to summarize and compare the N Proposed Problem Solution Realizations and to provide a Recommended Problem Solution. We argue from Condorcet's Jury Theorem that, for a Problem Statement characterized by per-Solve success probability greater than 1/2 (and N sufficiently large), the Predominant (Agent Compare) Proposed Problem Solution will, with high probability, correspond to a Correct Proposed Problem Solution. Furthermore, Agent Compare can also incorporate aspects of Secondary (Agent Compare) Proposed Problem Solutions, in particular when the latter represent alternative Problem Statement interpretations - different Mathematical Models - or alternative Mathematical Solution Procedures. Comparisons to Grok Heavy, a commercial multi-agent model, show similarities in design and performance, but also important differences in emphasis: our Agency focuses on transparency and pedagogical value.
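The Condorcet argument in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a simplified binary model in which each of the N solves is independently correct with the same probability p, and it computes the probability that a strict majority of solves is correct (a sufficient condition for the correct solution to be the predominant one). The function name `majority_correct_probability` is hypothetical.

```python
from math import comb


def majority_correct_probability(p: float, n: int) -> float:
    """Probability that more than half of n independent solves are correct.

    Assumption (not from the paper): every solve succeeds independently
    with the same probability p, so the number of correct solves is
    Binomial(n, p).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    threshold = n // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    # p = 0.85 is the per-solve success probability quoted in the abstract.
    for n in (1, 3, 5, 9):
        print(n, majority_correct_probability(0.85, n))
```

Under these assumptions the majority probability rises with N when p exceeds 1/2 and falls with N when p is below 1/2, which is why the abstract conditions the argument on a per-Solve success probability greater than 1/2.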

