Designing Any Imaging System from Natural Language: Agent-Constrained Composition over a Finite Primitive Basis
This work addresses the expertise bottleneck in computational imaging, which could give the broader scientific community access to prototyping imaging instruments. The contribution appears incremental: it builds on existing concepts and adds a novel automated approach.
The paper addresses the design of computational imaging systems, which typically takes weeks of specialist effort per modality. It introduces an automated pipeline, built on the spec.md specification format and three agents (Plan, Judge, and Execute), that translates a one-sentence natural-language description into a validated forward model with bounded reconstruction error. The pipeline matches expert-library quality (98.1 +/- 4.2%) on 6 real-data modalities, and 10 novel designs demonstrate its compositional reach.
Designing a computational imaging system -- selecting operators, setting parameters, validating consistency -- requires weeks of specialist effort per modality, creating an expertise bottleneck that excludes the broader scientific community from prototyping imaging instruments. We introduce spec.md, a structured specification format, and three autonomous agents -- Plan, Judge, and Execute -- that translate a one-sentence natural-language description into a validated forward model with bounded reconstruction error. A design-to-real error theorem decomposes total reconstruction error into five independently bounded terms, each linked to a corrective action. On 6 real-data modalities spanning all 5 carrier families, the automated pipeline matches expert-library quality (98.1 +/- 4.2%). Ten novel designs -- composing primitives into chains from 3D to 5D -- demonstrate compositional reach beyond any single-modality tool.
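The abstract says the error theorem splits total reconstruction error into five independently bounded terms, each linked to a corrective action. A minimal sketch of how such an error budget could be tracked follows. It assumes the decomposition is additive, which the abstract does not state. The class `ErrorTerm`, the functions `total_bound` and `dominant_term`, the term and action names, and all numeric values are hypothetical placeholders, not the paper's definitions or results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ErrorTerm:
    """One bounded contribution to the reconstruction error (hypothetical structure)."""
    name: str               # placeholder label; the abstract does not name the terms
    bound: float            # upper bound on this term, in arbitrary units
    corrective_action: str  # placeholder for the action linked to this term

    def __post_init__(self) -> None:
        # A bound on an error magnitude cannot be negative.
        if self.bound < 0:
            raise ValueError(f"bound for {self.name} must be non-negative")


def total_bound(terms: Sequence[ErrorTerm]) -> float:
    """Sum the per-term bounds (assumes an additive decomposition)."""
    return sum(t.bound for t in terms)


def dominant_term(terms: Sequence[ErrorTerm]) -> ErrorTerm:
    """Return the term with the largest bound, i.e. the first one worth correcting."""
    return max(terms, key=lambda t: t.bound)


# Illustrative values only; they are not results from the paper.
budget = [
    ErrorTerm("term_1", 4.0, "action_1"),
    ErrorTerm("term_2", 1.0, "action_2"),
    ErrorTerm("term_3", 2.0, "action_3"),
    ErrorTerm("term_4", 0.5, "action_4"),
    ErrorTerm("term_5", 0.5, "action_5"),
]

worst = dominant_term(budget)
print(f"total bound: {total_bound(budget)}")
print(f"largest term: {worst.name} -> {worst.corrective_action}")
```

Keeping each term paired with its corrective action means a design that exceeds its error target can be traced to the largest contributor and the matching fix, which is the practical use the abstract attributes to the decomposition.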