Mark Fuge

LG
h-index: 22
20 papers
412 citations
Novelty: 47%
AI Score: 55

20 Papers

AI | May 19 | Code
EngiAI: A Multi-Agent Framework and Benchmark Suite for LLM-Driven Engineering Design

Gioele Molinari, Florian Felten, Soheyl Massoudi et al.

Large Language Model (LLM) agents are increasingly applied to engineering design tasks, yet existing evaluation frameworks do not adequately address multi-agent systems that combine simulation, retrieval, and manufacturing preparation. We introduce a benchmark suite with three evaluation dimensions: (1) a workflow benchmark with seven prompt styles targeting distinct cognitive demands, including direct tool use, semantic disambiguation, conditional branching, and working-memory tasks; (2) a Retrieval-Augmented Generation (RAG) benchmark with gated scoring isolating retrieval contributions to parameter selection; and (3) a High-Performance Computing (HPC) benchmark evaluating end-to-end ML training orchestration on a SLURM cluster. Alongside the benchmark, we present EngiAI, a Multi-Agent System (MAS) reference implementation built on LangGraph that operationalizes the benchmark by coordinating seven specialized agents through a supervisor architecture, unifying topology optimization, document retrieval, HPC job orchestration, and 3D printer control. Across four LLM backends and two EngiBench problems, proprietary models achieve 96-97% average task completion on Beams2D, while open-source 4B-parameter models reach 55-78%, with clear generational improvement. Conditional branching proves most challenging, with task completion dropping to 20-53% for the conditional style on Photonics2D. RAG gating confirms near-perfect retrieval-augmented scores ($\approx 1.0$) versus near-zero without retrieval, validating the evaluation design. On HPC orchestration, one model completes all pipeline steps in 100% of runs while another drops to 50%, revealing that multi-step instruction following degrades over long-running workflows.

AI | Apr 5
2026 Roadmap on Artificial Intelligence and Machine Learning for Smart Manufacturing

Jay Lee, Hanqi Su, Marco Macchi et al.

The evolution of artificial intelligence (AI) and machine learning (ML) is reshaping smart manufacturing by providing new capabilities for efficiency, adaptability, and autonomy across industrial value chains. However, the deployment of AI and ML in industrial settings still faces critical challenges, including the complexity of industrial big data, effective data management, integration with heterogeneous sensing and control systems, and the demand for trustworthy, explainable, and reliable operation in high-stakes industrial environments. In this roadmap, we present a comprehensive perspective on the foundations, applications, and emerging directions of AI and ML in smart manufacturing. It is structured in three parts. The first highlights the foundations and trends that frame the evolution of AI in smart manufacturing. The second focuses on key topics where AI is already enabling advances, including industrial big data analytics, advanced sensing and perception, autonomous systems, additive and laser-based manufacturing, digital twins, robotics, supply chain and logistics optimization, and sustainable manufacturing. The third section explores non-traditional ML approaches that are opening new frontiers, such as physics-informed AI, generative AI, semantic AI, advanced digital twins, explainable AI, RAMS, data-centric metrology, LLMs, and foundation models for highly connected and complex manufacturing systems. By identifying both opportunities and remaining barriers across these areas, this roadmap outlines the advances needed in methods, integration strategies, and industrial adoption. We hope this roadmap will serve as a guide for researchers, engineers, and practitioners to accelerate innovation, align academic and industrial priorities, and ensure that AI-driven smart manufacturing delivers reliable, sustainable, and scalable impact for the future of manufacturing ecosystems.

LG | Aug 16, 2024
Inverse design with conditional cascaded diffusion models

Milad Habibi, Mark Fuge

Adjoint-based design optimizations are usually computationally expensive and those costs scale with resolution. To address this, researchers have proposed machine learning approaches for inverse design that can predict higher-resolution solutions from lower cost/resolution ones. Due to the recent success of diffusion models over traditional generative models, we extend the use of diffusion models for multi-resolution tasks by proposing the conditional cascaded diffusion model (cCDM). Compared to GANs, cCDM is more stable to train, and each diffusion model within the cCDM can be trained independently, thus each model's parameters can be tuned separately to maximize the performance of the pipeline. Our study compares cCDM against a cGAN model with transfer learning. Our results demonstrate that the cCDM excels in capturing finer details, preserving volume fraction constraints, and minimizing compliance errors in multi-resolution tasks when a sufficient amount of high-resolution training data (more than 102 designs) is available. Furthermore, we explore the impact of training data size on the performance of both models. While both models show decreased performance with reduced high-resolution training data, the cCDM loses its superiority to the cGAN model with transfer learning when training data is limited (less than 102), and we show the break-even point for this transition. Also, we highlight that while the diffusion model may achieve better pixel-wise performance in both low-resolution and high-resolution scenarios, this does not necessarily guarantee that the model produces optimal compliance error or constraint satisfaction.

LG | May 18
Beyond Inference-Time Search: Reinforcement Learning Synthesizes Reusable Solvers

Soheyl Massoudi, Gabriel Apaza, Milad Habibi et al.

Large language models (LLMs) typically approach combinatorial optimization as an inference-time procedure, solving each instance separately through sampling, search, or repeated prompting. We ask whether reinforcement learning can instead shift part of this reasoning cost into the weights of a code LLM, so that the model synthesizes a reusable solver for an entire problem family. We study this question on Synergistic Dependency Selection (SDS), a controlled variant of constrained Quadratic Knapsack designed to expose a specific failure mode: local signals and strict feasibility constraints make greedy heuristics attractive but unreliable. Under identical scaffolding, Best-of-64 base-model sampling saturates at an approximately 28.7% gap to the global Virtual Best Solver (VBS); code audits show that the base model often retrieves Simulated Annealing templates but misimplements the Metropolis acceptance rule. We fine-tune Qwen2.5-Coder-14B-Instruct with Group Relative Policy Optimization (GRPO) using a feasibility-gated reward and light structural scaffolding. The resulting policy converges to a constraint-aware Simulated Annealing template in 99.8% of feasible SDS outputs, achieves a 5.0% gap to that VBS, and is 91 times cheaper in post-generation execution/search cost than cumulative Best-of-64 evaluation. A compile-once check shows that one best frozen solver per seed remains highly competitive when reused unchanged across the SDS test set, while an additional-domain evaluation on Job Shop Scheduling provides narrower but positive evidence that the scaffold transfers beyond SDS. Negative ablations reveal the limits of this recipe: standard stabilizers degrade performance, a soft feasibility gate fails, and results remain sensitive to reward normalization and domain-specific design choices.
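
The Metropolis acceptance rule that the audited base-model solvers reportedly misimplement can be sketched in a few lines. This is a minimal illustration under assumptions, not the solver from the paper: the function name `metropolis_accept`, the minimization convention (`delta` is the change in cost), and the temperature handling are all hypothetical choices.

```python
import math
import random


def metropolis_accept(delta, temperature, rng=random):
    """Decide whether to accept a move that changes the cost by `delta`.

    Assumes a minimization problem: negative `delta` means improvement.
    """
    if delta <= 0:
        return True  # improving or neutral moves are always accepted
    if temperature <= 0:
        return False  # at zero temperature, worsening moves are never accepted
    # Worsening moves are accepted with probability exp(-delta / T).
    return rng.random() < math.exp(-delta / temperature)
```

A common mistake is to flip the sign inside the exponential or to reject every worsening move, which turns simulated annealing into plain greedy descent.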

CE | Jun 2, 2025 | Code
EngiBench: A Framework for Data-Driven Engineering Design Research

Florian Felten, Gabriel Apaza, Gerhard Bräunlich et al.

Engineering design optimization seeks to automatically determine the shapes, topologies, or parameters of components that maximize performance under given conditions. This process often depends on physics-based simulations, which are difficult to install, computationally expensive, and require domain-specific expertise. To mitigate these challenges, we introduce EngiBench, the first open-source library and datasets spanning diverse domains for data-driven engineering design. EngiBench provides a unified API and a curated set of benchmarks -- covering aeronautics, heat conduction, photonics, and more -- that enable fair, reproducible comparisons of optimization and machine learning algorithms, such as generative or surrogate models. We also release EngiOpt, a companion library offering a collection of such algorithms compatible with the EngiBench interface. Both libraries are modular, letting users plug in novel algorithms or problems, automate end-to-end experiment workflows, and leverage built-in utilities for visualization, dataset generation, feasibility checks, and performance analysis. We demonstrate their versatility through experiments comparing state-of-the-art techniques across multiple engineering design problems, an undertaking that was previously prohibitively time-consuming to perform. Finally, we show that these problems pose significant challenges for standard machine learning methods due to highly sensitive and constrained design manifolds.

GR | Feb 18, 2025
GrainPaint: A multi-scale diffusion-based generative model for microstructure reconstruction of large-scale objects

Nathan Hoffman, Cashen Diniz, Dehao Liu et al.

Simulation-based approaches to microstructure generation can suffer from a variety of limitations, such as high memory usage, long computational times, and difficulties in generating complex geometries. Generative machine learning models present a way around these issues, but they have previously been limited by the fixed size of their generation area. We present a new microstructure generation methodology leveraging advances in inpainting using denoising diffusion models to overcome this generation area limitation. We show that microstructures generated with the presented methodology are statistically similar to grain structures generated with a kinetic Monte Carlo simulator, SPPARKS.

AI | Jul 11, 2025
Agentic Large Language Models for Conceptual Systems Engineering and Design

Soheyl Massoudi, Mark Fuge

Early-stage engineering design involves complex, iterative reasoning, yet existing large language model (LLM) workflows struggle to maintain task continuity and generate executable models. We evaluate whether a structured multi-agent system (MAS) can more effectively manage requirements extraction, functional decomposition, and simulator code generation than a simpler two-agent system (2AS). The target application is a solar-powered water filtration system as described in a cahier des charges. We introduce the Design-State Graph (DSG), a JSON-serializable representation that bundles requirements, physical embodiments, and Python-based physics models into graph nodes. A nine-role MAS iteratively builds and refines the DSG, while the 2AS collapses the process to a Generator-Reflector loop. Both systems run a total of 60 experiments (2 LLMs, Llama 3.3 70B and reasoning-distilled DeepSeek R1 70B, × 2 agent configurations × 3 temperatures × 5 seeds). We report JSON validity, requirement coverage, embodiment presence, code compatibility, workflow completion, runtime, and graph size. Across all runs, both MAS and 2AS maintained perfect JSON integrity and embodiment tagging. Requirement coverage remained minimal (less than 20%). Code compatibility peaked at 100% under specific 2AS settings but averaged below 50% for MAS. Only the reasoning-distilled model reliably flagged workflow completion. Powered by DeepSeek R1 70B, the MAS generated more granular DSGs (average 5-6 nodes) whereas 2AS mode-collapsed. Structured multi-agent orchestration enhanced design detail. The reasoning-distilled LLM improved completion rates, yet low requirement coverage and fidelity gaps in the generated code persisted.
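
To make the idea of a JSON-serializable graph whose nodes bundle requirements, an embodiment, and a Python physics model more concrete, here is a rough sketch. The abstract does not give the DSG schema, so every class and field name below (`DesignNode`, `DesignStateGraph`, `physics_model`, and so on) is a hypothetical stand-in.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DesignNode:
    # Hypothetical fields; the real DSG schema is not given in the abstract.
    node_id: str
    requirements: list
    embodiment: str
    physics_model: str  # Python source code stored as plain text


@dataclass
class DesignStateGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, src, dst):
        self.edges.append([src, dst])

    def to_json(self):
        # asdict recurses into the nested dataclasses, dicts, and lists.
        return json.dumps(asdict(self), sort_keys=True)


graph = DesignStateGraph()
graph.add_node(DesignNode("panel", ["supply power"], "PV panel", "def power(area): return 200.0 * area"))
graph.add_node(DesignNode("pump", ["move water"], "DC pump", "def flow(p): return 0.01 * p"))
graph.add_edge("panel", "pump")
serialized = graph.to_json()
```

A validity check in this spirit could be as simple as confirming that `json.loads(serialized)` succeeds and that every node carries a non-empty embodiment.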

LG | Apr 27, 2024
Least Volume Analysis

Qiuyi Chen, Cashen Diniz, Mark Fuge

This paper introduces Least Volume (LV)--a simple yet effective regularization method inspired by geometric intuition--that reduces the number of latent dimensions required by an autoencoder without prior knowledge of the dataset's intrinsic dimensionality. We show that its effectiveness depends on the Lipschitz continuity of the decoder, prove that Principal Component Analysis (PCA) is a linear special case, and demonstrate that LV induces a PCA-like importance ordering in nonlinear models. We extend LV to non-Euclidean settings as Generalized Least Volume (GLV), enabling the integration of label information into the latent representation. To support implementation, we also develop an accompanying Dynamic Pruning algorithm. We evaluate LV on several benchmark problems, demonstrating its effectiveness in dimension reduction. Leveraging this, we reveal the role of low-dimensional latent spaces in data sampling and disentangled representation, and use them to probe the varying topological complexity of various datasets. GLV is further applied to labeled datasets, where it induces a contrastive learning effect in representations of discrete labels. On a continuous-label airfoil dataset, it produces representations that lead to smooth changes in aerodynamic performance, thereby stabilizing downstream optimization.

RO | Oct 10, 2025
Autonomous Soft Robotic Guidewire Navigation via Imitation Learning

Noah Barnes, Ji Woong Kim, Lingyun Di et al.

In endovascular surgery, interventionists push a thin tube called a catheter, guided by a thin wire, to a treatment site inside the patient's blood vessels to treat various conditions such as blood clots, aneurysms, and malformations. Guidewires with robotic tips can enhance maneuverability, but they present challenges in modeling and control. Automation of soft robotic guidewire navigation has the potential to overcome these challenges, increasing the precision and safety of endovascular navigation. In other surgical domains, end-to-end imitation learning has shown promising results. Thus, we develop a transformer-based imitation learning framework with goal conditioning, relative action outputs, and automatic contrast dye injections to enable generalizable soft robotic guidewire navigation in an aneurysm targeting task. We train the model on 36 different modular bifurcated geometries, generating 647 total demonstrations under simulated fluoroscopy, and evaluate it on three previously unseen vascular geometries. The model can autonomously drive the tip of the robot to the aneurysm location with a success rate of 83% on the unseen geometries, outperforming several baselines. In addition, we present ablation and baseline studies to evaluate the effectiveness of each design and data collection choice. Project website: https://softrobotnavigation.github.io/

LG | May 22, 2024
Bayesian Inverse Problems with Conditional Sinkhorn Generative Adversarial Networks in Least Volume Latent Spaces

Qiuyi Chen, Panagiotis Tsilifis, Mark Fuge

Solving inverse problems in scientific and engineering fields has long been intriguing and holds great potential for many applications, yet most techniques still struggle to address issues such as high dimensionality, nonlinearity and model uncertainty inherent in these problems. Recently, generative models such as Generative Adversarial Networks (GANs) have shown great potential in approximating complex high dimensional conditional distributions and have paved the way for characterizing posterior densities in Bayesian inverse problems, yet the problems' high dimensionality and high nonlinearity often impede the model's training. In this paper we show how to tackle these issues with Least Volume--a novel unsupervised nonlinear dimension reduction method--that can learn to represent the given datasets with the minimum number of latent variables while estimating their intrinsic dimensions. Once the low dimensional latent spaces are identified, efficient and accurate training of conditional generative models becomes feasible, resulting in a latent conditional GAN framework for posterior inference. We demonstrate the power of the proposed methodology on a variety of applications including inversion of parameters in systems of ODEs and high dimensional hydraulic conductivities in subsurface flow problems, and reveal the impact of the observables' and unobservables' intrinsic dimensions on inverse problems.

CE | Mar 3, 2021
IH-GAN: A Conditional Generative Model for Implicit Surface-Based Inverse Design of Cellular Structures

Jun Wang, Wei Wayne Chen, Daicong Da et al.

Variable-density cellular structures can overcome connectivity and manufacturability issues of topologically optimized structures, particularly those represented as discrete density maps. However, the optimization of such cellular structures is challenging due to the multiscale design problem. Past work addressing this problem generally either only optimizes the volume fraction of single-type unit cells but ignores the effects of unit cell geometry on properties, or considers the geometry-property relation but builds this relation via heuristics. In contrast, we propose a simple yet more principled way to accurately model the property to geometry mapping using a conditional deep generative model, named Inverse Homogenization Generative Adversarial Network (IH-GAN). It learns the conditional distribution of unit cell geometries given properties and can realize the one-to-many mapping from properties to geometries. We further reduce the complexity of IH-GAN by using the implicit function parameterization to represent unit cell geometries. Results show that our method can 1) generate various unit cells that satisfy given material properties with high accuracy ($R^2$-scores between target properties and properties of generated unit cells $>98\%$) and 2) improve the optimized structural performance over the conventional variable-density single-type structure. In the minimum compliance example, our IH-GAN generated structure achieves a $79.7\%$ reduction in concentrated stress and an extra $3.03\%$ reduction in displacement. In the target deformation examples, our IH-GAN generated structure reduces the target matching error by $86.4\%$ and $79.6\%$ for two test cases, respectively. We also demonstrated that the connectivity issue for multi-type unit cells can be solved by transition layer blending.

CE | Jun 21, 2020
Airfoil Design Parameterization and Optimization using Bézier Generative Adversarial Networks

Wei Chen, Kevin Chiu, Mark Fuge

Global optimization of aerodynamic shapes usually requires a large number of expensive computational fluid dynamics simulations because of the high dimensionality of the design space. One approach to combat this problem is to reduce the design space dimension by obtaining a new representation. This requires a parametric function that compactly and sufficiently describes useful variation in shapes. We propose a deep generative model, Bézier-GAN, to parameterize aerodynamic designs by learning from shape variations in an existing database. The resulting parameterization can accelerate design optimization convergence by improving the representation compactness while maintaining sufficient representation capacity. We use the airfoil design as an example to demonstrate the idea and analyze Bézier-GAN's representation capacity and compactness. Results show that Bézier-GAN both (1) learns smooth and realistic shape representations for a wide range of airfoils and (2) empirically accelerates optimization convergence by at least two times compared to state-of-the-art parameterization methods.

AI | Feb 25, 2020
Forming Diverse Teams from Sequentially Arriving People

Faez Ahmed, John Dickerson, Mark Fuge

Collaborative work often benefits from having teams or organizations with heterogeneous members. In this paper, we present a method to form such diverse teams from people arriving sequentially over time. We define a monotone submodular objective function that combines the diversity and quality of a team and propose an algorithm to maximize the objective while satisfying multiple constraints. This allows us to balance both how diverse the team is and how well it can perform the task at hand. Using crowd experiments, we show that, in practice, the algorithm leads to large gains in team diversity. Using simulations, we show how to quantify the additional cost of forming diverse teams and how to address the problem of simultaneously maximizing diversity for several attributes (e.g., country of origin, gender). Our method has applications in collaborative work ranging from team formation and the assignment of workers to teams in crowdsourcing to the allocation of reviewers to journal papers arriving sequentially. Our code is publicly accessible for further research.

OC | Jan 12, 2020
Adaptive Expansion Bayesian Optimization for Unbounded Global Optimization

Wei Chen, Mark Fuge

Bayesian optimization is normally performed within fixed variable bounds. In cases like hyperparameter tuning for machine learning algorithms, setting the variable bounds is not trivial. It is hard to guarantee that any fixed bounds will include the true global optimum. We propose a Bayesian optimization approach that only needs to specify an initial search space that does not necessarily include the global optimum, and expands the search space when necessary. However, over-exploration may occur during the search space expansion. Our method can adaptively balance exploration and exploitation in an expanding space. Results on a range of synthetic test functions and an MLP hyperparameter optimization task show that the proposed method outperforms or is at least as good as the current state-of-the-art methods.

AI | Sep 7, 2019
An Algorithm for Multi-Attribute Diverse Matching

Saba Ahmadi, Faez Ahmed, John P. Dickerson et al.

Bipartite b-matching, where agents on one side of a market are matched to one or more agents or items on the other, is a classical model that is used in myriad application areas such as healthcare, advertising, education, and general resource allocation. Traditionally, the primary goal of such models is to maximize a linear function of the constituent matches (e.g., linear social welfare maximization) subject to some constraints. Recent work has studied a new goal of balancing whole-match diversity and economic efficiency, where the objective is instead a monotone submodular function over the matching. Basic versions of this problem are solvable in polynomial time. In this work, we prove that the problem of simultaneously maximizing diversity along several features (e.g., country of citizenship, gender, skills) is NP-hard. To address this problem, we develop the first combinatorial algorithm that constructs provably-optimal diverse b-matchings in pseudo-polynomial time. We also provide a Mixed-Integer Quadratic formulation for the same problem and show that our method guarantees optimal solutions and takes less computation time for a reviewer assignment application.

LG | Aug 27, 2018
BézierGAN: Automatic Generation of Smooth Curves from Interpretable Low-Dimensional Parameters

Wei Chen, Mark Fuge

Many real-world objects are designed using smooth curves, especially in the aerospace and shipbuilding domains, where aerodynamic shapes (e.g., airfoils) and hydrodynamic shapes (e.g., hulls) are designed. To facilitate the design process of those objects, we propose a deep-learning-based generative model that can synthesize smooth curves. The model maps a low-dimensional latent representation to a sequence of discrete points sampled from a rational Bézier curve. We demonstrate the performance of our method in completing both synthetic and real-world generative tasks. Results show that our method can generate diverse and realistic curves, while preserving consistent shape variation in the latent space, which is favorable for latent space design optimization or design space exploration.
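
For readers unfamiliar with the output layer described above, the following sketch shows how discrete points can be sampled from a rational Bézier curve given control points and weights. It covers only the curve evaluation, not the generative network; the function name `rational_bezier` and the uniform parameter spacing are assumptions made for illustration (the abstract does not say how the parameter values are chosen).

```python
from math import comb, sqrt


def rational_bezier(control_points, weights, num_samples=50):
    """Sample points on a rational Bezier curve (num_samples must be >= 2)."""
    n = len(control_points) - 1  # curve degree
    dim = len(control_points[0])
    curve = []
    for j in range(num_samples):
        t = j / (num_samples - 1)  # uniform parameter values in [0, 1]
        # Weighted Bernstein basis: w_i * C(n, i) * t^i * (1 - t)^(n - i)
        basis = [
            w * comb(n, i) * t**i * (1.0 - t) ** (n - i)
            for i, w in enumerate(weights)
        ]
        total = sum(basis)
        point = tuple(
            sum(b * p[k] for b, p in zip(basis, control_points)) / total
            for k in range(dim)
        )
        curve.append(point)
    return curve


# A quadratic rational Bezier curve with these weights traces a quarter circle.
arc = rational_bezier([(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)], [1.0, sqrt(2) / 2, 1.0])
```

Because the weights enter as a ratio, a rational curve can represent conic sections exactly, which a polynomial Bézier curve of the same degree cannot.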

SI | Jan 30, 2018
Creative Exploration Using Topic Based Bisociative Networks

Faez Ahmed, Mark Fuge

Bisociative knowledge discovery is an approach that combines elements from two or more "incompatible" domains to generate creative solutions and insight. Inspired by Koestler's notion of bisociation, in this paper we propose a computational framework for the discovery of new connections between domains to promote creative discovery and inspiration in design. Specifically, we propose using topic models on a large collection of unstructured text ideas from multiple domains to discover creative sources of inspiration. We use these topics to generate a Bisociative Information Network---a graph that captures conceptual similarity between ideas---that helps designers find creative links within that network. Using a dataset of thousands of ideas from OpenIDEO, an online collaborative community, our results show the usefulness of representing conceptual bridges through collections of words (topics) in finding cross-domain inspiration. We show that the discovered links between domains, whether presented on their own or via ideas they inspired, are perceived to be more novel and can also be used as creative stimuli for new idea generation.

IR | Sep 7, 2017
Ranking ideas for diversity and quality

Faez Ahmed, Mark Fuge

When selecting ideas or trying to find inspiration, designers often must sift through hundreds or thousands of ideas. This paper provides an algorithm to rank design ideas such that the ranked list simultaneously maximizes the quality and diversity of recommended designs. To do so, we first define and compare two diversity measures using Determinantal Point Processes (DPP) and additive submodular functions. We show that DPPs are more suitable for items expressed as text and that a greedy algorithm diversifies rankings with both theoretical guarantees and empirical performance on what is otherwise an NP-Hard problem. To produce such rankings, this paper contributes a novel way to extend quality and diversity metrics from sets to permutations of ranked lists. These rank metrics open up the use of multi-objective optimization to describe trade-offs between diversity and quality in ranked lists. We use such trade-off fronts to help designers select rankings using indifference curves. However, we also show that rankings on the trade-off front share a number of top-ranked items; this means reviewing items (for a given depth like the top 10) from across the entire diversity-to-quality front incurs only a marginal increase in the number of designs considered. While the proposed techniques are general purpose enough to be used across domains, we demonstrate concrete performance on selecting items in an online design community (OpenIDEO), where our approach reduces the time required to review diverse, high-quality ideas from around 25 hours to 90 minutes. This makes evaluation of crowd-generated ideas tractable for a single designer. Our code is publicly accessible for further research.
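
The greedy ranking idea can be illustrated with a toy objective. The sketch below is not the paper's DPP-based measure: it swaps in a simple coverage-style stand-in (item quality plus a bonus for each topic not yet covered), and the names `greedy_rank`, `tradeoff`, and the example ideas are all hypothetical.

```python
def greedy_rank(items, tradeoff=1.0):
    """Rank items by greedy marginal gain of quality plus topic coverage.

    `items` maps a name to (quality, set_of_topics). The coverage bonus is
    an illustrative stand-in for a diversity measure, not the paper's DPP.
    """
    remaining = dict(items)
    covered = set()
    ranking = []
    while remaining:
        def gain(name):
            quality, topics = remaining[name]
            # Marginal gain: own quality plus reward for newly covered topics.
            return quality + tradeoff * len(topics - covered)

        best = max(remaining, key=gain)
        ranking.append(best)
        covered |= remaining.pop(best)[1]
    return ranking


ideas = {
    "a": (0.9, {"water"}),
    "b": (0.8, {"water"}),
    "c": (0.5, {"energy", "health"}),
}
diverse_first = greedy_rank(ideas, tradeoff=1.0)  # ["c", "a", "b"]
quality_only = greedy_rank(ideas, tradeoff=0.0)   # ["a", "b", "c"]
```

Sweeping `tradeoff` between these extremes produces a family of rankings, which is one simple way to picture a diversity-to-quality trade-off front.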

LG | Aug 25, 2017
Active Expansion Sampling for Learning Feasible Domains in an Unbounded Input Space

Wei Chen, Mark Fuge

Many engineering problems require identifying feasible domains under implicit constraints. One example is finding acceptable car body styling designs based on constraints like aesthetics and functionality. Current active-learning based methods learn feasible domains for bounded input spaces. However, we usually lack prior knowledge about how to set those input variable bounds. Bounds that are too small will fail to cover all feasible domains, while bounds that are too large will waste query budget. To avoid this problem, we introduce Active Expansion Sampling (AES), a method that identifies (possibly disconnected) feasible domains over an unbounded input space. AES progressively expands our knowledge of the input space, and uses successive exploitation and exploration stages to switch between learning the decision boundary and searching for new feasible domains. We show that AES has a misclassification loss guarantee within the explored region, independent of the number of iterations or labeled samples. Thus it can be used for real-time prediction of samples' feasibility within the explored region. We evaluate AES on three test examples and compare AES with two adaptive sampling methods -- the Neighborhood-Voronoi algorithm and the straddle heuristic -- that operate over fixed input variable bounds.

DS | Feb 23, 2017
Diverse Weighted Bipartite b-Matching

Faez Ahmed, John P. Dickerson, Mark Fuge

Bipartite matching, where agents on one side of a market are matched to agents or items on the other, is a classical problem in computer science and economics, with widespread application in healthcare, education, advertising, and general resource allocation. A practitioner's goal is typically to maximize a matching market's economic efficiency, possibly subject to some fairness requirements that promote equal access to resources. A natural balancing act exists between fairness and efficiency in matching markets, and has been the subject of much research. In this paper, we study a complementary goal---balancing diversity and efficiency---in a generalization of bipartite matching where agents on one side of the market can be matched to sets of agents on the other. Adapting a classical definition of the diversity of a set, we propose a quadratic programming-based approach to solving a supermodular minimization problem that balances diversity and total weight of the solution. We also provide a scalable greedy algorithm with theoretical performance bounds. We then define the price of diversity, a measure of the efficiency loss due to enforcing diversity, and give a worst-case theoretical bound. Finally, we demonstrate the efficacy of our methods on three real-world datasets, and show that the price of diversity is not bad in practice.