ROApr 28, 2023Code
Towards autonomous system: flexible modular production system enhanced with large language model agentsYuchen Xia, Manthan Shenoy, Nasser Jazdi et al.
In this paper, we present a novel framework that combines large language models (LLMs), digital twins and industrial automation system to enable intelligent planning and control of production processes. We retrofit the automation system for a modular production facility and create executable control interfaces of fine-granular functionalities and coarse-granular skills. Low-level functionalities are executed by automation components, and high-level skills are performed by automation modules. Subsequently, a digital twin system is developed, registering these interfaces and containing additional descriptive information about the production system. Based on the retrofitted automation system and the created digital twins, LLM-agents are designed to interpret descriptive information in the digital twins and control the physical system through service interfaces. These LLM-agents serve as intelligent agents on different levels within an automation system, enabling autonomous planning and control of flexible production. Given a task instruction as input, the LLM-agents orchestrate a sequence of atomic functionalities and skills to accomplish the task. We demonstrate how our implemented prototype can handle un-predefined tasks, plan a production process, and execute the operations. This research highlights the potential of integrating LLMs into industrial automation systems in the context of smart factory for more agile, flexible, and adaptive production processes, while it also underscores the critical insights and limitations for future work. Demos at: https://github.com/YuchenXia/GPT4IndustrialAutomation
SYSep 26, 2024Code
Control Industrial Automation System with Large Language Model AgentsYuchen Xia, Nasser Jazdi, Jize Zhang et al.
Traditional industrial automation systems require specialized expertise to operate and complex reprogramming to adapt to new processes. Large language models offer the intelligence to make them more flexible and easier to use. However, LLMs' application in industrial settings is underexplored. This paper introduces a framework for integrating LLMs to achieve end-to-end control of industrial automation systems. At the core of the framework are an agent system designed for industrial tasks, a structured prompting method, and an event-driven information modeling mechanism that provides real-time data for LLM inference. The framework supplies LLMs with real-time events on different context semantic levels, allowing them to interpret the information, generate production plans, and control operations on the automation system. It also supports structured dataset creation for fine-tuning on this downstream application of LLMs. Our contribution includes a formal system design, proof-of-concept implementation, and a method for generating task-specific datasets for LLM fine-tuning and testing. This approach enables a more adaptive automation system that can respond to spontaneous events, while allowing easier operation and configuration through natural language for more intuitive human-machine interaction. We provide demo videos and detailed data on GitHub: https://github.com/YuchenXia/LLM4IAS.
AIJul 11, 2024Code
Incorporating Large Language Models into Production Systems for Enhanced Task Automation and FlexibilityYuchen Xia, Jize Zhang, Nasser Jazdi et al.
This paper introduces a novel approach to integrating large language model (LLM) agents into automated production systems, aimed at enhancing task automation and flexibility. We organize production operations within a hierarchical framework based on the automation pyramid. Atomic operation functionalities are modeled as microservices, which are executed through interface invocation within a dedicated digital twin system. This allows for a scalable and flexible foundation for orchestrating production processes. In this digital twin system, low-level, hardware-specific data is semantically enriched and made interpretable for LLMs for production planning and control tasks. Large language model agents are systematically prompted to interpret these production-specific data and knowledge. Upon receiving a user request or identifying a triggering event, the LLM agents generate a process plan. This plan is then decomposed into a series of atomic operations, executed as microservices within the real-world automation system. We implement this overall approach on an automated modular production facility at our laboratory, demonstrating how the LLMs can handle production planning and control tasks through a concrete case study. This results in an intuitive production facility with higher levels of task automation and flexibility. Finally, we reveal the several limitations in realizing the full potential of the large language models in autonomous systems and point out promising benefits. Demos of this series of ongoing research series can be accessed at: https://github.com/YuchenXia/GPT4IndustrialAutomation
AIJul 4, 2022
Intelligent Exploration of Solution Spaces Exemplified by Industrial Reconfiguration ManagementTimo Müller, Benjamin Maschler, Daniel Dittler et al.
Many decision-making approaches rely on the exploration of solution spaces with regards to specified criteria. However, in complex environments, brute-force exploration strategies are usually not feasible. As an alternative, we propose the combination of an exploration task's vertical sub-division into layers representing different sequentially interdependent sub-problems of the paramount problem and a horizontal sub-division into self-sustained solution sub-spaces. In this paper, we present a universal methodology for the intelligent exploration of solution spaces and derive a use-case specific example from the field of reconfiguration management in industry 4.0.
59.1SEApr 3
SDVDiag: Using Context-Aware Causality Mining for the Diagnosis of Connected Vehicle FunctionsMatthias Weiß, Falk Dettinger, Elias Detrois et al.
Real-world implementations of connected vehicle functions are spreading steadily, yet operating these functions reliably remains challenging due to their distributed nature and the complexity of the underlying cloud, edge, and networking infrastructure. Quick diagnosis of problems and understanding the error chains that lead to failures is essential for reducing downtime. However, diagnosing these systems is still largely performed manually, as automated analysis techniques are predominantly data-driven and struggle with hidden relationships and the integration of context information. This paper addresses this gap by introducing a multimodal approach that integrates human feedback and system-specific information into the causal analysis process. Reinforcement Learning from Human Feedback is employed to continuously train a causality mining model while incorporating expert knowledge. Additional modules leverage distributed tracing data to prune false-positive causal links and enable the injection of domain-specific relationships to further refine the causal graph.Evaluation is performed using an automated valet parking application operated in a connected vehicle test field. Results demonstrate a significant increase in precision from 14\% to 100\% for the detection of causal edges and improved system interpretability compared to purely data-driven approaches, highlighting the potential for system operators in the connected vehicle domain.
AIMar 25, 2024Code
Generation of Asset Administration Shell with Large Language Model Agents: Toward Semantic Interoperability in Digital Twins in the Context of Industry 4.0Yuchen Xia, Zhewen Xiao, Nasser Jazdi et al.
This research introduces a novel approach for achieving semantic interoperability in digital twins and assisting the creation of Asset Administration Shell (AAS) as digital twin model within the context of Industry 4.0. The foundational idea of our research is that the communication based on semantics and the generation of meaningful textual data are directly linked, and we posit that these processes are equivalent if the exchanged information can be serialized in text form. Based on this, we construct a "semantic node" data structure in our research to capture the semantic essence of textual data. Then, a system powered by large language models is designed and implemented to process the "semantic node" and generate standardized digital twin models from raw textual data collected from datasheets describing technical assets. Our evaluation demonstrates an effective generation rate of 62-79%, indicating a substantial proportion of the information from the source text can be translated error-free to the target digital twin instance model with the generative capability of large language models. This result has a direct application in the context of Industry 4.0, and the designed system is implemented as a data model generation tool for reducing the manual effort in creating AAS model. In our evaluation, a comparative analysis of different LLMs and an in-depth ablation study of Retrieval-Augmented Generation (RAG) mechanisms provide insights into the effectiveness of LLM systems for interpreting technical concepts and translating data. Our findings emphasize LLMs' capability to automate AAS instance creation and contribute to the broader field of semantic interoperability for digital twins in industrial applications. The prototype implementation and evaluation results are presented on our GitHub Repository: https://github.com/YuchenXia/AASbyLLM.
30.7SEApr 29
Towards Intelligent Computation Offloading in Dynamic Vehicular Networks: A Scalable Multilayer PipelineFalk Dettinger, Matthias Weiß, Baran Can Gül et al.
Software Defined Vehicles face an increasing computational gap as advanced algorithms and frequent software updates demand more processing power while onboard hardware remains static throughout a vehicle's 10+ year lifespan. This mismatch threatens the performance of safety-critical functions including advanced driver-assistance systems and real-time perception tasks. We propose a novel four-layer computation offloading pipeline that dynamically distributes vehicular functions to cloud and edge resources while meeting strict Round Trip Time constraints. Our key contribution is an enhanced Particle Swarm Optimization algorithm that integrates distance- and direction-based penalties with functional requirements to optimize edge server selection for mobile vehicles. Evaluation using a Kubernetes-based cloud infrastructure with realistic vehicular mobility patterns demonstrates that our approach reduces average response time compared to conventional Brute-Force methods while maintaining the success rate for latency-critical tasks. The modified Particle Swarm Optimization algorithm achieves an average execution time of 26 ms across ten servers and tasks on Central Processing Unit, and 550ms across 15 servers with 1000 tasks on Graphics Processing Unit. These results confirm the pipeline's effectiveness in bridging the computational gap for next-generation Software Defined Vehicles (SDV).
LGJun 11, 2025
SyncFed: Time-Aware Federated Learning through Explicit Timestamping and SynchronizationBaran Can Gül, Stefanos Tziampazis, Nasser Jazdi et al.
As Federated Learning (FL) expands to larger and more distributed environments, consistency in training is challenged by network-induced delays, clock unsynchronicity, and variability in client updates. This combination of factors may contribute to misaligned contributions that undermine model reliability and convergence. Existing methods like staleness-aware aggregation and model versioning address lagging updates heuristically, yet lack mechanisms to quantify staleness, especially in latency-sensitive and cross-regional deployments. In light of these considerations, we introduce \emph{SyncFed}, a time-aware FL framework that employs explicit synchronization and timestamping to establish a common temporal reference across the system. Staleness is quantified numerically based on exchanged timestamps under the Network Time Protocol (NTP), enabling the server to reason about the relative freshness of client updates and apply temporally informed weighting during aggregation. Our empirical evaluation on a geographically distributed testbed shows that, under \emph{SyncFed}, the global model evolves within a stable temporal context, resulting in improved accuracy and information freshness compared to round-based baselines devoid of temporal semantics.
LGJul 21, 2025
FedMultiEmo: Real-Time Emotion Recognition via Multimodal Federated LearningBaran Can Gül, Suraksha Nadig, Stefanos Tziampazis et al.
In-vehicle emotion recognition underpins adaptive driver-assistance systems and, ultimately, occupant safety. However, practical deployment is hindered by (i) modality fragility - poor lighting and occlusions degrade vision-based methods; (ii) physiological variability - heart-rate and skin-conductance patterns differ across individuals; and (iii) privacy risk - centralized training requires transmission of sensitive data. To address these challenges, we present FedMultiEmo, a privacy-preserving framework that fuses two complementary modalities at the decision level: visual features extracted by a Convolutional Neural Network from facial images, and physiological cues (heart rate, electrodermal activity, and skin temperature) classified by a Random Forest. FedMultiEmo builds on three key elements: (1) a multimodal federated learning pipeline with majority-vote fusion, (2) an end-to-end edge-to-cloud prototype on Raspberry Pi clients and a Flower server, and (3) a personalized Federated Averaging scheme that weights client updates by local data volume. Evaluated on FER2013 and a custom physiological dataset, the federated Convolutional Neural Network attains 77% accuracy, the Random Forest 74%, and their fusion 87%, matching a centralized baseline while keeping all raw data local. The developed system converges in 18 rounds, with an average round time of 120 seconds and a per-client memory footprint below 200 MB. These results indicate that FedMultiEmo offers a practical approach to real-time, privacy-aware emotion recognition in automotive settings.
CVOct 5, 2021
A Methodology to Identify Cognition Gaps in Visual Recognition Applications Based on Convolutional Neural NetworksHannes Vietz, Tristan Rauch, Andreas Löcklin et al.
Developing consistently well performing visual recognition applications based on convolutional neural networks, e.g. for autonomous driving, is very challenging. One of the obstacles during the development is the opaqueness of their cognitive behaviour. A considerable amount of literature has been published which describes irrational behaviour of trained CNNs showcasing gaps in their cognition. In this paper, a methodology is presented that creates worstcase images using image augmentation techniques. If the CNN's cognitive performance on such images is weak while the augmentation techniques are supposedly harmless, a potential gap in the cognition has been found. The presented worst-case image generator is using adversarial search approaches to efficiently identify the most challenging image. This is evaluated with the well-known AlexNet CNN using images depicting a typical driving scenario.
AIJul 7, 2021
Enhancing an Intelligent Digital Twin with a Self-organized Reconfiguration Management based on Adaptive Process ModelsTimo Müller, Benjamin Lindemann, Tobias Jung et al.
Shorter product life cycles and increasing individualization of production leads to an increased reconfiguration demand in the domain of industrial automation systems, which will be dominated by cyber-physical production systems in the future. In constantly changing systems, however, not all configuration alternatives of the almost infinite state space are fully understood. Thus, certain configurations can lead to process instability, a reduction in quality or machine failures. Therefore, this paper presents an approach that enhances an intelligent Digital Twin with a self-organized reconfiguration management based on adaptive process models in order to find optimized configurations more comprehensively.
LGDec 3, 2020
Transfer Learning as an Enabler of the Intelligent Digital TwinBenjamin Maschler, Dominik Braun, Nasser Jazdi et al.
Digital Twins have been described as beneficial in many areas, such as virtual commissioning, fault prediction or reconfiguration planning. Equipping Digital Twins with artificial intelligence functionalities can greatly expand those beneficial applications or open up altogether new areas of application, among them cross-phase industrial transfer learning. In the context of machine learning, transfer learning represents a set of approaches that enhance learning new tasks based upon previously acquired knowledge. Here, knowledge is transferred from one lifecycle phase to another in order to reduce the amount of data or time needed to train a machine learning algorithm. Looking at common challenges in developing and deploying industrial machinery with deep learning functionalities, embracing this concept would offer several advantages: Using an intelligent Digital Twin, learning algorithms can be designed, configured and tested in the design phase before the physical system exists and real data can be collected. Once real data becomes available, the algorithms must merely be fine-tuned, significantly speeding up commissioning and reducing the probability of costly modifications. Furthermore, using the Digital Twin's simulation capabilities virtually injecting rare faults in order to train an algorithm's response or using reinforcement learning, e.g. to teach a robot, become practically feasible. This article presents several cross-phase industrial transfer learning use cases utilizing intelligent Digital Twins. A real cyber physical production system consisting of an automated welding machine and an automated guided vehicle equipped with a robot arm is used to illustrate the respective benefits.