Andreas G. Boudouvis

LG
h-index: 34
4 papers
10 citations
Novelty: 42%
AI Score: 21

4 Papers

LG · Sep 27, 2024
Implementing NLPs in industrial process modeling: Addressing Categorical Variables

Eleni D. Koronaki, Geremy Loachamin Suntaxi, Paris Papavasileiou et al.

Important process variables are often categorical, i.e. names or labels representing, for example, categories of inputs, types of reactors, or a sequence of steps. In this work, we use Natural Language Processing models to derive embeddings of such inputs that represent their actual meaning, or reflect the "distances" between categories, i.e. how similar or dissimilar they are. This is a marked difference from the current standard practice of using binary, or one-hot, encoding to replace categorical variables with sequences of ones and zeros. Combined with dimensionality reduction techniques, either linear, such as Principal Component Analysis, or nonlinear, such as Uniform Manifold Approximation and Projection, the proposed approach leads to a meaningful, low-dimensional feature space. The significance of obtaining meaningful embeddings is illustrated in the context of an industrial coating process for cutting tools that includes both numerical and categorical inputs. In this industrial process, subject matter expertise suggests that the categorical inputs are critical for determining the final outcome, but this cannot be taken into account with the current state of the art. The proposed approach enables feature importance analysis, which is a marked improvement compared to the current state of the art in the encoding of categorical variables. The proposed approach is not limited to the case study presented here and is suitable for applications with a similar mix of categorical and numerical critical inputs.
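The contrast between one-hot encoding and meaning-aware embeddings can be illustrated with a minimal sketch. The category labels and the embedding vectors below are made up for illustration (in practice the vectors would come from a language model), and the PCA is a plain NumPy SVD rather than the authors' pipeline.

```python
import numpy as np

# Hypothetical category labels; not taken from the paper.
labels = ["reactor_A", "reactor_A_mod", "reactor_B"]

# One-hot encoding: every pair of distinct categories is equally far apart.
one_hot = np.eye(len(labels))

# Placeholder "embeddings" (assumed values). The first two labels are
# deliberately given similar vectors to mimic semantically close categories.
embeddings = np.array([
    [0.90, 0.10, 0.00, 0.20],
    [0.85, 0.15, 0.05, 0.25],
    [0.10, 0.90, 0.70, 0.00],
])

def pairwise_distances(x):
    """Euclidean distance between every pair of rows."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pca(x, n_components):
    """Project centered rows onto the leading principal directions."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

d_one_hot = pairwise_distances(one_hot)
d_embed = pairwise_distances(embeddings)
reduced = pca(embeddings, n_components=2)

print("one-hot distances:\n", d_one_hot)
print("embedding distances:\n", d_embed)
print("2-D PCA coordinates:\n", reduced)
```

With one-hot encoding all off-diagonal distances are identical, so no notion of similarity survives; with the assumed embeddings, the two related labels stay close, and the low-dimensional projection preserves that.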

NA · Jun 18, 2010
A Hybrid Boundary Element Method for Elliptic Problems with Singularities

George Pashos, Athanasios G. Papathanasiou, Andreas G. Boudouvis

The singularities that arise in elliptic boundary value problems are treated locally by a singular function boundary integral method. This method extracts the leading singular coefficients from a series expansion that describes the local behavior of the singularity. The method is fitted into the framework of the widely used boundary element method (BEM), forming a hybrid technique, with the BEM computing the solution away from the singularity. Results of the hybrid technique are reported for the Motz problem and compared with the results of the standalone BEM and Galerkin/finite element method (GFEM). The comparison is made in terms of the total flux (i.e. the capacitance in the case of electrostatic problems) on the Dirichlet boundary adjacent to the singularity, which is essentially the integral of the normal derivative of the solution. The hybrid method manages to reduce the error in the computed capacitance by a factor of 10, with respect to the BEM and GFEM.
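For orientation, the local series expansion commonly used for the Motz problem in the literature can be written as below. The notation (polar coordinates $(r,\theta)$ centered at the singular point, coefficients $\alpha_j$, Dirichlet boundary $S_D$) is an assumption and may differ from the paper's own.

```latex
% Local solution near the singular point (Laplace equation, u = 0 on \theta = \pi,
% \partial u / \partial \theta = 0 on \theta = 0):
u(r,\theta) \;=\; \sum_{j=1}^{\infty} \alpha_j \, r^{\,j-\frac{1}{2}}
                  \cos\!\left[\left(j-\tfrac{1}{2}\right)\theta\right]

% Total flux (capacitance) on the Dirichlet boundary S_D adjacent to the singularity:
C \;=\; \int_{S_D} \frac{\partial u}{\partial n}\, \mathrm{d}s
```

Each term is harmonic and satisfies both boundary conditions adjacent to the singular point, and the $\alpha_j$ play the role of the singular coefficients mentioned in the abstract. The leading term behaves like $r^{1/2}$, so the normal derivative is unbounded as $r \to 0$.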

LG · May 13, 2024
Integrating supervised and unsupervised learning approaches to unveil critical process inputs

Paris Papavasileiou, Dimitrios G. Giovanis, Gabriele Pozzetti et al.

This study introduces a machine learning framework tailored to large-scale industrial processes characterized by a plethora of numerical and categorical inputs. The framework aims to (i) discern critical parameters influencing the output and (ii) generate accurate out-of-sample qualitative and quantitative predictions of production outcomes. Specifically, we address the pivotal question of the significance of each input in shaping the process outcome, using an industrial Chemical Vapor Deposition (CVD) process as an example. The initial objective involves merging subject matter expertise and clustering techniques applied exclusively to the process output, here coating thickness measurements at various positions in the reactor. This approach identifies groups of production runs that share similar qualitative characteristics, such as film mean thickness and standard deviation. In particular, the differences between the outcomes represented by the different clusters can be attributed to differences in specific inputs, indicating that these inputs are critical for the production outcome. Leveraging this insight, we subsequently implement supervised classification and regression methods using the identified critical process inputs. The proposed methodology proves to be valuable in scenarios with a multitude of inputs and insufficient data for the direct application of deep learning techniques, providing meaningful insights into the underlying processes.
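The idea of clustering on outputs only and then tracing cluster differences back to inputs can be sketched as follows. Everything here is an assumption made for illustration: the synthetic data, the two hypothetical inputs `x0` and `x1`, the simple two-cluster k-means, and the separation score. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs = 200

# Two hypothetical inputs: x0 (a binary recipe flag) drives the outcome,
# x1 is unrelated noise.
x0 = rng.integers(0, 2, size=n_runs)
x1 = rng.normal(size=n_runs)

# Synthetic coating thickness at 5 reactor positions per run.
thickness = 5.0 + 2.0 * x0[:, None] + 0.1 * rng.normal(size=(n_runs, 5))

# Output-only features: mean thickness and standard deviation per run.
features = np.column_stack([thickness.mean(axis=1), thickness.std(axis=1)])

def kmeans2(x, n_iter=20):
    """Two-cluster k-means with a deterministic farthest-point start."""
    c0 = x[0]
    c1 = x[np.argmax(((x - c0) ** 2).sum(axis=1))]
    centers = np.stack([c0, c1])
    for _ in range(n_iter):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = d.argmin(axis=1)
        centers = np.stack([x[assign == k].mean(axis=0) for k in range(2)])
    return assign

def separation(inp, assign):
    """Gap between cluster means of an input, in units of its overall std."""
    a, b = inp[assign == 0], inp[assign == 1]
    return abs(a.mean() - b.mean()) / (inp.std() + 1e-12)

assign = kmeans2(features)
scores = np.array([separation(x0, assign), separation(x1, assign)])
print("separation score per input (x0, x1):", scores)
```

An input whose values differ strongly between output clusters (here `x0`) is a candidate critical input; such inputs could then be passed to a supervised classifier or regressor, as the abstract describes.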

CHEM-PH · May 24, 2024
Discovering deposition process regimes: leveraging unsupervised learning for process insights, surrogate modeling, and sensitivity analysis

Geremy Loachamín Suntaxi, Paris Papavasileiou, Eleni D. Koronaki et al.

This work introduces a comprehensive approach utilizing data-driven methods to elucidate the deposition process regimes in Chemical Vapor Deposition (CVD) reactors and the interplay of the physical mechanisms that dominate in each of them. Through this work, we address three key objectives. Firstly, our methodology relies on process outcomes, derived from a detailed CFD model, to identify clusters of "outcomes" corresponding to distinct process regimes, wherein the relative influence of input variables undergoes notable shifts. This phenomenon is experimentally validated through Arrhenius plot analysis, affirming the efficacy of our approach. Secondly, we demonstrate the development of an efficient surrogate model, based on Polynomial Chaos Expansion (PCE), that maintains accuracy, facilitating streamlined computational analyses. Finally, as a result of PCE, sensitivity analysis is made possible by means of Sobol' indices, which quantify the impact of process inputs across the identified regimes. The insights gained from our analysis contribute to the formulation of hypotheses regarding phenomena occurring beyond the transition regime. Notably, the significance of temperature even in the diffusion-limited regime, as evidenced by the Arrhenius plot, suggests activation of gas phase reactions at elevated temperatures. Importantly, our proposed methods yield insights that align with experimental observations and theoretical principles, aiding decision-making in process design and optimization. By circumventing the need for costly and time-consuming experiments, our approach offers a pragmatic pathway towards enhanced process efficiency. Moreover, this study underscores the potential of data-driven computational methods for innovating reactor design paradigms.
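The link between a PCE surrogate and Sobol' indices can be sketched in a few lines: with an orthonormal polynomial basis, the output variance is the sum of squared non-constant coefficients, and a first-order index collects the terms that involve a single input. The toy model, the uniform inputs on [-1, 1], the Legendre basis, and the least-squares fit are all assumptions for illustration, not the CFD-based setup of the paper.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(x, degree):
    """Legendre polynomial scaled to unit variance for X ~ Uniform(-1, 1)."""
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    return np.sqrt(2 * degree + 1) * legendre.legval(x, coeffs)

def fit_pce(x, y, max_degree):
    """Least-squares PCE with total-degree truncation."""
    n_inputs = x.shape[1]
    multi_indices = [
        m for m in itertools.product(range(max_degree + 1), repeat=n_inputs)
        if sum(m) <= max_degree
    ]
    design = np.column_stack([
        np.prod([orthonormal_legendre(x[:, j], d) for j, d in enumerate(m)], axis=0)
        for m in multi_indices
    ])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return multi_indices, coeffs

def first_order_sobol(multi_indices, coeffs, n_inputs):
    """First-order Sobol' indices read directly from the PCE coefficients."""
    variance = sum(c ** 2 for m, c in zip(multi_indices, coeffs) if any(m))
    indices = []
    for i in range(n_inputs):
        partial = sum(
            c ** 2 for m, c in zip(multi_indices, coeffs)
            if m[i] > 0 and all(d == 0 for j, d in enumerate(m) if j != i)
        )
        indices.append(partial / variance)
    return np.array(indices)

# Hypothetical toy model standing in for an expensive simulator.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(500, 2))
y = x[:, 0] + 0.5 * x[:, 1] ** 2

multi_indices, coeffs = fit_pce(x, y, max_degree=2)
sobol = first_order_sobol(multi_indices, coeffs, n_inputs=2)
print(sobol)  # approximately [0.9375, 0.0625], the analytic values for this toy model
```

Because the indices come from the coefficients alone, no additional model evaluations are needed once the surrogate is fitted, which is what makes the PCE route to sensitivity analysis inexpensive.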