CL Nov 4, 2022
KGLM: Integrating Knowledge Graph Structure in Language Models for Link Prediction
Jason Youn, Ilias Tagkopoulos
The ability of knowledge graphs to represent complex relationships at scale has led to their adoption for various needs including knowledge representation, question-answering, and recommendation systems. Knowledge graphs are often incomplete in the information they represent, necessitating knowledge graph completion tasks. Pre-trained and fine-tuned language models have shown promise in these tasks, although these models ignore the intrinsic information encoded in the knowledge graph, namely the entity and relation types. In this work, we propose the Knowledge Graph Language Model (KGLM) architecture, where we introduce a new entity/relation embedding layer that learns to differentiate distinctive entity and relation types, thereby allowing the model to learn the structure of the knowledge graph. We show that further pre-training the language models with this additional embedding layer using the triples extracted from the knowledge graph, followed by the standard fine-tuning phase, sets a new state of the art for the link prediction task on the benchmark datasets.
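As a rough illustration of what an entity/relation type embedding could look like, the sketch below tags every token of a triple with a type id and adds a per-type vector to each token vector. The three type ids, the whitespace tokenizer, and the function names are assumptions made for this example; the abstract does not specify the actual type set or tokenizer.

```python
# Hypothetical type ids; the abstract does not list the actual type set.
HEAD, RELATION, TAIL = 0, 1, 2


def triple_type_ids(head, relation, tail):
    """Tokenize a triple and tag each token with the type of its source field.

    Whitespace splitting stands in for a real subword tokenizer.
    """
    tokens, type_ids = [], []
    for text, type_id in ((head, HEAD), (relation, RELATION), (tail, TAIL)):
        for token in text.split():
            tokens.append(token)
            type_ids.append(type_id)
    return tokens, type_ids


def add_type_embeddings(token_vectors, type_ids, type_table):
    """Add the type vector selected by each type id to its token vector."""
    return [
        [t + e for t, e in zip(vector, type_table[type_id])]
        for vector, type_id in zip(token_vectors, type_ids)
    ]
```

In a trained model the rows of `type_table` would be learned parameters, summed with the usual token and position embeddings before the first transformer layer.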
CE Apr 25
Artificial Intelligence for Food Innovation
Bianca Datta, Markus J. Buehler, Yvonne Chow et al.
Global food systems must deliver nutritious, sustainable foods while sharply reducing environmental impact. Yet, food innovation remains slow, empirical, and fragmented. Artificial intelligence (AI) offers a transformative path to link molecular composition to functional performance, connect chemical structure to sensory outcomes, and accelerate cross-disciplinary innovation across the production pipeline. While broadly applicable to food systems, we focus on sustainable proteins--plant-based, fermentation-derived, and cultivated--as a high-impact testbed for AI-driven closed-loop design. We review the applications, opportunities, and challenges of AI for Food as an emerging discipline that integrates ingredient design, formulation development, fermentation and production, texture analysis, sensory science, manufacturing, and recipe generation. We identify four priorities: advancing scientific machine learning with embedded domain priors, treating food as a programmable biomaterial, building self-driving laboratories for automated discovery, and developing deep reasoning models that integrate nutrition and sustainability. Integrating AI responsibly into the food innovation cycle can accelerate the transition to sustainable food systems and establish a predictive, design-driven science of food for human and planetary health.
CL Jan 24, 2023
Semi-Automated Construction of Food Composition Knowledge Base
Jason Youn, Fangzhou Li, Ilias Tagkopoulos
A food composition knowledge base, which stores the essential phyto-, micro-, and macro-nutrients of foods, is useful for both research and industrial applications. Although many existing knowledge bases attempt to curate such information, they are often limited by time-consuming manual curation processes. Outside of the food science domain, natural language processing methods that utilize pre-trained language models have recently shown promising results for extracting knowledge from unstructured text. In this work, we propose a semi-automated framework for constructing a knowledge base of food composition from the scientific literature available online. To this end, we utilize a pre-trained BioBERT language model in an active learning setup that allows the optimal use of limited training data. Our work demonstrates how human-in-the-loop models are a step toward AI-assisted food systems that scale well to ever-increasing volumes of data.
CE Apr 22
Predicting food taste with bound-driven optimization
Pagkratis Tagkopoulos, Dimitris Sfondilis, Ilias Tagkopoulos et al.
The prediction of sensory attributes from ingredient-level formulations is an emerging challenge at the intersection of food science and artificial intelligence. We address the fundamental question of whether the taste of a food can be predicted from its ingredients by treating recipes as composite materials. We apply Hashin-Shtrikman (HS) and Reuss-Voigt (RV) bounds, techniques originally developed for elastic moduli, to predict five taste dimensions (sweetness, sourness, bitterness, umami, saltiness) on a curated dataset of 70 recipes decomposed into 209 ingredient-level taste references with trained-panel ground truth. The bounds provide an additive baseline but systematically under-predict perceived taste: 77% of actual taste values exceeded the HS upper bound, with the exceedance rate ranging from 26% (bitterness) to 97% (saltiness). We traced this gap to specific processing chemistry (Maillard reactions, caramelization, evaporative concentration, protein hydrolysis, and nucleotide synergy) and introduced a hybrid model that augments the HS baseline with eight chemistry-proxy features encoding these mechanisms. Our results show that our interpretable hybrid model eliminates the systematic bias and reduces mean absolute error by 27-62% for sweetness, sourness, umami, and saltiness while using only 10 interpretable features, achieving performance comparable to a black-box Lasso regression on 115 per-ingredient features. We further demonstrate constrained inverse design via Differential Evolution, recovering ingredient formulations that match target taste profiles subject to compositional bounds.
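For readers unfamiliar with composite bounds, the following is a minimal sketch of the Voigt (arithmetic-mean, upper) and Reuss (harmonic-mean, lower) bounds applied to a mixture. Treating a per-ingredient taste intensity as the mixed property, and the function names themselves, are assumptions for illustration; the tighter Hashin-Shtrikman bounds are not shown.

```python
def voigt_bound(fractions, values):
    """Upper (Voigt) bound: fraction-weighted arithmetic mean.

    `fractions` are assumed to be non-negative and to sum to 1.
    """
    return sum(f * v for f, v in zip(fractions, values))


def reuss_bound(fractions, values):
    """Lower (Reuss) bound: fraction-weighted harmonic mean.

    If any component that is present has a value of zero, the harmonic
    mean collapses to zero, so that case is handled explicitly.
    """
    present = [(f, v) for f, v in zip(fractions, values) if f > 0]
    if any(v == 0 for _, v in present):
        return 0.0
    return 1.0 / sum(f / v for f, v in present)
```

For an equal-parts mixture of two ingredients with intensities 2.0 and 8.0, the Voigt bound is 5.0 and the Reuss bound is 3.2. A measured value above 5.0 would be the kind of exceedance the abstract reports.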
AI Feb 13
Translating Dietary Standards into Healthy Meals with Minimal Substitutions
Trevor Chan, Ilias Tagkopoulos
An important goal for personalized diet systems is to improve nutritional quality without compromising convenience or affordability. We present an end-to-end framework that converts dietary standards into complete meals with minimal change. Using the What We Eat in America (WWEIA) intake data for 135,491 meals, we identify 34 interpretable meal archetypes that we then use to condition a generative model and a portion predictor to meet USDA nutritional targets. In comparisons within archetypes, generated meals follow recommended daily intake (RDI) targets 47.0% more closely while remaining compositionally close to real meals. Our results show that allowing one to three food substitutions yields meals that are 10% more nutritious while reducing costs by 19-32% on average. By turning dietary guidelines into realistic, budget-aware meals and simple swaps, this framework can underpin clinical decision support, public-health programs, and consumer apps that deliver scalable, equitable improvements in everyday nutrition.
CY Nov 17, 2025
The Future of Food: How Artificial Intelligence is Transforming Food Manufacturing
Xu Zhou, Ivor Prado, AIFPDS participants et al.
Artificial intelligence is accelerating a new era of food innovation, connecting data from farm to consumer to improve formulation, processing, and health outcomes. Recent advances in deep learning, natural language processing, and multi-omics integration make it possible to understand and optimize food systems with unprecedented depth. However, AI adoption across the food sector remains uneven due to heterogeneous datasets, limited model and system interoperability, and a persistent skills gap between data scientists and food domain experts. To address these challenges and advance responsible innovation, the AI Institute for Next Generation Food Systems (AIFS) convened the inaugural AI for Food Product Development Symposium at the University of California, Davis, in October 2025. This white paper synthesizes insights from the symposium, organized around five domains where AI can have the greatest near-term impact: supply chain; formulation and processing; consumer insights and sensory prediction; nutrition and health; and education and workforce development. Across these areas, participants emphasized the importance of interoperable data standards, transparent and interpretable models, and cross-sector collaboration to accelerate the translation of AI research into practice. The discussions further highlighted the need for robust digital infrastructure, privacy-preserving data-sharing mechanisms, and interdisciplinary training pathways that integrate AI literacy with domain expertise. Collectively, the priorities outline a roadmap for integrating AI into food manufacturing in ways that enhance innovation, sustainability, and human well-being while ensuring that technological progress remains grounded in ethics, scientific rigor, and societal benefit.
AI Apr 8, 2025
SkillFlow: Efficient Skill and Code Transfer Through Communication in Adapting AI Agents
Pagkratios Tagkopoulos, Fangzhou Li, Ilias Tagkopoulos
AI agents are autonomous systems that can execute specific tasks based on predefined programming. Here, we present SkillFlow, a modular, technology-agnostic framework that allows agents to expand their functionality in an ad-hoc fashion by acquiring new skills from their environment or other agents. We present a theoretical model that examines under which conditions this framework would be beneficial, and we then explore SkillFlow's ability to accelerate task completion and lead to lower cumulative costs in a real-world application, namely scheduling agents for calendar events. We demonstrate that within a few iterations, SkillFlow leads to considerable (24.8%, p-value = $6.4\times10^{-3}$) gains in time and cost, especially when the communication cost is high. Finally, we draw analogies from well-studied biological systems and compare this framework to that of lateral gene transfer, a significant process of adaptation and evolution in novel environments.
LG Oct 17, 2024
Interpreting Inflammation Prediction Model via Tag-based Cohort Explanation
Fanyu Meng, Jules Larke, Xin Liu et al.
Machine learning is revolutionizing nutrition science by enabling systems to learn from data and make intelligent decisions. However, the complexity of these models often leads to challenges in understanding their decision-making processes, necessitating the development of explainability techniques to foster trust and increase model transparency. An under-explored type of explanation is cohort explanation, which provides explanations for groups of instances with similar characteristics. Unlike traditional methods that focus on individual explanations or global model behavior, cohort explainability bridges the gap by providing unique insights at an intermediate granularity. We propose a novel framework for identifying cohorts within a dataset based on local feature importance scores, aiming to generate concise descriptions of the clusters via tags. We evaluate our framework on a food-based inflammation prediction model and demonstrate that the framework can generate reliable explanations that match domain knowledge.
CV Jun 1, 2020
BWCNN: Blink to Word, a Real-Time Convolutional Neural Network Approach
Albara Ah Ramli, Rex Liu, Rahul Krishnamoorthy et al.
Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease of the brain and the spinal cord, which leads to paralysis of motor functions. Patients retain their ability to blink, which can be used for communication. Here, we present an Artificial Intelligence (AI) system that uses eye-blinks to communicate with the outside world, running on real-time Internet-of-Things (IoT) devices. The system uses a Convolutional Neural Network (CNN) to find the blinking pattern, which is defined as a series of Open and Closed states. Each pattern is mapped to a collection of words that manifest the patient's intent. To find the best trade-off between accuracy and latency, we evaluated several convolutional network architectures, such as ResNet, SqueezeNet, DenseNet, and InceptionV3. We found that the InceptionV3 architecture, after hyper-parameter fine-tuning on the specific task, led to the best performance with an accuracy of 99.20% and 94ms latency. This work demonstrates how the latest advances in deep learning architectures can be adapted for clinical systems that ameliorate the patient's quality of life regardless of the point-of-care.
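The step from per-frame classifier output to words can be sketched as follows. The pattern table and both function names are hypothetical, and the sketch assumes the CNN has already labelled each video frame as "Open" or "Closed"; the abstract does not give the actual pattern vocabulary.

```python
from itertools import groupby

# Hypothetical pattern-to-words table, for illustration only.
PATTERN_TO_WORDS = {
    ("Open", "Closed", "Open"): ["yes"],
    ("Open", "Closed", "Open", "Closed", "Open"): ["no"],
}


def collapse_states(frame_states):
    """Merge runs of identical per-frame labels into one state each."""
    return tuple(state for state, _ in groupby(frame_states))


def decode(frame_states):
    """Return the words for the collapsed pattern, or [] if it is unknown."""
    return PATTERN_TO_WORDS.get(collapse_states(frame_states), [])
```

A real system would likely also use the duration of each run, for example to separate deliberate long blinks from involuntary ones; this sketch discards that information.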
LG Nov 18, 2015
A Distribution Adaptive Framework for Prediction Interval Estimation Using Nominal Variables
Ameen Eetemadi, Ilias Tagkopoulos
Proposed methods for prediction interval estimation so far focus on cases where input variables are numerical. In datasets with solely nominal input variables, we observe records with the exact same input $x^u$, but different real valued outputs due to the inherent noise in the system. Existing prediction interval estimation methods do not use representations that can accurately model such inherent noise in the case of nominal inputs. We propose a new prediction interval estimation method tailored for this type of data, which is prevalent in biology and medicine. We call this method Distribution Adaptive Prediction Interval Estimation given Nominal inputs (DAPIEN); it has four main phases. First, we select a distribution function that can best represent the inherent noise of the system for all unique inputs. Second, we infer the parameters $θ_i$ (e.g. $θ_i=[mean_i, variance_i]$) of the selected distribution function for all unique input vectors $x^u_i$ and generate a new corresponding training set using pairs of $x^u_i, θ_i$. Third, we train a model to predict $θ$ given a new $x^u$. Finally, we calculate the prediction interval for a new sample using the inverse of the cumulative distribution function once the parameters $θ$ are predicted by the trained model. We compared DAPIEN to the commonly used Bootstrap method on three synthetic datasets. Our results show that DAPIEN provides tighter prediction intervals while preserving the requested coverage when compared to Bootstrap. This work can facilitate broader usage of regression methods in medicine and biology where it is necessary to provide tight prediction intervals while preserving coverage when input variables are nominal.
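The four phases can be sketched with the Python standard library as below. This is an illustration under stated assumptions, not the authors' implementation: the distribution is fixed to a Gaussian (phase 1), the parameters are a mean and standard deviation (phase 2), a nearest-neighbor lookup by Hamming distance stands in for the trained model (phase 3), and all function names are hypothetical.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def fit_parameters(records):
    """Phase 2: estimate (mean, std) for each unique nominal input.

    `records` is an iterable of (input_tuple, output) pairs. Inputs seen
    fewer than twice are skipped because a sample std needs two points.
    """
    groups = defaultdict(list)
    for x, y in records:
        groups[tuple(x)].append(y)
    return {x: (mean(ys), stdev(ys)) for x, ys in groups.items() if len(ys) >= 2}


def predict_parameters(theta_by_input, x_new):
    """Phase 3 stand-in: reuse the parameters of the closest known input.

    Closeness is the Hamming distance between nominal vectors; the
    abstract's trained model would replace this lookup.
    """
    x_new = tuple(x_new)
    nearest = min(
        theta_by_input,
        key=lambda x: sum(a != b for a, b in zip(x, x_new)),
    )
    return theta_by_input[nearest]


def prediction_interval(theta, coverage=0.95):
    """Phase 4: central interval from the inverse CDF of the Gaussian.

    Requires std > 0; NormalDist.inv_cdf raises an error otherwise.
    """
    mu, sigma = theta
    dist = NormalDist(mu, sigma)
    alpha = 1.0 - coverage
    return dist.inv_cdf(alpha / 2), dist.inv_cdf(1 - alpha / 2)
```

Typical use chains the three calls: `prediction_interval(predict_parameters(fit_parameters(records), x_new), coverage=0.95)`.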