CL Sep 19, 2024
FoodPuzzle: Developing Large Language Model Agents as Flavor Scientists
Tenghao Huang, Donghee Lee, John Sweeney et al.
Flavor development in the food industry is increasingly challenged by the need for rapid innovation and precise flavor profile creation. Traditional flavor research methods typically rely on iterative, subjective testing, which lacks the efficiency and scalability required for modern demands. This paper presents three contributions to address these challenges. First, we define a new problem domain for scientific agents in flavor science, conceptualized as the generation of hypotheses for flavor profile sourcing and understanding. Second, to facilitate research in this area, we introduce FoodPuzzle, a challenging benchmark consisting of 978 food items and 1,766 flavor molecule profiles. Third, we propose a novel Scientific Agent approach that integrates in-context learning and retrieval-augmented techniques to generate grounded hypotheses in the domain of food science. Experimental results indicate that our model significantly surpasses traditional methods in flavor profile prediction tasks, demonstrating its potential to transform flavor development practices.
CV Jan 20
CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models
Donghee Lee, Rui Cai, Zhe Zhao
Recent advancements in Large Vision-Language Models (LVLMs) have pushed them closer to becoming general-purpose assistants. Despite their strong performance, LVLMs still struggle with vision-centric tasks such as image classification, underperforming compared to their base vision encoders, which are often CLIP-based models. To address this limitation, we propose Context-Aware Image Representation Prioritization via Ensemble (CARPE), a novel, model-agnostic framework which introduces vision-integration layers and a context-aware ensemble strategy to identify when to prioritize image representations or rely on the reasoning capabilities of the language model. This design enhances the model's ability to adaptively weight visual and textual modalities and enables the model to capture various aspects of image representations, leading to consistent improvements in generalization across classification and vision-language benchmarks. Extensive experiments demonstrate that CARPE not only improves performance on image classification benchmarks but also enhances results across various vision-language benchmarks. Finally, CARPE is designed to be effectively integrated with most open-source LVLMs that consist of a vision encoder and a language model, ensuring its adaptability across diverse architectures.
LG Jan 9
GlueNN: gluing patchwise analytic solutions with neural networks
Doyoung Kim, Donghee Lee, Hye-Sung Lee et al.
In the analysis of complex physical systems, the objective often extends beyond merely computing a numerical solution to capturing the precise crossover between different regimes and extracting parameters containing meaningful information. However, standard numerical solvers and conventional deep learning approaches, such as Physics-Informed Neural Networks (PINNs), typically operate as black boxes that output solution fields without disentangling the solution into its interpretable constituent parts. In this work, we propose GlueNN, a physics-informed learning framework that decomposes the global solution into interpretable, patchwise analytic components. Rather than approximating the solution directly, GlueNN promotes the integration constants of local asymptotic expansions to learnable, scale-dependent coefficient functions. By constraining these coefficients with the differential equation, the network effectively performs regime transition, smoothly interpolating between asymptotic limits without requiring ad hoc boundary matching. We demonstrate that this coefficient-centric approach reproduces accurate global solutions in various examples and thus directly extracts physical information that is not explicitly available through standard numerical integration.
LG Nov 3, 2025
Bulk-boundary decomposition of neural networks
Donghee Lee, Hye-Sung Lee, Jaeok Yi
We present the bulk-boundary decomposition as a new framework for understanding the training dynamics of deep neural networks. Starting from the stochastic gradient descent formulation, we show that the Lagrangian can be reorganized into a data-independent bulk term and a data-dependent boundary term. The bulk captures the intrinsic dynamics set by network architecture and activation functions, while the boundary reflects stochastic interactions from training samples at the input and output layers. This decomposition exposes the local and homogeneous structure underlying deep networks. As a natural extension, we develop a field-theoretic formulation of neural dynamics based on this decomposition.