AIMar 24, 2023
Learning to Operate in Open Worlds by Adapting Planning ModelsWiktor Piotrowski, Roni Stern, Yoni Sher et al.
Planning agents are ill-equipped to act in novel situations in which their domain model no longer accurately represents the world. We introduce an approach for such agents operating in open worlds that detects the presence of novelties and effectively adapts their domain models and consequent action selection. It uses observations of action execution and measures their divergence from what is expected, according to the environment model, to infer existence of a novelty. Then, it revises the model through a heuristics-guided search over model changes. We report empirical evaluations on the CartPole problem, a standard Reinforcement Learning (RL) benchmark. The results show that our approach can deal with a class of novelties very quickly and in an interpretable fashion.
84.4LGMay 8Code
CarCrashNet: A Large-Scale Dataset and Hierarchical Neural Solver for Data-Driven Structural Crash SimulationMohamed Elrefaie, Dule Shu, Matthew Klenk et al.
Crash simulation is a cornerstone of modern vehicle development because it reduces the need for costly physical prototypes, accelerates safety-driven design iteration, and increasingly supports virtual testing workflows. At the same time, modeling structural crash mechanics remains exceptionally challenging: the response is governed by nonlinear contact, large deformation, material plasticity, failure, and complex multi-body interactions evolving over space and time on high-resolution finite-element meshes. In this work, we introduce \textsc{CarCrashNet}, a public high-fidelity open-source benchmark for data-driven structural crash simulation. \textsc{CarCrashNet} combines component-scale and full-vehicle simulations in a multi-modal format, including more than 14{,}000 bumper-beam pole-impact simulations with varying geometry, materials, and boundary conditions, together with 825 full-vehicle crash simulations built from three industry-standard vehicle models of increasing structural complexity: Dodge Neon, Toyota Yaris, and Chevrolet Silverado. To establish the reliability of the benchmark, we validate our open-source finite-element workflow based on OpenRadioss against both experimental crash data and the commercial solver Ansys LS-DYNA. We also introduce \textsc{CrashSolver}, a machine-learning model designed for full-vehicle crash prediction from high-resolution finite-element crash data. We further perform extensive benchmarking across the released datasets and evaluate \textsc{CrashSolver} against state-of-the-art geometric deep learning and transformer-based neural solvers. Our results position \textsc{CarCrashNet} as a foundation for reproducible research in structural simulation, crashworthiness modeling, and AI-driven virtual crash testing. The dataset is available at https://github.com/Mohamedelrefaie/CarCrashNet.
AIOct 21, 2022
Analogical Concept Memory for Architectures Implementing the Common Model of CognitionShiwali Mohan, Matthew Klenk
Architectures that implement the Common Model of Cognition - Soar, ACT-R, and Sigma - have a prominent place in research on cognitive modeling as well as on designing complex intelligent agents. In this paper, we explore how computational models of analogical processing can be brought into these architectures to enable concept acquisition from examples obtained interactively. We propose a new analogical concept memory for Soar that augments its current system of declarative long-term memories. We frame the problem of concept learning as embedded within the larger context of interactive task learning (ITL) and embodied language processing (ELP). We demonstrate that the analogical learning methods implemented in the proposed memory can quickly learn a diverse types of novel concepts that are useful not only in recognition of a concept in the environment but also in action selection. Our approach has been instantiated in an implemented cognitive system AILEEN and evaluated on a simulated robotic domain.
HCFeb 5
Personagram: Bridging Personas and Product Design for Creative Ideation with Multimodal LLMsTaewook Kim, Matthew K. Hong, Yan-Ying Chen et al.
Product designers often begin their design process with handcrafted personas. While personas are intended to ground design decisions in consumer preferences, they often fall short in practice by remaining abstract, expensive to produce, and difficult to translate into actionable design features. As a result, personas risk serving as static reference points rather than tools that actively shape design outcomes. To address these challenges, we built Personagram, an interactive system powered by multimodal large language models (MLLMs) that helps designers explore detailed census-based personas, extract product features inferred from persona attributes, and recombine them for specific customer segments. In a study with 12 professional designers, we show that Personagram facilitates more actionable ideation workflows by structuring multimodal thinking from persona attributes to product design features, achieving higher engagement with personas, perceived transparency, and satisfaction compared to a chat-based baseline. We discuss implications of integrating AI-generated personas into product design workflows.
LGAug 24, 2025
ShaLa: Multimodal Shared Latent Space ModellingJiali Cui, Yan-Ying Chen, Yanxia Zhang et al.
This paper presents a novel generative framework for learning shared latent representations across multimodal data. Many advanced multimodal methods focus on capturing all combinations of modality-specific details across inputs, which can inadvertently obscure the high-level semantic concepts that are shared across modalities. Notably, Multimodal VAEs with low-dimensional latent variables are designed to capture shared representations, enabling various tasks such as joint multimodal synthesis and cross-modal inference. However, multimodal VAEs often struggle to design expressive joint variational posteriors and suffer from low-quality synthesis. In this work, ShaLa addresses these challenges by integrating a novel architectural inference model and a second-stage expressive diffusion prior, which not only facilitates effective inference of shared latent representation but also significantly improves the quality of downstream multimodal synthesis. We validate ShaLa extensively across multiple benchmarks, demonstrating superior coherence and synthesis quality compared to state-of-the-art multimodal VAEs. Furthermore, ShaLa scales to many more modalities while prior multimodal VAEs have fallen short in capturing the increasing complexity of the shared latent space.
CVMar 15, 2025
Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational AutoencoderWonwoong Cho, Yan-Ying Chen, Matthew Klenk et al.
Text-to-Image (T2I) Diffusion Models have achieved remarkable performance in generating high quality images. However, enabling precise control of continuous attributes, especially multiple attributes simultaneously, in a new domain (e.g., numeric values like eye openness or car width) with text-only guidance remains a significant challenge. To address this, we introduce the Attribute (Att) Adapter, a novel plug-and-play module designed to enable fine-grained, multi-attributes control in pretrained diffusion models. Our approach learns a single control adapter from a set of sample images that can be unpaired and contain multiple visual attributes. The Att-Adapter leverages the decoupled cross attention module to naturally harmonize the multiple domain attributes with text conditioning. We further introduce Conditional Variational Autoencoder (CVAE) to the Att-Adapter to mitigate overfitting, matching the diverse nature of the visual world. Evaluations on two public datasets show that Att-Adapter outperforms all LoRA-based baselines in controlling continuous attributes. Additionally, our method enables a broader control range and also improves disentanglement across multiple attributes, surpassing StyleGAN-based techniques. Notably, Att-Adapter is flexible, requiring no paired synthetic data for training, and is easily scalable to multiple attributes within a single model.
LGSep 10, 2021
Machine learning reveals how personalized climate communication can both succeed and backfireTotte Harinen, Alexandre Filipowicz, Shabnam Hakimi et al.
Different advertising messages work for different people. Machine learning can be an effective way to personalise climate communications. In this paper we use machine learning to reanalyse findings from a recent study, showing that online advertisements increased some people's belief in climate change while resulting in decreased belief in others. In particular, we show that the effect of the advertisements could change depending on people's age and ethnicity.
AIJul 9, 2021
Playing Angry Birds with a Domain-Independent PDDL+ PlannerWiktor Piotrowski, Roni Stern, Matthew Klenk et al.
This demo paper presents the first system for playing the popular Angry Birds game using a domain-independent planner. Our system models Angry Birds levels using PDDL+, a planning language for mixed discrete/continuous domains. It uses a domain-independent PDDL+ planner to generate plans and executes them. In this demo paper, we present the system's PDDL+ model for this domain, identify key design decisions that reduce the problem complexity, and compare the performance of our system to model-specific methods for this domain. The results show that our system's performance is on par with other domain-specific systems for Angry Birds, suggesting the applicability of domain-independent planning to this benchmark AI challenge.
AISep 25, 2019
An Extensible and Personalizable Multi-Modal Trip PlannerXudong Liu, Christian Fritz, Matthew Klenk
Despite a tremendous amount of work in the literature and in the commercial sectors, current approaches to multi-modal trip planning still fail to consistently generate plans that users deem optimal in practice. We believe that this is due to the fact that current planners fail to capture the true preferences of users, e.g., their preferences depend on aspects that are not modeled. An example of this could be a preference not to walk through an unsafe area at night. We present a novel multi-modal trip planner that allows users to upload auxiliary geographic data (e.g., crime rates) and to specify temporal constraints and preferences over these data in combination with typical metrics such as time and cost. Concretely, our planner supports the modes walking, biking, driving, public transit, and taxi, uses linear temporal logic to capture temporal constraints, and preferential cost functions to represent preferences. We show by examples that this allows the expression of very interesting preferences and constraints that, naturally, lead to quite diverse optimal plans.
AISep 23, 2019
Acceptable Planning: Influencing Individual Behavior to Reduce Transportation Energy Expenditure of a CityShiwali Mohan, Hesham Rakha, Matthew Klenk
Our research aims at developing intelligent systems to reduce the transportation-related energy expenditure of a large city by influencing individual behavior. We introduce COPTER - an intelligent travel assistant that evaluates multi-modal travel alternatives to find a plan that is acceptable to a person given their context and preferences. We propose a formulation for acceptable planning that brings together ideas from AI, machine learning, and economics. This formulation has been incorporated in COPTER that produces acceptable plans in real-time. We adopt a novel empirical evaluation framework that combines human decision data with a high fidelity multi-modal transportation simulation to demonstrate a 4\% energy reduction and 20\% delay reduction in a realistic deployment scenario in Los Angeles, California, USA.