LGDec 16, 2024

Multimodal LLM for Intelligent Transportation Systems

Dexter Le, Aybars Yunusoglu, Karn Tiwari, Murat Isik, I. Can Dikmen

arXiv:2412.11683v110.49 citationsh-index: 8

Originality Synthesis-oriented

AI Analysis

This work addresses the problem of complex data processing workflows in transportation systems for researchers and practitioners, though it appears incremental by applying existing LLM methods to new multimodal data in this domain.

The paper tackles the challenge of integrating multiple machine learning models for intelligent transportation systems by proposing a single data-centric LLM framework that processes multimodal data like time series, images, and videos, achieving an average accuracy of 91.33% across datasets, with up to 92.7% on time-series data.

In the evolving landscape of transportation systems, integrating Large Language Models (LLMs) offers a promising frontier for advancing intelligent decision-making across various applications. This paper introduces a novel 3-dimensional framework that encapsulates the intersection of applications, machine learning methodologies, and hardware devices, particularly emphasizing the role of LLMs. Instead of using multiple machine learning algorithms, our framework uses a single, data-centric LLM architecture that can analyze time series, images, and videos. We explore how LLMs can enhance data interpretation and decision-making in transportation. We apply this LLM framework to different sensor datasets, including time-series data and visual data from sources like Oxford Radar RobotCar, D-Behavior (D-Set), nuScenes by Motional, and Comma2k19. The goal is to streamline data processing workflows, reduce the complexity of deploying multiple models, and make intelligent transportation systems more efficient and accurate. The study was conducted using state-of-the-art hardware, leveraging the computational power of AMD RTX 3060 GPUs and Intel i9-12900 processors. The experimental results demonstrate that our framework achieves an average accuracy of 91.33\% across these datasets, with the highest accuracy observed in time-series data (92.7\%), showcasing the model's proficiency in handling sequential information essential for tasks such as motion planning and predictive maintenance. Through our exploration, we demonstrate the versatility and efficacy of LLMs in handling multimodal data within the transportation sector, ultimately providing insights into their application in real-world scenarios. Our findings align with the broader conference themes, highlighting the transformative potential of LLMs in advancing transportation technologies.

View on arXiv PDF

Similar