SP · CV · LG · Oct 15, 2024

Multi-modal Image and Radio Frequency Fusion for Optimizing Vehicle Positioning

arXiv:2410.19788v1 · 3 citations · h-index: 26 · IEEE Trans Mob Comput
Originality: Incremental advance
AI Analysis

This work addresses vehicle localization for outdoor communication systems, offering a domain-specific improvement by integrating multi-modal data.

The paper tackles vehicle positioning by fusing channel state information (CSI) and images in an outdoor scenario, using a meta-learning based hard EM algorithm to exploit unlabeled data, resulting in up to a 61% reduction in positioning error compared to a CSI-only baseline.

In this paper, a multi-modal vehicle positioning framework that jointly localizes vehicles with channel state information (CSI) and images is designed. In particular, we consider an outdoor scenario where each vehicle can communicate with only one base station (BS), and hence can upload its estimated CSI only to its associated BS. Each BS is equipped with a set of cameras, such that it can collect a small number of labeled CSI samples, a large number of unlabeled CSI samples, and the images taken by the cameras. To exploit the unlabeled CSI data and the position labels obtained from images, we design a meta-learning based hard expectation-maximization (EM) algorithm. Specifically, since the correspondence between the unlabeled CSI and the multiple vehicle locations in the images is unknown, we formulate the calculation of the training objective as a minimum matching problem. To reduce the impact of label noise caused by incorrect matching between unlabeled CSI and the vehicle locations obtained from images, and to achieve better convergence, we introduce a weighted loss function on the unlabeled datasets and study the use of a meta-learning algorithm for computing the weighted loss. Subsequently, the model parameters are updated according to the weighted loss function of the unlabeled CSI samples and their matched position labels obtained from images. Simulation results show that the proposed method can reduce the positioning error by up to 61% compared to a baseline that does not use images and relies only on CSI fingerprints for vehicle positioning.
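The matching and weighting steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `min_cost_matching` and `weighted_loss` are hypothetical, the matching cost is assumed to be Euclidean distance, the matching is solved by brute force over permutations, and the per-sample weights are passed in as plain numbers rather than produced by a meta-learning procedure.

```python
import itertools
import math


def min_cost_matching(predicted, observed):
    """Brute-force minimum-cost one-to-one matching (illustrative only).

    predicted: list of (x, y) positions predicted from unlabeled CSI.
    observed:  list of (x, y) vehicle positions extracted from images.
    Assumes both lists have equal length and that the cost of a pairing is
    the Euclidean distance (an assumption, not taken from the paper).
    Returns (assignment, costs), where assignment[i] is the index of the
    observed position matched to predicted[i] and costs[i] is its distance.
    """
    n = len(predicted)
    best_perm, best_total = None, math.inf
    # Try every one-to-one assignment and keep the cheapest one.
    for perm in itertools.permutations(range(n)):
        total = sum(math.dist(predicted[i], observed[perm[i]]) for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    costs = [math.dist(predicted[i], observed[best_perm[i]]) for i in range(n)]
    return list(best_perm), costs


def weighted_loss(costs, weights):
    """Weighted sum of squared per-sample errors.

    weights stands in for the per-sample weights that the paper computes
    with meta-learning; here they are simply supplied by the caller.
    """
    return sum(w * c ** 2 for w, c in zip(weights, costs))


# Hypothetical toy data: three CSI-based predictions and three image detections.
predicted = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
observed = [(9.0, 1.0), (5.0, 4.0), (1.0, 0.0)]

assignment, costs = min_cost_matching(predicted, observed)
loss = weighted_loss(costs, weights=[1.0, 0.5, 1.0])
print(assignment, loss)
```

In a hard EM loop, the matching would play the role of the E-step (assigning each unlabeled CSI sample a single position label) and the weighted loss would drive the M-step parameter update. Brute force is only practical for a handful of vehicles; a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual replacement.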

