HybridVFL: Disentangled Feature Learning for Edge-Enabled Vertical Federated Multimodal Classification
This work addresses performance bottlenecks in privacy-preserving multimodal classification for edge-enabled applications such as mobile health diagnostics; its contribution is an incremental improvement in feature fusion methods.
The paper tackles the performance limitations of standard Vertical Federated Learning (VFL) in edge AI scenarios such as mobile health diagnostics. It introduces HybridVFL, a framework that combines client-side feature disentanglement with server-side cross-modal transformer fusion, and reports that it significantly outperforms standard federated baselines on the HAM10000 skin lesion dataset.
Vertical Federated Learning (VFL) offers a privacy-preserving paradigm for edge AI scenarios such as mobile health diagnostics, where sensitive multimodal data reside on distributed, resource-constrained devices. Yet standard VFL systems often suffer performance limitations due to simplistic feature fusion. This paper introduces HybridVFL, a novel framework designed to overcome this bottleneck by pairing client-side feature disentanglement with a server-side cross-modal transformer for context-aware fusion. Through systematic evaluation on the multimodal HAM10000 skin lesion dataset, we demonstrate that HybridVFL significantly outperforms standard federated baselines, underscoring the importance of advanced fusion mechanisms for robust, privacy-preserving systems.
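The data flow described above (client-side disentanglement followed by server-side attention-based fusion) can be sketched roughly as follows. This is a minimal, untrained illustration and not the authors' implementation. Everything specific in it is an assumption made for the example: the two clients (one holding image features, one holding tabular metadata), the split of each embedding into a "shared" and a "private" part via two linear projections, the dimensions, the helper names (`client_encode`, `attention_fuse`, `hybrid_forward`), and the use of a single attention step without learned query/key/value projections in place of a full transformer. Training objectives, including any disentanglement loss, are omitted because the abstract does not specify them.

```python
import math
import random

EMBED_DIM = 4      # size of each shared/private embedding (illustrative choice)
NUM_CLASSES = 7    # HAM10000 has seven diagnostic categories
IMG_DIM, META_DIM = 8, 3  # hypothetical raw feature sizes per client


def rand_matrix(rng, rows, cols):
    """Small random weight matrix standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(x, w):
    """Vector-matrix product: x has len(w) entries, result has len(w[0])."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def client_encode(x, w_shared, w_private):
    """Client side (assumed form): two projections of the local features.
    Only these embeddings, never the raw features, are sent to the server."""
    return linear(x, w_shared), linear(x, w_private)


def attention_fuse(tokens):
    """Server side: scaled dot-product self-attention across all client
    tokens, followed by mean pooling into one fused vector."""
    dim = len(tokens[0])
    weights, mixed = [], []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        mixed.append([sum(row[t] * tokens[t][j] for t in range(len(tokens)))
                      for j in range(dim)])
    pooled = [sum(tok[j] for tok in mixed) / len(mixed) for j in range(dim)]
    return pooled, weights


def hybrid_forward(image_feats, meta_feats, params):
    """One forward pass: encode on each client, fuse and classify on the server."""
    tokens = []
    tokens.extend(client_encode(image_feats, params["img_shared"], params["img_private"]))
    tokens.extend(client_encode(meta_feats, params["meta_shared"], params["meta_private"]))
    pooled, weights = attention_fuse(tokens)
    probs = softmax(linear(pooled, params["head"]))
    return probs, weights


rng = random.Random(0)
params = {
    "img_shared": rand_matrix(rng, IMG_DIM, EMBED_DIM),
    "img_private": rand_matrix(rng, IMG_DIM, EMBED_DIM),
    "meta_shared": rand_matrix(rng, META_DIM, EMBED_DIM),
    "meta_private": rand_matrix(rng, META_DIM, EMBED_DIM),
    "head": rand_matrix(rng, EMBED_DIM, NUM_CLASSES),
}
image_feats = [rng.random() for _ in range(IMG_DIM)]
meta_feats = [rng.random() for _ in range(META_DIM)]

probs, attn = hybrid_forward(image_feats, meta_feats, params)
print(len(probs), round(sum(probs), 6))
```

Treating each shared and private embedding as its own token lets the fusion step weight every component against all the others, which is one plausible reading of "context-aware fusion". A simple concatenation baseline would instead pass the four embeddings to the classifier without any such interaction.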