Zhengyuan Wu

2 Papers

CV · Nov 3, 2023 · Code
PILL: Plug Into LLM with Adapter Expert and Attention Gate

Fangyuan Zhang, Tingting Liang, Zhengyuan Wu et al.

Because powerful Large Language Models (LLMs) follow instructions so effectively, a growing number of LLM-based assistants have appeared in the community. Recently, significant progress has been made in Vision Language Models (VLMs), which expand the capabilities of LLMs and enable them to execute more diverse instructions. However, models will likely need to handle tasks involving additional modalities such as speech and video, which makes the complexity of mixed modalities a prominent challenge. To address this, we introduce a novel architecture called PILL: Plug Into LLM with adapter expert and attention gate, which better decouples these modalities and leverages efficient fine-tuning. We introduce two modules. First, Mixture-of-Modality-Adapter-Expert handles different modalities independently, enabling better adaptation to downstream tasks while preserving the expressive capability of the original model. Second, Modality-Attention-Gating adaptively controls the contribution of modality tokens to the overall representation. In addition, we improve the Adapter to enhance its learning and expressive capabilities. Experimental results demonstrate that our approach is competitive with other mainstream methods for modality fusion. The code and models are freely available at https://github.com/DsaltYfish/PILL.

15.1 · SY · Apr 28
Dual-Polarized Massive MIMO Based on Precoding for Vehicle-To-Ground Communication in Urban Rail Transit

Zhengyuan Wu, Junhui Zhao, Qingmiao Zhang et al.

The development of intelligent and diversified services in urban rail transit (URT) has resulted in an increasing demand for high-rate communication between vehicles and ground equipment. However, existing URT communication systems struggle to handle the massive data exchange required for vehicle-to-ground (V2G) communication. To address this issue, we propose a distributed dual-polarized MIMO architecture suitable for URT tunnel scenarios. Specifically, the channel model is based on a spatial three-dimensional (3D) non-stationary geometry-based stochastic model (GBSM), which takes into account the geometric distribution of URT tunnels and the cross-polarization effects between dual-polarized antennas. For dual-polarized MIMO systems, the polarized-aware sparse channel estimation (PASCE) method is proposed for effective channel estimation. Additionally, we derive closed-form expressions for the MMSE and MR precoding schemes. The polarized-aware dynamic interference cancellation (PADIC) algorithm is developed to eliminate interference between different polarization modes and multiple users. The simulation results demonstrate that the proposed dual-polarized precoding algorithm can withstand high cross-polarization correlation (XPC) and improve the efficiency of V2G communication to achieve high rates.
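
For readers unfamiliar with the two precoders named in the abstract, the following is a minimal NumPy sketch of the generic textbook forms of maximum-ratio (MR) and regularized MMSE precoding for a single-polarized multi-user downlink. It is not the paper's dual-polarized closed-form derivation. The function names, the i.i.d. Rayleigh test channel, the regularization term K * noise_var / total_power, and the leakage metric are all assumptions chosen for illustration.

```python
import numpy as np


def mr_precoder(H):
    """Maximum-ratio precoder: conjugate transpose of H, scaled to unit Frobenius norm.

    H has shape (K, M): K single-antenna users, M transmit antennas.
    """
    W = H.conj().T
    return W / np.linalg.norm(W)


def mmse_precoder(H, noise_var, total_power=1.0):
    """Regularized (MMSE-style) precoder, scaled to unit Frobenius norm.

    The regularization K * noise_var / total_power is a common textbook
    choice and is an assumption here, not the paper's expression.
    """
    K = H.shape[0]
    reg = K * noise_var / total_power
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T + reg * np.eye(K))
    return W / np.linalg.norm(W)


def interference_leakage(H, W):
    """Fraction of effective-channel energy that lands on unintended users."""
    G = H @ W  # (K, K) effective channel; off-diagonal terms are interference
    off_diag = G - np.diag(np.diag(G))
    return np.linalg.norm(off_diag) ** 2 / np.linalg.norm(G) ** 2


# Hypothetical test channel: i.i.d. Rayleigh fading, 4 users, 16 antennas.
rng = np.random.default_rng(0)
K, M = 4, 16
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

W_mr = mr_precoder(H)
W_mmse = mmse_precoder(H, noise_var=0.01)

leak_mr = interference_leakage(H, W_mr)
leak_mmse = interference_leakage(H, W_mmse)
```

In this simplified setting, MR ignores inter-user interference while the regularized inverse suppresses it, so `leak_mmse` should come out well below `leak_mr` at low noise variance. How cross-polarization coupling changes these expressions is specific to the paper and is not modeled here.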