LG · Apr 4, 2022
GAN-Based Joint Activity Detection and Channel Estimation For Grant-free Random Access
Shuang Liang, Yinan Zou, Yong Zhou
Joint activity detection and channel estimation (JADCE) for grant-free random access is a critical issue that needs to be addressed to support massive connectivity in IoT networks. However, existing model-free learning methods can perform either activity detection or channel estimation, but not both. In this paper, we propose a novel model-free learning method based on a generative adversarial network (GAN) to tackle the JADCE problem. We adopt the U-net architecture to build the generator rather than the standard GAN architecture, where a pre-estimated value that contains the activity information is adopted as input to the generator. By leveraging the properties of the pseudoinverse, the generator is refined by using an affine projection and a skip connection to ensure that the output of the generator is consistent with the measurement. Moreover, we build a two-layer fully-connected neural network to design the pilot matrix and thereby reduce the impact of receiver noise. Simulation results show that the proposed method outperforms the existing methods in high SNR regimes, as both data consistency projection and pilot matrix optimization improve the learning ability.
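The data consistency step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a noiseless linear measurement model Y = A X with a wide pilot matrix A of full row rank, and the function name, dimensions, and random data are hypothetical. The projection keeps the generator output and adds a pseudoinverse-based correction (the skip-connection form), so that the result reproduces the measurement.

```python
import numpy as np

def data_consistency_projection(x_gen, pilots, y):
    """Project x_gen onto the affine set {x : pilots @ x = y}.

    Hypothetical helper: x_gen stands in for a generator output.
    """
    pinv = np.linalg.pinv(pilots)
    # Skip connection: generator output plus a correction term that
    # removes the measurement residual.
    return x_gen + pinv @ (y - pilots @ x_gen)

rng = np.random.default_rng(0)
L, N, M = 8, 20, 4  # assumed pilot length, number of devices, antennas

# Assumed complex Gaussian pilot matrix (L < N, full row rank almost surely)
pilots = rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))

# Row-sparse channel matrix: only active devices have nonzero rows
active = rng.random(N) < 0.2
channels = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
x_true = active[:, None] * channels

y = pilots @ x_true  # noiseless measurement (assumption)

# Stand-in for an imperfect generator output
noise = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
x_gen = x_true + 0.3 * noise

x_proj = data_consistency_projection(x_gen, pilots, y)

residual_before = np.linalg.norm(pilots @ x_gen - y)
residual_after = np.linalg.norm(pilots @ x_proj - y)
print(f"residual before: {residual_before:.3e}, after: {residual_after:.3e}")
```

Under these assumptions the projected estimate satisfies the measurement equation up to floating-point error, and applying the projection a second time leaves it unchanged. With receiver noise the measurement itself is perturbed, so exact consistency is no longer the same as accuracy.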
CV · May 25, 2025
Can Multimodal Large Language Models Understand Spatial Relations?
Jingping Liu, Ziyan Liu, Zhedong Cen et al.
Spatial relation reasoning is a crucial task for multimodal large language models (MLLMs) to understand the objective world. However, current benchmarks have issues such as relying on bounding boxes, ignoring perspective substitutions, or allowing questions to be answered using only the model's prior knowledge without image understanding. To address these issues, we introduce SpatialMQA, a human-annotated spatial relation reasoning benchmark based on COCO2017, which enables MLLMs to focus more on understanding images in the objective world. To ensure data quality, we design a well-tailored annotation procedure, resulting in SpatialMQA consisting of 5,392 samples. Based on this benchmark, a series of closed- and open-source MLLMs are implemented, and the results indicate that the current state-of-the-art MLLM achieves only 48.14% accuracy, far below the human-level accuracy of 98.40%. Extensive experimental analyses are also conducted, suggesting future research directions. The benchmark and code are available at https://github.com/ziyan-xiaoyu/SpatialMQA.git.