HCDec 16, 2022
Semantics-Empowered Communication: A Tutorial-cum-SurveyZhilin Lu, Rongpeng Li, Kun Lu et al.
Along with the springing up of the semantics-empowered communication (SemCom) research, it is now witnessing an unprecedentedly growing interest towards a wide range of aspects (e.g., theories, applications, metrics and implementations) in both academia and industry. In this work, we primarily aim to provide a comprehensive survey on both the background and research taxonomy, as well as a detailed technical tutorial. Specifically, we start by reviewing the literature and answering the "what" and "why" questions in semantic transmissions. Afterwards, we present the ecosystems of SemCom, including history, theories, metrics, datasets and toolkits, on top of which the taxonomy for research directions is presented. Furthermore, we propose to categorize the critical enabling techniques by explicit and implicit reasoning-based methods, and elaborate on how they evolve and contribute to modern content & channel semantics-empowered communications. Besides reviewing and summarizing the latest efforts in SemCom, we discuss the relations with other communication levels (e.g., conventional communications) from a holistic and unified viewpoint. Subsequently, in order to facilitate future developments and industrial applications, we also highlight advanced practical techniques for boosting semantic accuracy, robustness, and large-scale scalability, just to mention a few. Finally, we discuss the technical challenges that shed light on future research opportunities.
ITNov 5, 2022Code
Quantization Adaptor for Bit-Level Deep Learning-Based Massive MIMO CSI FeedbackXudong Zhang, Zhilin Lu, Rui Zeng et al.
In massive multiple-input multiple-output (MIMO) systems, the user equipment (UE) needs to feed the channel state information (CSI) back to the base station (BS) for the following beamforming. But the large scale of antennas in massive MIMO systems causes huge feedback overhead. Deep learning (DL) based methods can compress the CSI at the UE and recover it at the BS, which reduces the feedback cost significantly. But the compressed CSI must be quantized into bit streams for transmission. In this paper, we propose an adaptor-assisted quantization strategy for bit-level DL-based CSI feedback. First, we design a network-aided adaptor and an advanced training scheme to adaptively improve the quantization and reconstruction accuracy. Moreover, for easy practical employment, we introduce the expert knowledge of data distribution and propose a pluggable and cost-free adaptor scheme. Experiments show that compared with the state-of-the-art feedback quantization method, this adaptor-aided quantization strategy can achieve better quantization accuracy and reconstruction performance with less or no additional cost. The open-source codes are available at https://github.com/zhang-xd18/QCRNet.
ITOct 29, 2022
Better Lightweight Network for Free: Codeword Mimic Learning for Massive MIMO CSI feedbackZhilin Lu, Xudong Zhang, Rui Zeng et al.
The channel state information (CSI) needs to be fed back from the user equipment (UE) to the base station (BS) in frequency division duplexing (FDD) multiple-input multiple-output (MIMO) system. Recently, neural networks are widely applied to CSI compressed feedback since the original overhead is too large for the massive MIMO system. Notably, lightweight feedback networks attract special attention due to their practicality of deployment. However, the feedback accuracy is likely to be harmed by the network compression. In this paper, a cost free distillation technique named codeword mimic (CM) is proposed to train better feedback networks with the practical lightweight encoder. A mimic-explore training strategy with a special distillation scheduler is designed to enhance the CM learning. Experiments show that the proposed CM learning outperforms the previous state-of-the-art feedback distillation method, boosting the performance of the lightweight feedback network without any extra inference cost.
ITFeb 5, 2023
Towards Efficient Subarray Hybrid Beamforming: Attention Network-based Practical Feedback in FDD Massive MU-MIMO SystemsZhilin Lu, Xudong Zhang, Rui Zeng et al.
Channel state information (CSI) feedback is necessary for the frequency division duplexing (FDD) multiple input multiple output (MIMO) systems due to the channel non-reciprocity. With the help of deep learning, many works have succeeded in rebuilding the compressed ideal CSI for massive MIMO. However, simple CSI reconstruction is of limited practicality since the channel estimation and the targeted beamforming design are not considered. In this paper, a jointly optimized network is introduced for channel estimation and feedback so that a spectral-efficient beamformer can be learned. Moreover, the deployment-friendly subarray hybrid beamforming architecture is applied and a practical lightweight end-to-end network is specially designed. Experiments show that the proposed network is over 10 times lighter at the resource-sensitive user equipment compared with the previous state-of-the-art method with only a minor performance loss.
SPJan 2, 2024Code
Enhancing Automatic Modulation Recognition through Robust Global Feature ExtractionYunpeng Qu, Zhilin Lu, Rui Zeng et al.
Automatic Modulation Recognition (AMR) plays a crucial role in wireless communication systems. Deep learning AMR strategies have achieved tremendous success in recent years. Modulated signals exhibit long temporal dependencies, and extracting global features is crucial in identifying modulation schemes. Traditionally, human experts analyze patterns in constellation diagrams to classify modulation schemes. Classical convolutional-based networks, due to their limited receptive fields, excel at extracting local features but struggle to capture global relationships. To address this limitation, we introduce a novel hybrid deep framework named TLDNN, which incorporates the architectures of the transformer and long short-term memory (LSTM). We utilize the self-attention mechanism of the transformer to model the global correlations in signal sequences while employing LSTM to enhance the capture of temporal dependencies. To mitigate the impact like RF fingerprint features and channel characteristics on model generalization, we propose data augmentation strategies known as segment substitution (SS) to enhance the model's robustness to modulation-related features. Experimental results on widely-used datasets demonstrate that our method achieves state-of-the-art performance and exhibits significant advantages in terms of complexity. Our proposed framework serves as a foundational backbone that can be extended to different datasets. We have verified the effectiveness of our augmentation approach in enhancing the generalization of the models, particularly in few-shot scenarios. Code is available at \url{https://github.com/AMR-Master/TLDNN}.
ITFeb 15, 2023
Deep Learning for Hybrid Beamforming with Finite Feedback in GSM Aided mmWave MIMO SystemsZhilin Lu, Xudong Zhang, Rui Zeng et al.
Hybrid beamforming is widely recognized as an important technique for millimeter wave (mmWave) multiple input multiple output (MIMO) systems. Generalized spatial modulation (GSM) is further introduced to improve the spectrum efficiency. However, most of the existing works on beamforming assume the perfect channel state information (CSI), which is unrealistic in practical systems. In this paper, joint optimization of downlink pilot training, channel estimation, CSI feedback, and hybrid beamforming is considered in GSM aided frequency division duplexing (FDD) mmWave MIMO systems. With the help of deep learning, the GSM hybrid beamformers are designed via unsupervised learning in an end-to-end way. Experiments show that the proposed multi-resolution network named GsmEFBNet can reach a better achievable rate with fewer feedback bits compared with the conventional algorithm.
CRSep 16, 2021Code
OpenFed: A Comprehensive and Versatile Open-Source Federated Learning FrameworkDengsheng Chen, Vince Tan, Zhilin Lu et al.
Recent developments in Artificial Intelligence techniques have enabled their successful application across a spectrum of commercial and industrial settings. However, these techniques require large volumes of data to be aggregated in a centralized manner, forestalling their applicability to scenarios wherein the data is sensitive or the cost of data transmission is prohibitive. Federated Learning alleviates these problems by decentralizing model training, thereby removing the need for data transfer and aggregation. To advance the adoption of Federated Learning, more research and development needs to be conducted to address some important open questions. In this work, we propose OpenFed, an open-source software framework for end-to-end Federated Learning. OpenFed reduces the barrier to entry for both researchers and downstream users of Federated Learning by the targeted removal of existing pain points. For researchers, OpenFed provides a framework wherein new methods can be easily implemented and fairly evaluated against an extensive suite of benchmarks. For downstream users, OpenFed allows Federated Learning to be plugged and play within different subject-matter contexts, removing the need for deep expertise in Federated Learning.
ITNov 5, 2020Code
Binary Neural Network Aided CSI Feedback in Massive MIMO SystemZhilin Lu, Jintao Wang, Jian Song
In massive multiple-input multiple-output (MIMO) system, channel state information (CSI) is essential for the base station to achieve high performance gain. Recently, deep learning is widely used in CSI compression to fight against the growing feedback overhead brought by massive MIMO in frequency division duplexing system. However, applying neural network brings extra memory and computation cost, which is non-negligible especially for the resource limited user equipment (UE). In this paper, a novel binarization aided feedback network named BCsiNet is introduced. Moreover, BCsiNet variants are designed to boost the performance under customized training and inference schemes. Experiments shows that BCsiNet offers over 30$\times$ memory saving and around 2$\times$ inference acceleration for encoder at UE compared with CsiNet. Furthermore, the feedback performance of BCsiNet is comparable with original CsiNet. The key results can be reproduced with https://github.com/Kylin9511/BCsiNet.
ITOct 31, 2019Code
Multi-resolution CSI Feedback with deep learning in Massive MIMO SystemZhilin Lu, Jintao Wang, Jian Song
In massive multiple-input multiple-output (MIMO) system, user equipment (UE) needs to send downlink channel state information (CSI) back to base station (BS). However, the feedback becomes expensive with the growing complexity of CSI in massive MIMO system. Recently, deep learning (DL) approaches are used to improve the reconstruction efficiency of CSI feedback. In this paper, a novel feedback network named CRNet is proposed to achieve better performance via extracting CSI features on multiple resolutions. An advanced training scheme that further boosts the network performance is also introduced. Simulation results show that the proposed CRNet outperforms the state-of-the-art CsiNet under the same computational complexity without any extra information. The open source codes are available at https://github.com/Kylin9511/CRNet
CVNov 23, 2025
MammothModa2: A Unified AR-Diffusion Framework for Multimodal Understanding and GenerationTao Shen, Xin Wan, Taicai Chen et al.
Unified multimodal models aim to integrate understanding and generation within a single framework, yet bridging the gap between discrete semantic reasoning and high-fidelity visual synthesis remains challenging. We present MammothModa2 (Mammoth2), a unified autoregressive-diffusion (AR-Diffusion) framework designed to effectively couple autoregressive semantic planning with diffusion-based generation. Mammoth2 adopts a serial design: an AR path equipped with generation experts performs global semantic modeling over discrete tokens, while a single-stream Diffusion Transformer (DiT) decoder handles high-fidelity image synthesis. A carefully designed AR-Diffusion feature alignment module combines multi-layer feature aggregation, unified condition encoding, and in-context conditioning to stably align AR's representations with the diffusion decoder's continuous latents. Mammoth2 is trained end-to-end with joint Next-Token Prediction and Flow Matching objectives, followed by supervised fine-tuning and reinforcement learning over both generation and editing. With roughly 60M supervised generation samples and no reliance on pre-trained generators, Mammoth2 delivers strong text-to-image and instruction-based editing performance on public benchmarks, achieving 0.87 on GenEval, 87.2 on DPGBench, and 4.06 on ImgEdit, while remaining competitive with understanding-only backbones (e.g., Qwen3-VL-8B) on multimodal understanding tasks. These results suggest that a carefully coupled AR-Diffusion architecture can provide high-fidelity generation and editing while maintaining strong multimodal comprehension within a single, parameter- and data-efficient model.
CVDec 19, 2021
Elastic-Link for Binarized Neural NetworkJie Hu, Ziheng Wu, Vince Tan et al.
Recent work has shown that Binarized Neural Networks (BNNs) are able to greatly reduce computational costs and memory footprints, facilitating model deployment on resource-constrained devices. However, in comparison to their full-precision counterparts, BNNs suffer from severe accuracy degradation. Research aiming to reduce this accuracy gap has thus far largely focused on specific network architectures with few or no 1x1 convolutional layers, for which standard binarization methods do not work well. Because 1x1 convolutions are common in the design of modern architectures (e.g. GoogleNet, ResNet, DenseNet), it is crucial to develop a method to binarize them effectively for BNNs to be more widely adopted. In this work, we propose an "Elastic-Link" (EL) module to enrich information flow within a BNN by adaptively adding real-valued input features to the subsequent convolutional output features. The proposed EL module is easily implemented and can be used in conjunction with other methods for BNNs. We demonstrate that adding EL to BNNs produces a significant improvement on the challenging large-scale ImageNet dataset. For example, we raise the top-1 accuracy of binarized ResNet26 from 57.9% to 64.0%. EL also aids convergence in the training of binarized MobileNet, for which a top-1 accuracy of 56.4% is achieved. Finally, with the integration of ReActNet, it yields a new state-of-the-art result of 71.9% top-1 accuracy.
ITMay 1, 2021
Binarized Aggregated Network with Quantization: Flexible Deep Learning Deployment for CSI Feedback in Massive MIMO SystemZhilin Lu, Xudong Zhang, Hongyi He et al.
Massive multiple-input multiple-output (MIMO) is one of the key techniques to achieve better spectrum and energy efficiency in 5G system. The channel state information (CSI) needs to be fed back from the user equipment to the base station in frequency division duplexing (FDD) mode. However, the overhead of the direct feedback is unacceptable due to the large antenna array in massive MIMO system. Recently, deep learning is widely adopted to the compressed CSI feedback task and proved to be effective. In this paper, a novel network named aggregated channel reconstruction network (ACRNet) is designed to boost the feedback performance with network aggregation and parametric rectified linear unit (PReLU) activation. The practical deployment of the feedback network in the communication system is also considered. Specifically, the elastic feedback scheme is proposed to flexibly adapt the network to meet different resource limitations. Besides, the network binarization technique is combined with the feature quantization for lightweight and practical deployment. Experiments show that the proposed ACRNet outperforms loads of previous state-of-the-art networks, providing a neat feedback solution with high performance, low cost and impressive flexibility.
CVMar 19, 2021
Learning the Superpixel in a Non-iterative and Lifelong MannerLei Zhu, Qi She, Bin Zhang et al.
Superpixel is generated by automatically clustering pixels in an image into hundreds of compact partitions, which is widely used to perceive the object contours for its excellent contour adherence. Although some works use the Convolution Neural Network (CNN) to generate high-quality superpixel, we challenge the design principles of these networks, specifically for their dependence on manual labels and excess computation resources, which limits their flexibility compared with the traditional unsupervised segmentation methods. We target at redefining the CNN-based superpixel segmentation as a lifelong clustering task and propose an unsupervised CNN-based method called LNS-Net. The LNS-Net can learn superpixel in a non-iterative and lifelong manner without any manual labels. Specifically, a lightweight feature embedder is proposed for LNS-Net to efficiently generate the cluster-friendly features. With those features, seed nodes can be automatically assigned to cluster pixels in a non-iterative way. Additionally, our LNS-Net can adapt the sequentially lifelong learning by rescaling the gradient of weight based on both channel and spatial context to avoid overfitting. Experiments show that the proposed LNS-Net achieves significantly better performance on three benchmarks with nearly ten times lower complexity compared with other state-of-the-art methods.
ITJan 17, 2021
Aggregated Network for Massive MIMO CSI FeedbackZhilin Lu, Hongyi He, Zhengyang Duan et al.
In frequency division duplexing (FDD) mode, it is necessary to send the channel state information (CSI) from user equipment to base station. The downlink CSI is essential for the massive multiple-input multiple-output (MIMO) system to acquire the potential gain. Recently, deep learning is widely adopted to massive MIMO CSI feedback task and proved to be effective compared with traditional compressed sensing methods. In this paper, a novel network named ACRNet is designed to boost the feedback performance with network aggregation and parametric RuLU activation. Moreover, valid approach to expand the network architecture in exchange of better performance is first discussed in CSI feedback task. Experiments show that ACRNet outperforms loads of previous state-of-the-art feedback networks without any extra information.