LG AIAug 5, 2024

MDM: Advancing Multi-Domain Distribution Matching for Automatic Modulation Recognition Dataset Synthesis

Dongwei Xu, Jiajun Chen, Yao Lu, Tianhao Xia, Qi Xuan, Wei Wang, Yun Lin, Xiaoniu Yang

arXiv:2408.02714v12.6h-index: 18

Originality Incremental advance

AI Analysis

This addresses data storage and training challenges for AMR researchers, but it is incremental as it adapts existing data distillation techniques to the signal domain.

The paper tackles the problem of large-scale data requirements for deep learning in Automatic Modulation Recognition (AMR) by proposing MDM, a dataset distillation method that compresses data into smaller synthetic datasets while maintaining performance, achieving better results than baselines under the same compression ratio and showing good generalization across models.

Recently, deep learning technology has been successfully introduced into Automatic Modulation Recognition (AMR) tasks. However, the success of deep learning is all attributed to the training on large-scale datasets. Such a large amount of data brings huge pressure on storage, transmission and model training. In order to solve the problem of large amount of data, some researchers put forward the method of data distillation, which aims to compress large training data into smaller synthetic datasets to maintain its performance. While numerous data distillation techniques have been developed within the realm of image processing, the unique characteristics of signals set them apart. Signals exhibit distinct features across various domains, necessitating specialized approaches for their analysis and processing. To this end, a novel dataset distillation method--Multi-domain Distribution Matching (MDM) is proposed. MDM employs the Discrete Fourier Transform (DFT) to translate timedomain signals into the frequency domain, and then uses a model to compute distribution matching losses between the synthetic and real datasets, considering both the time and frequency domains. Ultimately, these two losses are integrated to update the synthetic dataset. We conduct extensive experiments on three AMR datasets. Experimental results show that, compared with baseline methods, our method achieves better performance under the same compression ratio. Furthermore, we conduct crossarchitecture generalization experiments on several models, and the experimental results show that our synthetic datasets can generalize well on other unseen models.

View on arXiv PDF

Similar