Rahim Tafazolli

h-index: 38

23 papers

123 citations

Novelty: 46%

AI Score: 52

Ranked #34,070 of 201,326 authors (top 17%); #88 in IT (top 10%)

23 Papers

IT · May 30

Hybrid Bit and Semantic Communications for UAV-Enabled Wireless Power Transfer Networks: A Decision-Assisted Deep Reinforcement Learning Approach

Jingfu Li, Jingjing Cui, Chong Huang et al.

Semantic communications, which can significantly reduce spectrum consumption in wireless networks, have recently become a popular research area. When combined with wireless power transfer (WPT), semantic communications can help achieve high spectral efficiency for energy-limited devices in wireless communications. In energy-constrained and link budget-limited scenarios such as UAV networks, the integration of semantic communications and WPT enables highly energy-efficient transmission mechanisms. In this paper, we investigate semantic communications in UAV-enabled WPT networks. To achieve adaptability to varying signal-to-noise ratio (SNR) and task requirements, we introduce a multi-layer hybrid bit and semantic communication framework. We adopt a semantic communication efficiency metric and aim to maximize it by jointly optimizing UAV trajectory, energy harvesting base station (EHBS) selection, user association, semantic mode selection, and energy harvesting time allocation. To address this complex long-term optimization problem, we adopt the distributional soft actor-critic (DSAC) algorithm and add a decision assistant to further enhance the convergence performance of DSAC. Simulation results validate the effectiveness of the proposed method and framework and demonstrate that our algorithm can achieve superior long-term optimization performance in dynamic network environments.

IT · May 24

Secure Semantic Communication over Wiretap Channels: Rate-Distortion-Equivocation Tradeoff

Denis Kozlov, Mahtab Mirmohseni, Rahim Tafazolli

This paper investigates an information-theoretic model of secure semantic-aware communication. For this purpose, we consider the lossy joint source-channel coding (JSCC) of a memoryless semantic source transmitted over a memoryless wiretap channel. The source consists of two correlated parts that represent semantic and observed aspects of the information. Our model assumes separate fidelity and secrecy constraints on each source component and, in addition, encompasses two cases for the source output, in order to evaluate the performance gains if the encoder has an extended access to the source. Specifically, in Case 1, the encoder has direct access only to the samples from a single (observed) source component, while in Case 2 it has additional direct access to the samples of the underlying semantic information. We derive single-letter converse and achievability bounds on the rate-distortion-equivocation region. The converse bound explicitly contains rate-distortion functions, making it easy to evaluate, especially for some common distributions. The proposed achievability coding scheme involves novel stochastic superposition coding with two private parts to enable analysis of the equivocation for each source component, separately. Our results generalise some of the previously established source and source-channel coding problems. The general results are further specialised to Gaussian and Bernoulli sources transmitted over Gaussian and binary wiretap channels, respectively. The numerical evaluations illustrate the derived bounds for these distributions.

IT · Mar 3, 2023

Collaborative Learning with a Drone Orchestrator

Mahdi Boloursaz Mashhadi, Mahnoosh Mahdavimoghadam, Rahim Tafazolli et al.

In this paper, the problem of drone-assisted collaborative learning is considered. In this scenario, a swarm of intelligent wireless devices trains a shared neural network (NN) model with the help of a drone. Using its sensors, each device records samples from its environment to gather a local dataset for training. The training data is severely heterogeneous, as devices differ in the amount of data they hold and in their sensor noise levels. The intelligent devices iteratively train the NN on their local datasets and exchange the model parameters with the drone for aggregation. For this system, the convergence rate of collaborative learning is derived while considering data heterogeneity, sensor noise levels, and communication errors. Then, the drone trajectory that maximizes the final accuracy of the trained NN is obtained. The proposed trajectory optimization approach is aware of both the devices' data characteristics (i.e., local dataset size and noise level) and their wireless channel conditions, and significantly improves the convergence rate and final accuracy in comparison with baselines that only consider data characteristics or channel conditions. Compared to state-of-the-art baselines, the proposed approach achieves an average 3.85% and 3.54% improvement in the final accuracy of the trained NN on benchmark datasets for image recognition and semantic segmentation tasks, respectively. Moreover, the proposed framework achieves a significant speedup in training, leading to an average 24% and 87% saving in the drone hovering time, communication overhead, and battery usage, respectively, for these tasks.

NI · Apr 14

Agentic AI for 6G: A New Paradigm for Autonomous RAN Security Compliance

Sotiris Chatzimiltis, Mahdi Boloursaz Mashhadi, Mohammad Shojafar et al.

Agentic AI systems are emerging as powerful tools for automating complex, multi-step tasks across various industries. One such industry is telecommunications, where the growing complexity of next-generation radio access networks (RANs) opens up numerous opportunities for applying these systems. Securing the RAN is a key area, particularly through automating the security compliance process, as traditional methods often struggle to keep pace with evolving specifications and real-time changes. In this article, we propose a framework that leverages LLM-based AI agents integrated with a retrieval-augmented generation (RAG) pipeline to enable intelligent and autonomous enforcement of security compliance. An initial case study demonstrates how an agent can assess configuration files for compliance with O-RAN Alliance and 3GPP standards, generate explainable justifications, and propose automated remediation if needed. We also highlight key challenges such as model hallucinations and vendor inconsistencies, along with considerations like agent security, transparency, and system trust. Finally, we outline future directions, emphasizing the need for telecom-specific LLMs and standardized evaluation frameworks.

LG · Sep 14, 2022

Convergence Acceleration in Wireless Federated Learning: A Stackelberg Game Approach

Kaidi Wang, Yi Ma, Mahdi Boloursaz Mashhadi et al.

This paper studies issues that arise with respect to the joint optimization for convergence time in federated learning over wireless networks (FLOWN). We consider the criterion and protocol for selection of participating devices in FLOWN under the energy constraint and derive its impact on device selection. In order to improve the training efficiency, age-of-information (AoI) enables FLOWN to assess the freshness of gradient updates among participants. Aiming to speed up convergence, we jointly investigate global loss minimization and latency minimization in a Stackelberg game based framework. Specifically, we formulate global loss minimization as a leader-level problem for reducing the number of required rounds, and latency minimization as a follower-level problem to reduce time consumption of each round. By decoupling the follower-level problem into two sub-problems, including resource allocation and sub-channel assignment, we achieve an optimal strategy of the follower through monotonic optimization and matching theory. At the leader-level, we derive an upper bound of convergence rate and subsequently reformulate the global loss minimization problem and propose a new age-of-update (AoU) based device selection algorithm. Simulation results indicate the superior performance of the proposed AoU based device selection scheme in terms of the convergence rate, as well as efficient utilization of available sub-channels.

CR · May 6 · Code

Secure Intellicise Wireless Network: Agentic AI for Coverless Semantic Steganography Communication

Rui Meng, Song Gao, Bingxuan Xu et al.

Semantic Communication (SemCom), leveraging its significant advantages in transmission efficiency and reliability, has emerged as a core technology for constructing future intellicise (intelligent and concise) wireless networks. However, intelligent attacks represented by semantic eavesdropping pose severe challenges to the security of SemCom. To address this challenge, Semantic Steganographic Communication (SemSteCom) achieves "invisible" encryption by implicitly embedding private semantic information into cover modality carriers. The state-of-the-art study has further introduced generative diffusion models to directly generate stega images without relying on original cover images, effectively enhancing steganographic capacity. Nevertheless, the recovery process of private images is highly dependent on the guidance of private semantic keys, which may be inferred by intelligent eavesdroppers, thereby introducing new security threats. To address this issue, we propose an Agentic AI-driven SemSteCom (AgentSemSteCom) scheme, which includes semantic extraction, digital token controlled reference image generation, coverless steganography, semantic codec, and optional task-oriented enhancement modules. The proposed AgentSemSteCom scheme obviates the need for both cover images and private semantic keys, thereby boosting steganographic capacity while reinforcing transmission security. The simulation results on open-source datasets verify that AgentSemSteCom achieves better transmission quality and higher security levels than the baseline scheme.

IT · Mar 2

Video TokenCom: Textual Intent-Guided Multi-Rate Video Token Communications with UEP-Based Adaptive Source-Channel Coding

Jingxuan Men, Mahdi Boloursaz Mashhadi, Ning Wang et al.

Token Communication (TokenCom) is a new paradigm, motivated by the recent success of Large AI Models (LAMs) and Multimodal Large Language Models (MLLMs), where tokens serve as unified units of communication and computation, enabling efficient semantic- and goal-oriented information exchange in future wireless networks. In this paper, we propose a novel Video TokenCom framework for textual intent-guided multi-rate video communication with Unequal Error Protection (UEP)-based source-channel coding adaptation. The proposed framework integrates user-intended textual descriptions with discrete video tokenization and unequal error protection to enhance semantic fidelity under restrictive bandwidth constraints. First, discrete video tokens are extracted through a pretrained video tokenizer, while text-conditioned vision-language modeling and optical-flow propagation are jointly used to identify tokens that correspond to user-intended semantics across space and time. Next, we introduce a semantic-aware multi-rate bit-allocation strategy, in which tokens highly related to the user intent are encoded using full codebook precision, whereas non-intended tokens are represented through reduced-codebook-precision differential encoding, enabling rate savings while preserving semantic quality. Finally, a source and channel coding adaptation scheme is developed to adapt bit allocation and channel coding to varying resources and link conditions. Experiments on various video datasets demonstrate that the proposed framework outperforms both conventional and semantic communication baselines in perceptual and semantic quality over a wide SNR range.

IT · Apr 21

Reliable Remote Inference from Unreliable Components: Joint Communication and Computation Limits

Zhenyu Liu, Yi Ma, Rahim Tafazolli

Classical information theory typically assumes reliable receiver-side processing. We study remote inference when communication is noisy and the receiver itself is built from unreliable components under a finite redundancy budget. Under a committed/no-bypass receiver closure, task-relevant information can affect the final estimate only by passing through a budgeted collection of vulnerable primitives unless an explicit protected bypass is modeled. Modeling each vulnerable primitive as a memoryless noisy channel yields a baseline supply-demand converse: the task-relevant information needed to attain a target distortion cannot exceed the smaller of the total information supplied by the communication channel and the total information supplied by the vulnerable compute budget. Our main converse shows that committed intermediate interfaces create additional first-order serial cuts and receiver-internal computation-graph cuts, captured in general by a receiver-internal compute min-cut converse. In particular, the twofold loss in the symmetric two-stage hard-separation special case is not inherent to unreliable receiver computation but induced by hard-separation under the committed/no-bypass closure. This extra first-order tax is therefore closure-dependent rather than universal. On the converse side, if downstream modules retain soft visibility to the raw channel output, the converse reduces to the single-bottleneck supply, up to any explicitly reserved soft-path budget. Under a separate stronger protected-support closure with reliable decoder and control support, we establish achievability results for task-direct and serial hard-separation constructions. For the fully noisy-logic regime, we obtain only a conservative depth-dependent converse, and matched achievability remains open.
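
The baseline supply-demand converse stated in this abstract can be written schematically as follows. The symbols are illustrative assumptions, not the paper's notation: $R_{\mathrm{task}}(D)$ stands for the task-relevant information needed to reach target distortion $D$, $I_{\mathrm{comm}}$ for the total information supplied by the communication channel, and $I_{\mathrm{comp}}$ for the total information supplied by the vulnerable compute budget.

```latex
% Schematic form only; symbol names are assumed for illustration.
\[
  R_{\mathrm{task}}(D) \;\le\; \min\bigl\{\, I_{\mathrm{comm}},\; I_{\mathrm{comp}} \,\bigr\}
\]
```

Read this way, a target distortion is ruled out whenever either supply alone falls short of the demand. The additional serial and min-cut terms mentioned in the abstract are not captured by this sketch.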

SP · Apr 20

Low-Complexity Tone Injection via Candidate Ranking for PAPR Reduction in OFDM and AFDM Systems

Yupeng Zheng, Ang Li, Jinfei Wang et al.

Tone injection (TI) is a promising distortionless PAPR reduction technique that incurs no spectral efficiency loss. However, state-of-the-art TI schemes based on random candidate generation or clipping noise spectrum suffer from fundamental limitations in PAPR performance. In this paper, we propose novel TI schemes compatible with both OFDM and AFDM systems. The proposed schemes iteratively update the TI sequence via a candidate ranking procedure guided by time-domain local peaks. This accurately selects effective candidates while achieving a complexity comparable to that of the fast Fourier transform. Depth-first search is further integrated to enhance PAPR performance by exploiting the tree structure of the process. Simulations demonstrate that the proposed schemes achieve over 1 dB PAPR gain over baseline TI schemes at comparable complexity. The gain is consistent across various numbers of subcarriers under controlled per-iteration complexities, confirming a superior performance-complexity trade-off for both OFDM and AFDM.
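
For readers unfamiliar with the metric, the sketch below computes the peak-to-average power ratio (PAPR) that tone injection aims to reduce. It is not the proposed TI scheme, only the quantity being optimized. The function names `idft` and `papr_db` and the 8-subcarrier toy blocks are assumptions made for illustration.

```python
import cmath
import math


def idft(symbols):
    """Naive O(N^2) inverse DFT: frequency-domain symbols -> time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """Peak-to-average power ratio of one time-domain block, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    average = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / average)


# Identical symbols on all 8 subcarriers add coherently at t = 0 (high PAPR).
aligned = idft([1 + 0j] * 8)

# A single active subcarrier gives a constant-envelope tone (PAPR of 0 dB).
single = idft([1 + 0j] + [0j] * 7)

print(round(papr_db(aligned), 2), round(papr_db(single), 2))
```

The coherent case reaches 10·log10(8) dB, which is the worst case for 8 equal-power subcarriers. Schemes such as TI search for equivalent constellation points that avoid this kind of alignment.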

LG · Feb 12

Wireless TokenCom: RL-Based Tokenizer Agreement for Multi-User Wireless Token Communications

Farshad Zeinali, Mahdi Boloursaz Mashhadi, Dusit Niyato et al.

Token Communications (TokenCom) has recently emerged as an effective new paradigm, where tokens are the unified units of multimodal communications and computations, enabling efficient digital semantic- and goal-oriented communications in future wireless networks. To establish a shared semantic latent space, the transmitters/receivers in TokenCom need to agree on an identical tokenizer model and codebook. To this end, an initial Tokenizer Agreement (TA) process is carried out in each communication episode, where the transmitter/receiver cooperate to choose from a set of pre-trained tokenizer models/codebooks available to them both for efficient TokenCom. In this correspondence, we investigate TA in a multi-user downlink wireless TokenCom scenario, where the base station equipped with multiple antennas transmits video token streams to multiple users. We formulate the corresponding mixed-integer non-convex problem, and propose a hybrid reinforcement learning (RL) framework that integrates a deep Q-network (DQN) for joint tokenizer agreement and sub-channel assignment, with a deep deterministic policy gradient (DDPG) for beamforming. Simulation results show that the proposed framework outperforms baseline methods in terms of semantic quality and resource efficiency, while reducing the freezing events in video transmission by 68% compared to the conventional H.265-based scheme.

IT · Apr 4

Region-Based Constellation Designs for Constructive Interference Precoding in MU-MIMO

Yupeng Zheng, Chunmei Xu, Jinfei Wang et al.

The performance of constructive interference precoding (CIP) for multi-user multi-antenna (MU-MIMO) systems is governed by the structure of the constructive interference (CI) regions, yet this is overlooked in conventional constellation design. This work proposes the region-based constellation (RBC) model to lay the foundation for CIP constellation design. An RBC directly defines the mapping between messages and their feasible regions, instead of deriving them from an existing constellation. To provide insight for RBC design, we study the limitations of quadrature-amplitude-modulation (QAM)-based CIP. Analytical results show that the restrictive CI regions of QAM symbols are systematically misaligned with the objective-minimising sign pattern, resulting in a significant gap to the theoretical performance limit. From the perspective of improving sign alignment, two novel RBC schemes with non-convex feasible regions are proposed, namely mirrored-ends QAM (ME-QAM) and real-extended ME-QAM. A low-complexity algorithm is also developed for the resulting mixed-integer quadratic program, achieving a complexity comparable to QAM-based CIP. Simulation results with constellation sizes $\{16, 64\}$ demonstrate up to 4 dB signal-to-noise-ratio gain of the proposed schemes over QAM-based CIP. The proposed RBC model is also applicable to other systems with non-bijective modulation, representing a promising direction for future research.

IT · May 14

Secure Joint Source-Channel Coding of Multimodal Semantic Sources

Denis Kozlov, Mahtab Mirmohseni, Rahim Tafazolli

We study the problem of secure joint source-channel coding for multimodal semantic sources transmitted over noisy wiretap channels. The source model consists of $m$ modalities (e.g., image, audio, and sensor data), all represented as random variables. The encoder observes independent and identically distributed samples of an arbitrary non-empty subset of modalities. The samples are encoded and transmitted over a discrete memoryless wiretap channel. The legitimate receiver reconstructs all modalities. We extend the rate-distortion-perception problem formulation to multimodal sources. We establish converse and achievability bounds on the fundamental limits of transmission rate, fidelity, and secrecy, under per-modality distortion and perception constraints, and per-subset equivocation constraints. We show that the fundamental limit for secrecy consists of three operationally distinct components: the level of compression, the secret key rate, and the statistics of the wiretap channel.

SY · May 4

Executor-Side Progressive Risk-Gated Actuation for Agentic AI in Wireless Supervisory Control

Zhenyu Liu, Yi Ma, Rahim Tafazolli

Agentic artificial intelligence (AI) shows promise for automating O-RAN wireless supervisory control, but translated intents still require an executor-side decision before live network actuation. Existing control flows lack explicit semantics for whether an intent should commit, gate for evidence, or reject under stale telemetry, concurrent policies, deadline and bandwidth limits, and rollback constraints. We propose Progressive Risk-Gated Actuation (PRGA), an executor-side contract for risk-gated wireless intent execution. PRGA structures each intent into executable local triage (C0), on-demand coordination evidence (C1), and post-hoc provenance support (C2), with C2 kept off the online safety path. A deterministic two-stage policy checks expiry, freshness, rollback-handle validity, local conflict, blocking preconditions, and planner-executor risk divergence from C0, then retrieves C1 only for gated intents when deadline and bandwidth budgets allow; evidence-mandatory gates reject when required C1 is unavailable. On two 3GPP-parameterized energy-saving and slice-SLA benchmarks, PRGA reduces time-to-first-safe-action by 23.3-27.4% and per-commit control-plane bytes by 52.7-54.2% against a decision-identical eager full-evidence cost-overlay comparator, thereby isolating retrieval-cost accounting; remains non-inferior within a pre-declared 0.5 percentage-point unsafe-action margin against an invariant-respecting static-threshold comparator; and rejects 100% of injected over-threshold stale inputs in the stale-state fault campaign. On these benchmarks, PRGA improves supervisory responsiveness and control-plane efficiency within the evaluated unsafe-action boundary.
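
The two-stage commit/gate/reject flow described in this abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the `C0Triage` fields, the `decide` function, the `divergence_limit` value, and the choice of which failed checks reject outright versus gate are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class C0Triage:
    """Hypothetical local triage record (C0) carried with an intent."""
    expired: bool
    telemetry_fresh: bool
    rollback_handle_valid: bool
    local_conflict: bool
    blocking_precondition: bool
    risk_divergence: float   # planner vs. executor risk gap (assumed scalar)
    evidence_mandatory: bool


def decide(c0: C0Triage, c1_supports: Optional[bool], budget_ok: bool,
           divergence_limit: float = 0.2) -> str:
    """Return "commit", "gate", or "reject".

    c1_supports is None when coordination evidence (C1) is unavailable.
    """
    # Stage 1: checks on C0 only. Treating these three failures as hard
    # rejects is an assumption of this sketch.
    if c0.expired or not c0.telemetry_fresh or not c0.rollback_handle_valid:
        return "reject"
    gated = (c0.local_conflict or c0.blocking_precondition
             or c0.risk_divergence > divergence_limit)
    if not gated:
        return "commit"
    # Stage 2: C1 is consulted only for gated intents, and only if the
    # deadline and bandwidth budgets allow retrieval.
    evidence = c1_supports if budget_ok else None
    if evidence is None:
        # Evidence-mandatory gates reject when required C1 is unavailable.
        return "reject" if c0.evidence_mandatory else "gate"
    return "commit" if evidence else "reject"


clean = C0Triage(False, True, True, False, False, 0.0, False)
stale = C0Triage(False, False, True, False, False, 0.0, False)
conflicted = C0Triage(False, True, True, True, False, 0.0, True)
print(decide(clean, None, True), decide(stale, True, True),
      decide(conflicted, None, True))
```

Because the policy is a pure function of C0, the optional evidence, and the budget flag, the same inputs always yield the same decision, which matches the abstract's description of the policy as deterministic. Post-hoc provenance (C2) is deliberately absent, since the abstract keeps it off the online safety path.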

MM · Feb 17, 2025

Token Communications: A Large Model-Driven Framework for Cross-modal Context-aware Semantic Communications

Li Qiao, Mahdi Boloursaz Mashhadi, Zhen Gao et al.

In this paper, we introduce token communications (TokCom), a large model-driven framework to leverage cross-modal context information in generative semantic communications (GenSC). TokCom is a new paradigm, motivated by the recent success of generative foundation models and multimodal large language models (GFM/MLLMs), where the communication units are tokens, enabling efficient transformer-based token processing at the transmitter and receiver. We outline the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLMs-based token processing into semantic communication systems to leverage cross-modal context effectively at affordable complexity, and present the key principles for efficient TokCom at various layers in future wireless networks. In a typical image semantic communication setup, we demonstrate a significant improvement in bandwidth efficiency, achieved by TokCom by leveraging the context information among tokens. Finally, potential research directions are identified to facilitate adoption of TokCom in future wireless networks.

CR · Jan 31, 2025

Secured Communication Schemes for UAVs in 5G: CRYSTALS-Kyber and IDS

Taneya Sharma, Seyed Ahmad Soleymani, Mohammad Shojafar et al.

This paper introduces a secure communication architecture for Unmanned Aerial Vehicles (UAVs) and ground stations in 5G networks, addressing critical challenges in network security. The proposed solution integrates the Advanced Encryption Standard (AES) with Elliptic Curve Cryptography (ECC) and CRYSTALS-Kyber for key encapsulation, offering a hybrid cryptographic approach. By incorporating CRYSTALS-Kyber, the framework mitigates vulnerabilities in ECC against quantum attacks, positioning it as a quantum-resistant alternative. The architecture is based on a server-client model, with UAVs functioning as clients and the ground station acting as the server. The system was rigorously evaluated in both VPN and 5G environments. Experimental results confirm that CRYSTALS-Kyber delivers strong protection against quantum threats with minimal performance overhead, making it highly suitable for UAVs with resource constraints. Moreover, the proposed architecture integrates an Artificial Intelligence (AI)-based Intrusion Detection System (IDS) to further enhance security. In performance evaluations, the IDS demonstrated strong results across multiple models with XGBoost, particularly in more demanding scenarios, outperforming other models with an accuracy of 97.33% and an AUC of 0.94. These findings underscore the potential of combining quantum-resistant encryption mechanisms with AI-driven IDS to create a robust, scalable, and secure communication framework for UAV networks, particularly within the high-performance requirements of 5G environments.

IT · Nov 4, 2024

Communicate Less, Synthesize the Rest: Latency-aware Intent-based Generative Semantic Multicasting with Diffusion Models

Xinkai Liu, Mahdi Boloursaz Mashhadi, Li Qiao et al.

Generative diffusion models (GDMs) have recently shown great success in synthesizing multimedia signals with high perceptual quality, enabling highly efficient semantic communications in future wireless networks. In this paper, we develop an intent-aware generative semantic multicasting framework utilizing pre-trained diffusion models. In the proposed framework, the transmitter decomposes the source signal into multiple semantic classes based on the multi-user intent, i.e. each user is assumed to be interested in details of only a subset of the semantic classes. To better utilize the wireless resources, the transmitter sends to each user only its intended classes, and multicasts a highly compressed semantic map to all users over shared wireless resources that allows them to locally synthesize the other classes, namely non-intended classes, utilizing pre-trained diffusion models. The signal retrieved at each user is thereby partially reconstructed and partially synthesized utilizing the received semantic map. We design a communication/computation-aware scheme for per-class adaptation of the communication parameters, such as the transmission power and compression rate, to minimize the total latency of retrieving signals at multiple receivers, tailored to the prevailing channel conditions as well as the users' reconstruction/synthesis distortion/perception requirements. The simulation results demonstrate significantly reduced per-user latency compared with non-generative and intent-unaware multicasting benchmarks while maintaining high perceptual quality of the signals retrieved at the users.

IT · Oct 28, 2025

Resi-VidTok: An Efficient and Decomposed Progressive Tokenization Framework for Ultra-Low-Rate and Lightweight Video Transmission

Zhenyu Liu, Yi Ma, Rahim Tafazolli et al.

Real-time transmission of video over wireless networks remains highly challenging, even with advanced deep models, particularly under severe channel conditions such as limited bandwidth and weak connectivity. In this paper, we propose Resi-VidTok, a Resilient Tokenization-Enabled framework designed for ultra-low-rate and lightweight video transmission that delivers strong robustness while preserving perceptual and semantic fidelity on commodity digital hardware. By reorganizing spatio-temporal content into a discrete, importance-ordered token stream composed of key tokens and refinement tokens, Resi-VidTok enables progressive encoding, prefix-decodable reconstruction, and graceful quality degradation under constrained channels. A key contribution is a resilient 1D tokenization pipeline for video that integrates differential temporal token coding, explicitly supporting reliable recovery from incomplete token sets using a single shared framewise decoder, without auxiliary temporal extractors or heavy generative models. Furthermore, stride-controlled frame sparsification combined with a lightweight decoder-side interpolator reduces transmission load while maintaining motion continuity. Finally, a channel-adaptive source-channel coding and modulation scheme dynamically allocates rate and protection according to token importance and channel condition, yielding stable quality across adverse SNRs. Evaluation results indicate robust visual and semantic consistency at channel bandwidth ratios (CBR) as low as 0.0004 and real-time reconstruction at over 30 fps, demonstrating the practicality of Resi-VidTok for energy-efficient, latency-sensitive, and reliability-critical wireless applications.

IV · Oct 15, 2025

Semantic Communication Enabled Holographic Video Processing and Transmission

Jingkai Ying, Zhiyuan Qi, Yulong Feng et al.

Holographic video communication is considered a paradigm shift in visual communications, becoming increasingly popular for its ability to offer immersive experiences. This article provides an overview of holographic video communication and outlines the requirements of a holographic video communication system. Particularly, following a brief review of semantic communication, an architecture for a semantic-enabled holographic video communication system is presented. Key technologies, including semantic sampling, joint semantic-channel coding, and semantic-aware transmission, are designed based on the proposed architecture. Two related use cases are presented to demonstrate the performance gain of the proposed methods. Finally, potential research topics are discussed to pave the way for the realization of semantic-enabled holographic video communications.

CR · Jul 27, 2025

Interpretable Anomaly-Based DDoS Detection in AI-RAN with XAI and LLMs

Sotiris Chatzimiltis, Mohammad Shojafar, Mahdi Boloursaz Mashhadi et al.

Next-generation Radio Access Networks (RANs) introduce programmability, intelligence, and near real-time control through intelligent controllers, enabling enhanced security within the RAN and across broader 5G/6G infrastructures. This paper presents a comprehensive survey highlighting opportunities, challenges, and research gaps for Large Language Model (LLM)-assisted explainable AI (XAI) intrusion detection systems (IDS) for secure future RAN environments. Motivated by this, we propose an LLM-interpretable anomaly-based detection system for distributed denial-of-service (DDoS) attacks using multivariate time series key performance measures (KPMs), extracted from E2 nodes, within the Near Real-Time RAN Intelligent Controller (Near-RT RIC). An LSTM-based model is trained to identify malicious User Equipment (UE) behavior based on these KPMs. To enhance transparency, we apply post-hoc local explainability methods such as LIME and SHAP to interpret individual predictions. Furthermore, LLMs are employed to convert technical explanations into natural-language insights accessible to non-expert users. Experimental results on real 5G network KPMs demonstrate that our framework achieves high detection accuracy (F1-score > 0.96) while delivering actionable and interpretable outputs.
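
As a reminder of what the reported F1-score measures, the sketch below computes it from confusion-matrix counts. The function name `f1_score` and the counts used in the example are made up for illustration and are not taken from the paper.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical detector: 97 attacks caught, 3 false alarms, 4 attacks missed.
example = f1_score(true_pos=97, false_pos=3, false_neg=4)
print(example > 0.96)
```

F1 ignores true negatives, which makes it a common choice for attack detection, where benign traffic usually dominates and plain accuracy can look high even for a weak detector.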

CV · Jun 3, 2025

Channel-adaptive Cross-modal Generative Semantic Communication for Point Cloud Transmission

Wanting Yang, Zehui Xiong, Qianqian Yang et al.

With the rapid development of autonomous driving and extended reality, efficient transmission of point clouds (PCs) has become increasingly important. In this context, we propose a novel channel-adaptive cross-modal generative semantic communication (SemCom) framework for PC transmission, called GenSeC-PC. GenSeC-PC employs a semantic encoder that fuses images and point clouds, where images serve as non-transmitted side information. Meanwhile, the decoder is built upon the backbone of PointDif. Such a cross-modal design not only ensures high compression efficiency but also delivers superior reconstruction performance compared to PointDif. Moreover, to ensure robust transmission and reduce system complexity, we design a streamlined and asymmetric channel-adaptive joint semantic-channel coding architecture, where only the encoder needs the feedback of average signal-to-noise ratio (SNR) and available bandwidth. In addition, rectified denoising diffusion implicit models are employed to accelerate the decoding process to the millisecond level, enabling real-time PC communication. Unlike existing methods, GenSeC-PC leverages generative priors to ensure reliable reconstruction even from noisy or incomplete source PCs. More importantly, it supports fully analog transmission, improving compression efficiency by eliminating the need for error-free side information transmission common in prior SemCom approaches. Simulation results confirm the effectiveness of cross-modal semantic extraction and dual-metric guided fine-tuning, highlighting the framework's robustness across diverse conditions, including low SNR, bandwidth limitations, varying numbers of 2D images, and previously unseen objects.

LGApr 17, 2024

Use of Parallel Explanatory Models to Enhance Transparency of Neural Network Configurations for Cell Degradation Detection

David Mulvey, Chuan Heng Foh, Muhammad Ali Imran et al.

In a previous paper, we have shown that a recurrent neural network (RNN) can be used to detect cellular network radio signal degradations accurately. We unexpectedly found, though, that accuracy gains diminished as we added layers to the RNN. To investigate this, in this paper, we build a parallel model to illuminate and understand the internal operation of neural networks, such as the RNN, which store their internal state in order to process sequential inputs. This model is widely applicable in that it can be used with any input domain where the inputs can be represented by a Gaussian mixture. By looking at the RNN processing from a probability density function perspective, we are able to show how each layer of the RNN transforms the input distributions to increase detection accuracy. At the same time we also discover a side effect acting to limit the improvement in accuracy. To demonstrate the fidelity of the model we validate it against each stage of RNN processing as well as the output predictions. As a result, we have been able to explain the reasons for the RNN performance limits with useful insights for future designs for RNNs and similar types of neural network.

CRJun 6, 2020

Online Advertising Security: Issues, Taxonomy, and Future Directions

Zahra Pooranian, Mauro Conti, Hamed Haddadi et al.

Online advertising has become the backbone of the Internet economy by revolutionizing business marketing. It provides a simple and efficient way for advertisers to display their advertisements to specific individual users, and over the last couple of years has contributed to an explosion in the income stream for several web-based businesses. For example, Google's income from advertising grew 51.6% between 2016 and 2018, to $136.8 billion. This exponential growth in advertising revenue has motivated fraudsters to exploit the weaknesses of the online advertising model to make money, and researchers to discover new security vulnerabilities in the model, to propose countermeasures and to forecast future trends in research. Motivated by these considerations, this paper presents a comprehensive review of the security threats to online advertising systems. We begin by introducing the motivation for online advertising systems, explaining how they differ from traditional advertising networks, introducing terminology, and defining the current online advertising architecture. We then devise a comprehensive taxonomy of attacks on online advertising to raise awareness among researchers about the vulnerabilities of the online advertising ecosystem. We discuss the limitations and effectiveness of the countermeasures that have been developed to secure entities in the advertising ecosystem against these attacks. To complete our work, we identify some open issues and outline some possible directions for future research towards improving security methods for online advertising systems.

SPApr 14, 2020

On Deep Learning Solutions for Joint Transmitter and Noncoherent Receiver Design in MU-MIMO Systems

Songyan Xue, Yi Ma, Na Yi et al.

This paper aims to handle the joint transmitter and noncoherent receiver design for multiuser multiple-input multiple-output (MU-MIMO) systems through deep learning. Given the deep neural network (DNN) based noncoherent receiver, the novelty of this work mainly lies in the multiuser waveform design at the transmitter side. According to the signal format, the proposed deep learning solutions can be divided into two groups. One group is called pilot-aided waveform, where the information-bearing symbols are time-multiplexed with the pilot symbols. The other is called learning-based waveform, where the multiuser waveform is partially or even completely designed by deep learning algorithms. Specifically, if the information-bearing symbols are directly embedded in the waveform, it is called systematic waveform. Otherwise, it is called non-systematic waveform, where no artificial design is involved. Simulation results show that the pilot-aided waveform design outperforms the conventional zero-forcing receiver with least squares (LS) channel estimation on small-size MU-MIMO systems. By exploiting the time-domain degrees of freedom (DoF), the learning-based waveform design further improves the detection performance by at least 5 dB in the high signal-to-noise ratio (SNR) range. Moreover, it is found that the traditional weight initialization method might cause a training imbalance among different users in the learning-based waveform design. To tackle this issue, a novel weight initialization method is proposed which provides a balanced convergence performance with no complexity penalty.