19.9 IT Apr 21
LLM-Viterbi: Semantic-Aware Decoding for Convolutional Codes
Zhengtong Li, Chentao Yue, Jiafu Hao et al.
Traditional wireless communications rely solely on bit-level channel coding for error correction, without exploiting the inherent linguistic structure of the data source. This paper proposes a large language model (LLM) Viterbi decoder that integrates LLM priors into Viterbi decoding for text transmission over AWGN channels. The proposed decoder maintains multiple candidate paths during Viterbi decoding and periodically evaluates path reliabilities using a fine-tuned Byte-level T5 (ByT5) language model. By combining channel reliability metrics with the semantic probability from the LLM, it outputs the path that maximizes the joint likelihood of channel observations and linguistic coherence. Simulations show that the decoder achieves significant performance gains over conventional Viterbi decoding in terms of both block error rate (BLER) and semantic similarity. For convolutional codes with constraint length 3, it achieves approximately 1.5 dB of additional coding gain in BLER and an improvement of over 50% in semantic similarity. The framework can be extended to other structured data sources beyond text.
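The path-selection idea in this abstract can be illustrated with a minimal sketch. It assumes each surviving path already carries a channel log-likelihood and a language-model log-probability; the function name `select_path`, the `lm_weight` parameter, and the example numbers are hypothetical and are not taken from the paper, which may combine the two metrics differently.

```python
def select_path(candidates, lm_weight=1.0):
    """Return the candidate with the highest joint score.

    candidates: list of (text, channel_loglik, lm_logprob) tuples.
    lm_weight:  hypothetical knob balancing the language-model term
                against the channel term (an assumption of this sketch).
    """
    def joint_score(candidate):
        _, channel_loglik, lm_logprob = candidate
        # Adding log-domain terms corresponds to multiplying likelihoods.
        return channel_loglik + lm_weight * lm_logprob

    return max(candidates, key=joint_score)


# Made-up numbers: the second path fits the channel best,
# but the first path is far more plausible as text.
candidates = [
    ("the cat sat", -12.0, -3.0),
    ("the cat sbt", -11.5, -9.0),
    ("thf cat sat", -13.0, -8.0),
]

best_joint = select_path(candidates)
best_channel_only = select_path(candidates, lm_weight=0.0)
```

With `lm_weight=0.0` the selection reduces to a channel-only choice, which is a convenient way to compare the two behaviours on the same candidate list.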
11.6 IT Apr 24
Semantic Error Correction and Decoding for Short Block Channel Codes
Jiafu Hao, Chentao Yue, Wanchun Liu et al.
This paper presents a semantic-enhanced receiver framework for transmitting natural language sentences over noisy wireless channels using multiple short block codes. After ASCII encoding, the sentence is divided into segments, each independently encoded with a short block code and transmitted over an AWGN channel. At the receiver, segments are decoded in parallel, followed by a semantic error correction (SEC) model, which reconstructs corrupted segments using language model context. We further propose semantic list decoding (SLD), which generates multiple candidate reconstructions and selects the best one via weighted Hamming distance, and a semantic confidence-guided HARQ (SHARQ) mechanism that replaces CRC-based error detection with a confidence score, enabling selective segment retransmission without CRC overhead. All modules are designed and trained using bidirectional and auto-regressive transformers (BART). Simulation results demonstrate that the proposed scheme significantly outperforms conventional capacity-approaching short codes and long codes at the same rate. Specifically, SEC provides approximately 0.4 dB BLER gain over plain short-code transmission, while SLD extends this to 0.8 dB. Compared to transmitting the entire sentence as a single long 5G LDPC codeword, our approach significantly improves semantic fidelity and reduces decoding latency by up to 90%. SHARQ further provides an additional 1.5 dB gain over conventional HARQ.
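The candidate-selection step of SLD can be sketched as follows. The abstract does not define the weights, so this sketch assumes they are per-bit reliabilities (for example, LLR magnitudes) and that the distance is measured against the hard-decision received bits; the function names and the numbers are hypothetical.

```python
def weighted_hamming(candidate_bits, hard_bits, reliabilities):
    """Sum the reliabilities of the positions where the candidate
    disagrees with the hard-decision bits (assumed weighting)."""
    return sum(
        weight
        for cand, hard, weight in zip(candidate_bits, hard_bits, reliabilities)
        if cand != hard
    )


def select_candidate(candidates, hard_bits, reliabilities):
    """Pick the candidate reconstruction with the smallest weighted distance."""
    return min(
        candidates,
        key=lambda bits: weighted_hamming(bits, hard_bits, reliabilities),
    )


hard_bits = [1, 0, 1, 1, 0, 0]
reliabilities = [2.5, 0.2, 3.0, 0.1, 1.8, 2.2]  # made-up per-bit confidences
candidates = [
    [1, 1, 1, 0, 0, 0],  # flips two unreliable bits (positions 1 and 3)
    [0, 0, 1, 1, 0, 0],  # flips one reliable bit (position 0)
    [1, 0, 0, 1, 0, 0],  # flips one reliable bit (position 2)
]

chosen = select_candidate(candidates, hard_bits, reliabilities)
```

Under this weighting, a candidate that changes two low-confidence bits can beat one that changes a single high-confidence bit, which is the behaviour a plain Hamming distance cannot express.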
CV Nov 7, 2025
Medical Referring Image Segmentation via Next-Token Mask Prediction
Xinyu Chen, Yiran Wang, Gaoyang Pang et al.
Medical Referring Image Segmentation (MRIS) involves segmenting target regions in medical images based on natural language descriptions. While achieving promising results, recent approaches usually involve complex designs of multimodal fusion or multi-stage decoders. In this work, we propose NTP-MRISeg, a novel framework that reformulates MRIS as an autoregressive next-token prediction task over a unified multimodal sequence of tokenized image, text, and mask representations. This formulation streamlines model design by eliminating the need for modality-specific fusion and external segmentation models, and supports a unified architecture for end-to-end training. It also enables the use of pretrained tokenizers from emerging large-scale multimodal models, enhancing generalization and adaptability. More importantly, to address challenges under this formulation (such as exposure bias, long-tail token distributions, and fine-grained lesion edges), we propose three novel strategies: (1) a Next-k Token Prediction (NkTP) scheme to reduce cumulative prediction errors, (2) Token-level Contrastive Learning (TCL) to enhance boundary sensitivity and mitigate long-tail distribution effects, and (3) a memory-based Hard Error Token (HET) optimization strategy that emphasizes difficult tokens during training. Extensive experiments on the QaTa-COV19 and MosMedData+ datasets demonstrate that NTP-MRISeg achieves new state-of-the-art performance, offering a streamlined and effective alternative to traditional MRIS pipelines.
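As a rough illustration of the general idea behind predicting several future tokens per position, the sketch below builds the supervision targets for a token sequence. It is not the paper's NkTP scheme, whose details the abstract does not give; the function name `next_k_targets` and the token values are hypothetical.

```python
def next_k_targets(tokens, k):
    """For each position t, return the (up to) k tokens that follow it.

    With k=1 this is ordinary next-token supervision; larger k asks the
    model to look further ahead at every step (assumed interpretation).
    """
    return [tokens[t + 1 : t + 1 + k] for t in range(len(tokens) - 1)]


tokens = [5, 9, 2, 7]  # made-up mask-token ids
targets_k1 = next_k_targets(tokens, k=1)
targets_k2 = next_k_targets(tokens, k=2)
```

Positions near the end of the sequence receive fewer than k targets, so a training loss built on these lists would need to mask or skip the missing entries.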