CR | Jun 1
Implementation and Optimization of HQC Decoding on NPU-Integrated Devices
Vu Minh Chau, Nguyen Ngoc Kiet, Pham Quang Minh et al.
Hamming Quasi-Cyclic (HQC) has been selected by NIST for standardization as an additional code-based key-encapsulation mechanism, providing algorithmic diversity alongside lattice-based post-quantum cryptography. Efficient deployment of HQC on mobile and embedded platforms, however, requires careful optimization of its decoding procedure, whose Reed-Muller and Reed-Solomon components dominate the computational cost. This paper studies HQC decoding on Qualcomm Hexagon processors in NPU-integrated devices, focusing on the Hexagon Vector eXtensions (HVX) backend rather than a tensor-inference engine. We observe that HQC decoding naturally exposes vector-structured computation, including Reed-Muller reliability vectors, Hadamard-transform coefficients, Reed-Solomon syndrome vectors, finite-field products, and packed support-point evaluations. Based on this observation, we redesign the dominant decoding kernels around HVX-friendly data layouts and execution patterns, including a vectorized Reed-Muller Hadamard transform, scalar-equivalent peak selection, HVX-oriented finite-field arithmetic, vectorized syndrome computation, and shortened-support locator-root evaluation. We implement and evaluate the optimized decoder using both Hexagon simulator measurements and real-device experiments on a Snapdragon 8 Gen 2 hardware development kit. The results show that Hexagon/HVX-assisted decoding substantially reduces latency and energy consumption, improving energy efficiency by up to 18.13× while significantly offloading host CPU work. These results indicate that NPU-integrated mobile platforms can serve as effective backends for structured post-quantum cryptographic decoding when the underlying kernels are reformulated around vector execution.
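For readers unfamiliar with the Reed-Muller step named in this abstract, the following is a minimal scalar sketch of Hadamard-transform decoding with peak selection for a first-order Reed-Muller code RM(1, m). It is plain Python, not HVX code, and it is not taken from the paper: the function names (`rm1_encode`, `fwht`, `rm1_decode`), the hard-decision input, and the message layout (an m-bit mask `a` plus a constant bit `c`) are assumptions made for illustration.

```python
def rm1_encode(a, c, m):
    """Encode (a, c) as an RM(1, m) codeword: bit x is c XOR parity(a & x)."""
    return [c ^ (bin(a & x).count("1") & 1) for x in range(1 << m)]


def fwht(values):
    """Fast Walsh-Hadamard transform (natural ordering), O(n log n) butterflies."""
    out = list(values)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                u, v = out[i], out[i + h]
                out[i], out[i + h] = u + v, u - v
        h *= 2
    return out


def rm1_decode(received_bits):
    """Decode hard-decision bits by picking the largest-magnitude coefficient."""
    # Map bit 0 -> +1 and bit 1 -> -1 before transforming.
    signs = [1 - 2 * b for b in received_bits]
    spectrum = fwht(signs)
    # Peak selection: the index of the peak is the mask a,
    # and a negative peak means the constant bit c is 1.
    peak = max(range(len(spectrum)), key=lambda i: abs(spectrum[i]))
    c = 1 if spectrum[peak] < 0 else 0
    return peak, c


# Usage: encode with m = 7 (length 128), flip 20 bits, and decode.
codeword = rm1_encode(89, 1, 7)
noisy = list(codeword)
for pos in range(0, 100, 5):
    noisy[pos] ^= 1
decoded = rm1_decode(noisy)
```

The butterfly loop in `fwht` applies the same add/subtract pattern to many independent pairs, which is the kind of regular structure that lends itself to vector execution.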
LG | Mar 13, 2020 | Code
Dynamic transformation of prior knowledge into Bayesian models for data streams
Tran Xuan Bach, Nguyen Duc Anh, Ngo Van Linh et al.
We consider how to use prior knowledge effectively when learning a Bayesian model in streaming environments, where data arrive sequentially and without end. This problem is highly important in the era of data explosion and rich sources of valuable external knowledge such as pre-trained models, ontologies, Wikipedia, etc. We show that some existing approaches can forget any such knowledge very quickly. We then propose a novel framework that makes it possible to incorporate prior knowledge of different forms into a base Bayesian model for data streams. Our framework subsumes some existing popular models for time-series/dynamic data. Extensive experiments show that our framework outperforms existing methods by a large margin. In particular, our framework can help Bayesian models generalize well on extremely short text while other methods overfit. The implementation of our framework is available at https://github.com/bachtranxuan/TPS.git.
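One simple way to see how recursive Bayesian updating can wash out prior knowledge is a conjugate Beta-Bernoulli model, where each posterior becomes the prior for the next observation. The sketch below is an illustration under that assumption only; it is not the paper's framework, and the function name `stream_beta_updates` is hypothetical. It tracks the weight the initial prior carries in the posterior mean, which for a Beta(alpha, beta) prior after n observations is (alpha + beta) / (alpha + beta + n).

```python
def stream_beta_updates(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior one Bernoulli observation at a time.

    Returns the final parameters and, after each step, the weight the
    initial prior still carries in the posterior mean.
    """
    prior_strength = alpha + beta  # pseudo-count contributed by the prior
    weights = []
    for y in observations:
        # Conjugate update: a success increments alpha, a failure increments beta.
        alpha += y
        beta += 1 - y
        weights.append(prior_strength / (alpha + beta))
    return alpha, beta, weights


# Usage: a prior worth 10 pseudo-observations, followed by 90 streamed zeros.
final_alpha, final_beta, prior_weights = stream_beta_updates(8.0, 2.0, [0] * 90)
```

In this toy setting the prior's weight decays like 1/n, so after 90 observations it accounts for only a tenth of the posterior mean.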
CV | Feb 16, 2025
Learning to Stop Overthinking at Test Time
Hieu Tran Bao, Nguyen Cong Dat, Nguyen Duc Anh et al.
Test-time scaling is currently one of the most active research areas, showing promise now that training-time scaling has reached its limits. Deep-thinking (DT) models are a class of recurrent models that can perform easy-to-hard generalization by assigning more compute to harder test samples. However, because they cannot determine the complexity of a test sample, DT models have to use a large amount of computation for both easy and hard test samples. Excessive test-time computation is wasteful and can cause the "overthinking" problem, where more test-time computation leads to worse results. In this paper, we introduce a test-time training method for determining the optimal amount of computation needed for each sample at test time. We also propose Conv-LiGRU, a novel recurrent architecture for efficient and robust visual reasoning. Extensive experiments demonstrate that Conv-LiGRU is more stable than DT, effectively mitigates the "overthinking" phenomenon, and achieves superior accuracy.