LGApr 28, 2025
FineQ: Software-Hardware Co-Design for Low-Bit Fine-Grained Mixed-Precision Quantization of LLMsXilong Xie, Liang Wang, Limin Xiao et al.
Large language models (LLMs) have significantly advanced the natural language processing paradigm but impose substantial demands on memory and computational resources. Quantization is one of the most effective ways to reduce memory consumption of LLMs. However, advanced single-precision quantization methods experience significant accuracy degradation when quantizing to ultra-low bits. Existing mixed-precision quantization methods are quantized by groups with coarse granularity. Employing high precision for group data leads to substantial memory overhead, whereas low precision severely impacts model accuracy. To address this issue, we propose FineQ, software-hardware co-design for low-bit fine-grained mixed-precision quantization of LLMs. First, FineQ partitions the weights into finer-grained clusters and considers the distribution of outliers within these clusters, thus achieving a balance between model accuracy and memory overhead. Then, we propose an outlier protection mechanism within clusters that uses 3 bits to represent outliers and introduce an encoding scheme for index and data concatenation to enable aligned memory access. Finally, we introduce an accelerator utilizing temporal coding that effectively supports the quantization algorithm while simplifying the multipliers in the systolic array. FineQ achieves higher model accuracy compared to the SOTA mixed-precision quantization algorithm at a close average bit-width. Meanwhile, the accelerator achieves up to 1.79x energy efficiency and reduces the area of the systolic array by 61.2%.
CVMar 3, 2020
Fully Convolutional Networks for Automatically Generating Image Masks to Train Mask R-CNNHao Wu, Jan Paul Siebert, Xiangrong Xu
This paper proposes a novel automatically generating image masks method for the state-of-the-art Mask R-CNN deep learning method. The Mask R-CNN method achieves the best results in object detection until now, however, it is very time-consuming and laborious to get the object Masks for training, the proposed method is composed by a two-stage design, to automatically generating image masks, the first stage implements a fully convolutional networks (FCN) based segmentation network, the second stage network, a Mask R-CNN based object detection network, which is trained on the object image masks from FCN output, the original input image, and additional label information. Through experimentation, our proposed method can obtain the image masks automatically to train Mask R-CNN, and it can achieve very high classification accuracy with an over 90% mean of average precision (mAP) for segmentation
CVMay 16, 2019
Uneven illumination surface defects inspection based on convolutional neural networkHao Wu, Yulong Liu, Wenbin Gao et al.
Surface defect inspection based on machine vision is often affected by uneven illumination. In order to improve the inspection rate of surface defects inspection under uneven illumination condition, this paper proposes a method for detecting surface image defects based on convolutional neural network, which is based on the adjustment of convolutional neural networks, training parameters, changing the structure of the network, to achieve the purpose of accurately identifying various defects. Experimental on defect inspection of copper strip and steel images shows that the convolutional neural network can automatically learn features without preprocessing the image, and correct identification of various types of image defects affected by uneven illumination, thus overcoming the drawbacks of traditional machine vision inspection methods under uneven illumination.