IV Dec 4, 2024 Code
Video Quality Assessment: A Comprehensive Survey
Qi Zheng, Yibo Fan, Leilei Huang et al.
Video quality assessment (VQA) is an important processing task that aims to predict the quality of videos in a manner highly consistent with human judgments of perceived quality. Traditional VQA models based on natural image and/or video statistics, which are inspired both by models of projected images of the real world and by dual models of the human visual system, deliver only limited prediction performance on real-world user-generated content (UGC), as exemplified in recent large-scale VQA databases containing large numbers of diverse video contents crawled from the web. Fortunately, recent advances in deep neural networks and Large Multimodality Models (LMMs) have enabled significant progress in solving this problem, yielding better results than prior handcrafted models. Numerous deep learning-based VQA models have been developed, with progress in this direction driven by the creation of content-diverse, large-scale human-labeled databases that supply ground truth psychometric video quality data. Here, we present a comprehensive survey of recent progress in the development of VQA algorithms and the benchmarking studies and databases that make them possible. We also analyze open research directions on study design and VQA algorithm architectures. GitHub link: https://github.com/taco-group/Video-Quality-Assessment-A-Comprehensive-Survey.
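As a rough illustration of what "consistent with human judgments" can mean in practice, the sketch below computes the Spearman rank-order correlation coefficient (SROCC) between model predictions and mean opinion scores. The abstract does not name an evaluation metric, so the choice of SROCC is an assumption here, and the function names and sample scores are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def srocc(predicted, mos):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(predicted) != len(mos) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    return pearson(average_ranks(predicted), average_ranks(mos))


# Hypothetical scores for four videos (not taken from any database).
predicted_quality = [0.2, 0.9, 0.5, 0.7]
mean_opinion_scores = [1.5, 4.6, 3.9, 3.1]
print(round(srocc(predicted_quality, mean_opinion_scores), 3))
```

A value near 1 means the model orders the videos almost exactly as the human raters did. Because only ranks are compared, the differing scales of the two score lists do not matter.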
CV Dec 11, 2024 Code
Unicorn: Unified Neural Image Compression with One Number Reconstruction
Qi Zheng, Haozhi Wang, Zihao Liu et al.
Prevalent lossy image compression schemes can be divided into: 1) explicit image compression (EIC), including traditional standards and neural end-to-end algorithms; 2) implicit image compression (IIC) based on implicit neural representations (INR). The former is reaching an impasse, with bitrate reduction leveling off at the cost of tremendous complexity, while the latter suffers from excessively smoothed reconstructions as well as lengthy decoder models. In this paper, we propose an innovative paradigm, which we dub Unicorn (Unified Neural Image Compression with One Number Reconstruction). By conceptualizing the images as index-image pairs and learning the inherent distribution of pairs in a subtle neural network model, Unicorn can reconstruct a visually pleasing image from randomly generated noise with only one index number. The neural model serves as the unified decoder of images, while the noise and indexes correspond to explicit representations. As a proof of concept, we propose an effective and efficient prototype of Unicorn based on latent diffusion models with tailored model designs. Quantitative and qualitative experimental results demonstrate that our prototype achieves significant bitrate reduction compared with EIC and IIC algorithms. More impressively, benefiting from the unified decoder, our compression ratio escalates as the quantity of images increases. We envision that more advanced model designs will endow Unicorn with greater potential in image compression. We will release our code at https://github.com/uniqzheng/Unicorn-Laduree.
IV Oct 12, 2025 Code
JND-Guided Light-Weight Neural Pre-Filter for Perceptual Image Coding
Chenlong He, Zhijian Hao, Leilei Huang et al.
Just Noticeable Distortion (JND)-guided pre-filtering is a promising technique for improving the perceptual compression efficiency of image coding. However, existing methods are often computationally expensive, and the field lacks standardized benchmarks for fair comparison. To address these challenges, this paper makes two contributions. First, we develop and open-source FJNDF-Pytorch, a unified benchmark for frequency-domain JND-guided pre-filters. Second, leveraging this platform, we propose a complete learning framework for a novel, lightweight Convolutional Neural Network (CNN). Experimental results demonstrate that our proposed method achieves state-of-the-art compression efficiency, consistently outperforming competitors across multiple datasets and encoders. In terms of computational cost, our model is exceptionally lightweight, requiring only 7.15 GFLOPs to process a 1080p image, which is merely 14.1% of the cost of a recent lightweight network. Our work presents a robust, state-of-the-art solution that excels in both performance and efficiency, supported by a reproducible research platform. The open-source implementation is available at https://github.com/viplab-fudan/FJNDF-Pytorch.
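To put the reported cost in perspective, the short sketch below derives two figures implied by the abstract's numbers: the approximate cost of the network used for comparison and the per-pixel cost of the proposed model. The 1920x1080 frame size is an assumption about what "1080p" means here, and the function names are hypothetical; neither derived figure is stated in the abstract.

```python
# Figures quoted in the abstract.
PROPOSED_GFLOPS = 7.15   # cost to process one 1080p image
COST_RATIO = 0.141       # proposed cost as a fraction of the compared network

# Assumption: "1080p" is taken to mean a 1920x1080 frame.
WIDTH, HEIGHT = 1920, 1080


def implied_baseline_gflops(proposed_gflops, ratio):
    """Cost of the compared network implied by the quoted ratio."""
    return proposed_gflops / ratio


def flops_per_pixel(gflops, width, height):
    """Average number of floating-point operations spent per pixel."""
    return gflops * 1e9 / (width * height)


baseline = implied_baseline_gflops(PROPOSED_GFLOPS, COST_RATIO)
per_pixel = flops_per_pixel(PROPOSED_GFLOPS, WIDTH, HEIGHT)
print(f"implied baseline cost: about {baseline:.1f} GFLOPs")
print(f"proposed model: about {per_pixel:.0f} FLOPs per pixel")
```

Under these assumptions the compared network would cost roughly 50.7 GFLOPs per image, and the proposed model spends on the order of 3,400 FLOPs per pixel.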
AR Mar 31
HLC: A High-Quality Lightweight Mezzanine Codec Featuring High-Throughput Palette
Chenlong He, Leilei Huang, Wei Li et al.
Existing mezzanine image codecs lack specialized screen content coding tools and therefore struggle to maintain high image quality under bandwidth constraints, especially in areas with dense text. Although distribution codecs offer advanced screen content compression techniques, their high computational complexity makes them impractical for mezzanine coding. To address this shortfall, we introduce the High-quality Lightweight Codec (HLC), a solution centered on enabling a practical, high-throughput palette for mezzanine coding. The core innovation is a novel data-dependency-free palette that eliminates the throughput bottlenecks. To ensure its effectiveness across all content, a co-designed rate-distortion optimization module arbitrates between the palette and traditional prediction modes, while a data reuse strategy between rate estimation and entropy coding minimizes the overall hardware resources required for the system. Experimental results show that, compared with a 4K@120fps JPEG-XS encoder, HLC achieves the same throughput while using only half the LUT resources and delivers BD-PSNR improvements of 3.461 dB, 3.299 dB, and 5.312 dB on gaming, natural, and text content datasets, respectively.
RO Oct 21, 2020
Bidirectional Microrocker Bots Controlled via Neutral Position Offset
Tony Wang, DeaGyu Kim, Yifan Shi et al.
Recent advancements in nanoscale 3D printing and microfabrication techniques have reinvigorated research on microrobots. However, precise motion control of microrobots in biological environments using compact actuation setups remains challenging to date. This work presents a novel control mechanism and contact design that enable bidirectional steering by biasing the neutral position of the microrobot. Equipped with rockers to contact the substrate, the microrobot, hence the name microrocker bot, is capable of well-controlled forward and backward movement on flat and non-flat biological surfaces. The 100 µm by 113 µm by 36 µm robots were 3D printed via two-photon lithography and subsequently coated with nickel thin films. Under a relatively small static magnetic field, the microrocker bot tilts either forward or backward to align the thin film magnetization direction with the magnetic field lines. When combined with an oscillating magnetic field, the robot undergoes stick-slip motion in the predisposed direction, dictated by the neutral position tilt. The microrocker bots are further equipped with sharp mechanical tips that can be selectively engaged. When the frequency and offset of the actuation sawtooth waveform are optimized, the robot travels at up to 100 µm/s (1 body length per second) forward and backward, showing very linear trajectories. Finally, to prove the functionality of the microrocker bots in direct contact with biological surfaces, we demonstrate the robot's ability to traverse forward and backward on the surface of a Dracaena fragrans leaf, and to upend and engage on its mechanical tip.
SY Feb 24, 2020
On the Forward and Backward Motion of Milli-Bristle-Bots
DeaGyu Kim, Zhijian Hao, Ali Reza Mohazab et al.
This work presents the theoretical analysis and experimental observations of bidirectional motion of a millimeter-scale bristle robot (milli-bristle-bot) with an on-board piezoelectric actuator. First, the theory of the motion, based on the dry-friction model, is developed, and the frequency regions of the forward and backward motion, along with the resonant frequencies of the system, are predicted. Second, milli-bristle-bots with two different bristle tilt angles are fabricated, and their bidirectional motions are experimentally investigated. The dependency of the robot speed on the actuation frequency is studied, which reveals two distinct frequency regions for the forward and backward motions that match our theoretical predictions well. Furthermore, the dependencies of the resonance frequency and robot speed on the bristle tilt angle are experimentally studied and tied to the theoretical model. This work marks the first demonstration of bidirectional motion at the millimeter scale, achieved for bristle-bots with a single on-board actuator.