Timothy B. Terriberry

10 papers

273 citations

Novelty 44%

AI Score 23

Ranked #179,702 of 201,326 authors (top 89%)

Ranked #150 in MM (top 58%)

10 Papers

MM | Oct 8, 2016

Perceptually-Driven Video Coding with the Daala Video Codec

Yushin Cho, Thomas J. Daede, Nathan E. Egge et al.

The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media.

MM | Aug 5, 2016

Daala: Building A Next-Generation Video Codec From Unconventional Technology

Jean-Marc Valin, Timothy B. Terriberry, Nathan E. Egge et al.

Daala is a new royalty-free video codec that attempts to compete with state-of-the-art royalty-bearing codecs. To do so, it must achieve good compression while avoiding all of their patented techniques. We use technology that is as different as possible from traditional approaches to achieve this. This paper describes the technology behind Daala and discusses where it fits in the newly created AV1 codec from the Alliance for Open Media. We show that Daala is approaching the performance level of more mature, state-of-the-art video codecs and can contribute to improving AV1.

MM | May 16, 2016

Daala: A Perceptually-Driven Still Picture Codec

Jean-Marc Valin, Nathan E. Egge, Thomas Daede et al.

Daala is a new royalty-free video codec based on perceptually-driven coding techniques. We explore using its keyframe format for still picture coding and show how it has improved over the past year. We believe the technology used in Daala could be the basis of an excellent, royalty-free image format.

MM | Mar 10, 2016

Daala: A Perceptually-Driven Next Generation Video Codec

Thomas J. Daede, Nathan E. Egge, Jean-Marc Valin et al.

The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not, and what we've learned from them. The result is a codec which compares favorably with HEVC on still images, and is on a path to do so for video as well.

SD | Mar 6, 2016

Low-Complexity Iterative Sinusoidal Parameter Estimation

Jean-Marc Valin, Daniel V. Smith, Christopher Montgomery et al.

Sinusoidal parameter estimation is a computationally intensive task, which can pose problems for real-time implementations. In this paper, we propose a low-complexity iterative method for estimating sinusoidal parameters that is based on the linearisation of the model around an initial frequency estimate. We show that for N sinusoids in a frame of length L, the proposed method has a complexity of O(LN), which is significantly less than the matching pursuits method. Furthermore, the proposed method is shown to be more accurate than the matching pursuits and time-frequency reassignment methods in our experiments.
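The linearisation mentioned in this abstract can be illustrated with a small Python sketch. It is not the authors' O(LN) iterative solver: it handles a single sinusoid and solves the 4x4 normal equations directly. The function names `solve_linear` and `refine_sinusoid` are hypothetical. The assumed model is x(n) = A cos(w n + phi); expanding around an initial guess w0 to first order in the frequency error makes the model linear in four coefficients, from which amplitude, phase, and a frequency correction can be read off.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def refine_sinusoid(signal, omega0):
    """One linearised least-squares step (illustrative, single sinusoid).

    First-order model around omega0:
        x(n) ~ a*cos(w0 n) + b*sin(w0 n) + c*n*sin(w0 n) + d*n*cos(w0 n)
    with a = A cos(phi), b = -A sin(phi), c = -delta*a, d = delta*b,
    where delta is the frequency error of the initial guess.
    Returns (amplitude, phase, refined frequency).
    """
    basis = []
    for n in range(len(signal)):
        c, s = math.cos(omega0 * n), math.sin(omega0 * n)
        basis.append([c, s, n * s, n * c])
    # Normal equations of the linear least-squares fit.
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(4)]
            for i in range(4)]
    rhs = [sum(row[i] * v for row, v in zip(basis, signal)) for i in range(4)]
    a, b, c, d = solve_linear(gram, rhs)
    # Recover the frequency correction from the two first-order coefficients.
    delta = (d * b - c * a) / (a * a + b * b)
    return math.hypot(a, b), math.atan2(-b, a), omega0 + delta
```

Calling `refine_sinusoid` again with the refined frequency as the new starting point gives the iterative flavour the abstract describes, since each step is only accurate to first order in the remaining frequency error.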

SD | Feb 17, 2016

A High-Quality Speech and Audio Codec With Less Than 10 ms Delay

Jean-Marc Valin, Timothy B. Terriberry, Christopher Montgomery et al.

With increasing quality requirements for multimedia communications, audio codecs must maintain both high quality and low delay. Typically, audio codecs offer either low delay or high quality, but rarely both. We propose a codec that simultaneously addresses both these requirements, with a delay of only 8.7 ms at 44.1 kHz. It uses gain-shape algebraic vector quantisation in the frequency domain with time-domain pitch prediction. We demonstrate that the proposed codec operating at 48 kbit/s and 64 kbit/s out-performs both G.722.1C and MP3 and has quality comparable to AAC-LD, despite having less than one fourth of the algorithmic delay of these codecs.

SD | Feb 17, 2016

An Iterative Linearised Solution to the Sinusoidal Parameter Estimation Problem

Jean-Marc Valin, Daniel V. Smith, Christopher Montgomery et al.

Signal processing applications use sinusoidal modelling for speech synthesis, speech coding, and audio coding. Estimation of the model parameters involves non-linear optimisation methods, which can be very costly for real-time applications. We propose a low-complexity iterative method that starts from initial frequency estimates and converges rapidly. We show that for N sinusoids in a frame of length L, the proposed method has a complexity of O(LN), which is significantly less than the matching pursuits method. Furthermore, the proposed method is shown to be more accurate than the matching pursuits and time-frequency reassignment methods in our experiments.

MM | Feb 17, 2016

A Full-Bandwidth Audio Codec With Low Complexity And Very Low Delay

Jean-Marc Valin, Timothy B. Terriberry, Gregory Maxwell

We propose an audio codec that addresses the low-delay requirements of some applications such as network music performance. The codec is based on the modified discrete cosine transform (MDCT) with very short frames and uses gain-shape quantization to preserve the spectral envelope. The short frame sizes required for low delay typically hinder the performance of transform codecs. However, at 96 kbit/s and with only 4 ms algorithmic delay, the proposed codec out-performs the ULD codec operating at the same rate. The total complexity of the codec is small, at only 17 WMOPS for real-time operation at 48 kHz.
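For readers unfamiliar with the MDCT named in this abstract, here is a minimal Python sketch of the textbook transform with a sine window and 50% overlap-add. It is a direct O(N^2) evaluation for illustration only, not the codec's implementation. The function names and the toy frame size used with them are assumptions, and the abstract does not state the actual frame size or window.

```python
import math

def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1 (Princen-Bradley)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block, window):
    """Windowed MDCT: 2*half input samples -> half coefficients."""
    half = len(block) // 2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
            for n in range(2 * half))
        for k in range(half)
    ]

def imdct(coeffs, window):
    """Windowed inverse MDCT: half coefficients -> 2*half aliased samples."""
    half = len(coeffs)
    return [
        window[n] * 2.0 / half
        * sum(coeffs[k]
              * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
              for k in range(half))
        for n in range(2 * half)
    ]

def roundtrip(signal, half):
    """Transform overlapping frames (hop = half) and overlap-add the results.

    Samples covered by two frames are reconstructed exactly, because the
    time-domain aliasing of adjacent frames cancels. The first and last
    `half` samples are covered by one frame only and stay aliased.
    """
    window = sine_window(2 * half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * half + 1, half):
        frame = signal[start:start + 2 * half]
        rebuilt = imdct(mdct(frame, window), window)
        for n, value in enumerate(rebuilt):
            out[start + n] += value
    return out
```

Because each frame overlaps its neighbours by half its length, the hop size `half` sets how many new samples are consumed per transform, which is why short frames matter for low delay.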

MM | Feb 16, 2016

Perceptual Vector Quantization For Video Coding

Jean-Marc Valin, Timothy B. Terriberry

This paper applies energy conservation principles to the Daala video codec using gain-shape vector quantization to encode a vector of AC coefficients as a length (gain) and direction (shape). The technique originates from the CELT mode of the Opus audio codec, where it is used to conserve the spectral envelope of an audio signal. Conserving energy in video has the potential to preserve textures rather than low-passing them. Explicitly quantizing a gain allows a simple contrast masking model with no signaling cost. Vector quantizing the shape keeps the number of degrees of freedom the same as scalar quantization, avoiding redundancy in the representation. We demonstrate how to predict the vector by transforming the space it is encoded in, rather than subtracting off the predictor, which would make energy conservation impossible. We also derive an encoding of the vector-quantized codewords that takes advantage of their non-uniform distribution. We show that the resulting technique outperforms scalar quantization by an average of 0.90 dB on still images, equivalent to a 24.8% reduction in bitrate at equal quality, while for videos, the improvement averages 0.83 dB, equivalent to a 13.7% reduction in bitrate.
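The gain-shape split described in this abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Daala's encoder: it assumes a pyramid-style shape codebook (integer vectors whose absolute values sum to K, as used in CELT), a uniform scalar gain quantizer, and a simple greedy search. It omits the prediction and entropy coding the paper covers. The names `pvq_search`, `gain_shape_quantize`, `gain_shape_reconstruct`, and the parameters `k` and `gain_step` are hypothetical.

```python
import math

def pvq_search(x, k):
    """Greedy search for an integer vector y with sum(|y_i|) == k aligned with x."""
    n = len(x)
    abs_x = [abs(v) for v in x]
    l1 = sum(abs_x)
    if l1 == 0.0:
        # No direction to match: place all pulses in the first position.
        return [k] + [0] * (n - 1)
    # Initial guess: floor of the scaled magnitudes, which never exceeds k pulses.
    y = [int(k * a / l1) for a in abs_x]
    corr = sum(a * p for a, p in zip(abs_x, y))
    energy = sum(p * p for p in y)
    for _ in range(k - sum(y)):
        best_i, best_score = 0, -1.0
        for i in range(n):
            # A pulse at i adds abs_x[i] to the correlation and 2*y[i]+1 to the energy.
            c = corr + abs_x[i]
            e = energy + 2 * y[i] + 1
            score = c * c / e  # squared cosine similarity, up to a constant
            if score > best_score:
                best_i, best_score = i, score
        corr += abs_x[best_i]
        energy += 2 * y[best_i] + 1
        y[best_i] += 1
    # Restore the signs of the input.
    return [p if v >= 0 else -p for p, v in zip(y, x)]

def gain_shape_quantize(x, k, gain_step):
    """Return (gain index, integer shape vector) for coefficient vector x (k >= 1)."""
    gain = math.sqrt(sum(v * v for v in x))
    return int(round(gain / gain_step)), pvq_search(x, k)

def gain_shape_reconstruct(gain_index, y, gain_step):
    """Scale the unit-norm shape by the dequantized gain."""
    norm = math.sqrt(sum(p * p for p in y))
    return [gain_index * gain_step * p / norm for p in y]
```

The reconstructed vector always has exactly the dequantized gain as its norm, whatever the shape search picks. That is the energy-conservation property the abstract refers to: coarse shape quantization changes the direction of the vector but not its energy.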

MM | Feb 15, 2016

High-Quality, Low-Delay Music Coding in the Opus Codec

Jean-Marc Valin, Gregory Maxwell, Timothy B. Terriberry et al.

The IETF recently standardized the Opus codec as RFC 6716. Opus targets a wide range of real-time Internet applications by combining a linear prediction coder with a transform coder. We describe the transform coder, with particular attention to the psychoacoustic knowledge built into the format. The result out-performs existing audio codecs that do not operate under real-time constraints.