Fatih Kamisli

h-index: 9

Papers: 7

Citations: 50

Novelty: 44%

AI Score: 25

Ranked #169,058 of 201,326 authors (top 84%); #118 in MM (top 45%)

7 Papers

IV | Feb 29, 2024 | Code

Variable-Rate Learned Image Compression with Multi-Objective Optimization and Quantization-Reconstruction Offsets

Fatih Kamisli, Fabien Racape, Hyomin Choi

Achieving successful variable bitrate compression with computationally simple algorithms from a single end-to-end learned image or video compression model remains a challenge. Many approaches have been proposed, including conditional auto-encoders, channel-adaptive gains for the latent tensor, and uniform quantization of all elements of the latent tensor. This paper follows the traditional approach of varying a single quantization step size to perform uniform quantization of all latent tensor elements. However, three modifications are proposed to improve the variable-rate compression performance. First, multi-objective optimization is used for (post-)training. Second, a quantization-reconstruction offset is introduced into the quantization operation. Third, variable-rate quantization is also used for the hyper latent. All these modifications can be made to a pre-trained single-rate compression model by performing post-training. The algorithms are implemented in three well-known image compression models, and the achieved variable-rate compression results indicate negligible or minimal compression performance loss compared to training multiple models. (Code will be shared at https://github.com/InterDigitalInc/CompressAI)
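
The single-step-size idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names `quantize` and `dequantize` are hypothetical, and the offset is modeled as a fixed fraction of the step size that pulls nonzero reconstructions toward zero, which may differ from what the paper actually does.

```python
import math

def quantize(y, step):
    """Uniform quantization of one latent value to an integer index.

    A larger step gives coarser indices (lower rate); a smaller step
    gives finer indices (higher rate).
    """
    sign = -1 if y < 0 else 1
    return sign * int(math.floor(abs(y) / step + 0.5))

def dequantize(q, step, offset=0.0):
    """Reconstruct a value from index q.

    ASSUMPTION: the quantization-reconstruction offset is modeled as a
    shift of `offset * step` toward zero for nonzero indices. With
    offset=0.0 this is the plain mid-point reconstruction q * step.
    """
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    return sign * (abs(q) - offset) * step

latents = [0.2, 1.3, -2.6, 4.9]
fine = [quantize(v, 1.0) for v in latents]    # [0, 1, -3, 5]
coarse = [quantize(v, 2.0) for v in latents]  # [0, 1, -1, 2]
recon = [dequantize(q, 2.0, offset=0.1) for q in coarse]
```

Changing only `step` moves the same model along the rate-distortion curve, which is the property the abstract relies on; in a learned codec the offset would presumably be a trained parameter rather than a constant.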

IV | Mar 22, 2022

End-to-End Learned Block-Based Image Compression with Block-Level Masked Convolutions and Asymptotic Closed Loop Training

Fatih Kamisli

Learned image compression research has achieved state-of-the-art compression performance with auto-encoder based neural network architectures, where the image is mapped via convolutional neural networks (CNNs) into a latent representation that is quantized and processed again with CNNs to obtain the reconstructed image. CNNs operate on entire input images. On the other hand, traditional state-of-the-art image and video compression methods process images block by block for various reasons. Very recently, work on learned image compression with block-based approaches has also appeared, which uses the auto-encoder architecture on large blocks of the input image and introduces additional neural networks that perform intra/spatial prediction and deblocking/post-processing functions. This paper explores an alternative learned block-based image compression approach in which neither an explicit intra prediction neural network nor an explicit deblocking neural network is used. A single auto-encoder neural network with block-level masked convolutions is used and the block size is much smaller (8x8). By using block-level masked convolutions, each block is processed using the reconstructed neighboring left and upper blocks at both the encoder and the decoder. Hence, the mutual information between adjacent blocks is exploited during compression and each block is reconstructed using neighboring blocks, removing the need for explicit intra prediction and deblocking neural networks. Since the explored system is a closed-loop system, a special optimization procedure, the asymptotic closed loop design, is used with standard stochastic gradient descent based training. The experimental results indicate competitive image compression performance.

MM | Aug 23, 2017

Lossless Image and Intra-frame Compression with Integer-to-Integer DST

Fatih Kamisli

Video coding standards are primarily designed for efficient lossy compression, but it is also desirable to support efficient lossless compression within video coding standards using small modifications to the lossy coding architecture. A simple approach is to skip transform and quantization, and simply entropy code the prediction residual. However, this approach compresses inefficiently. A more efficient and popular approach is to skip transform and quantization but also process the residual block with DPCM, along the horizontal or vertical direction, prior to entropy coding. This paper explores an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms. I2i transforms can map integer pixels to integer transform coefficients without increasing the dynamic range and can be used for lossless compression. We focus on lossless intra coding and develop novel i2i approximations of the odd type-3 DST (ODST-3). Experimental results with the HEVC reference software show that the developed i2i approximations of the ODST-3 improve lossless intra-frame compression efficiency with respect to HEVC version 2, which uses the popular DPCM method, by an average of 2.7% without a significant effect on computational complexity.
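
To show what "integer-to-integer" means in practice, here is a small sketch of one common building block: a plane rotation factored into three lifting steps, each rounded to an integer. This is not the paper's ODST-3 approximation, and the abstract does not say which factorization the paper uses; the function names and the choice of rotation angle are assumptions for illustration only.

```python
import math

def _rnd(v):
    # Round to nearest integer (half up); any fixed rounding rule works,
    # as long as forward and inverse use the same one.
    return math.floor(v + 0.5)

def lift_forward(x, y, theta):
    """Integer approximation of a rotation by theta using three lifting steps."""
    p = (math.cos(theta) - 1.0) / math.sin(theta)
    u = math.sin(theta)
    x = x + _rnd(p * y)
    y = y + _rnd(u * x)
    x = x + _rnd(p * y)
    return x, y

def lift_inverse(x, y, theta):
    """Exact inverse: undo the lifting steps in reverse order."""
    p = (math.cos(theta) - 1.0) / math.sin(theta)
    u = math.sin(theta)
    x = x - _rnd(p * y)
    y = y - _rnd(u * x)
    x = x - _rnd(p * y)
    return x, y

# Round trip on an integer pair: the input is recovered exactly.
a, b = lift_forward(7, -3, math.pi / 4)
assert lift_inverse(a, b, math.pi / 4) == (7, -3)
```

Each lifting step adds a rounded function of one variable to the other, so it can be undone exactly by subtracting the same rounded quantity. Because the cascade approximates an orthonormal rotation, the outputs stay in roughly the same range as the inputs, which is the property that makes such transforms usable for lossless coding.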

MM | May 17, 2016

Lossless Intra Coding in HEVC with Integer-to-Integer DST

Fatih Kamisli

It is desirable to support efficient lossless coding within video coding standards, which are primarily designed for lossy coding, with as little modification as possible. A simple approach is to skip transform and quantization, and directly entropy code the prediction residual, but this is inefficient for compression. A more efficient and popular approach is to process the residual block with DPCM prior to entropy coding. This paper explores an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms. I2i transforms map integers to integers; however, unlike the integer transforms used in HEVC for lossy coding, they do not increase the dynamic range at the output and can be used in lossless coding. We use both an i2i DCT from the literature and a novel i2i approximation of the DST. Experiments with the HEVC reference software show competitive results.
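
The DPCM baseline mentioned in this abstract can be sketched as follows. The sketch assumes a horizontal scan with a zero predictor for the first column; the function names are hypothetical and the boundary handling in an actual HEVC codec may differ.

```python
def dpcm_horizontal(block):
    """Replace each residual by its difference from the left neighbor in its row.

    ASSUMPTION: the first sample of each row is predicted from 0.
    """
    return [
        [row[j] - (row[j - 1] if j > 0 else 0) for j in range(len(row))]
        for row in block
    ]

def inverse_dpcm_horizontal(diff):
    """Recover the residual block by accumulating differences along each row."""
    out = []
    for row in diff:
        acc, rec = 0, []
        for d in row:
            acc += d
            rec.append(acc)
        out.append(rec)
    return out

residual = [[3, 4, 4, 5], [2, 2, 3, 3]]
coded = dpcm_horizontal(residual)          # [[3, 1, 0, 1], [2, 0, 1, 0]]
assert inverse_dpcm_horizontal(coded) == residual
```

The differences are smaller in magnitude than the original residuals when neighboring samples are correlated, which is why DPCM helps the entropy coder; the i2i transforms explored in the paper target the same correlation with a block transform instead.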

MM | May 17, 2016

Lossless Compression in HEVC with Integer-to-Integer Transforms

Fatih Kamisli

Many approaches have been proposed to support lossless coding within video coding standards that are primarily designed for lossy coding. The simplest approach, used in HEVC version 1, is to skip transform and quantization and directly entropy code the prediction residual. However, this simple approach is inefficient for compression. More efficient approaches include processing the residual with DPCM prior to entropy coding. This paper explores an alternative approach based on processing the residual with integer-to-integer (i2i) transforms. I2i transforms map integers to integers; however, unlike the integer transforms used in HEVC for lossy coding, they do not increase the dynamic range at the output and can be used in lossless coding. Experiments with the HEVC reference software show competitive results.

MM | Apr 24, 2016

Lossless Intra Coding in HEVC with Adaptive 3-tap Filters

Saeed Ranjbar Alvar, Fatih Kamisli

In pixel-by-pixel spatial prediction methods for lossless intra coding, the prediction is obtained as a weighted sum of neighboring pixels. The prediction approach proposed in this paper uses a weighted sum of three neighboring pixels according to a two-dimensional correlation model. The weights are obtained through a three-stage optimization procedure. The first two stages are offline procedures in which the prediction weights are computed from training sequences. The third stage is an online optimization procedure in which the offline weights are further fine-tuned and adapted to each encoded block during encoding using a rate-distortion optimized method; the modification made in this third stage is transmitted to the decoder as side information. The simulation results show average bit rate reductions of 12.02% and 3.28% over the default lossless intra coding in HEVC and the well-known Sample-based Angular Prediction (SAP) method, respectively.

MM | Jan 18, 2016

Lossless Intra Coding in HEVC with 3-tap Filters

Saeed R. Alvar, Fatih Kamisli

This paper presents a pixel-by-pixel spatial prediction method for lossless intra coding within High Efficiency Video Coding (HEVC). A well-known previous pixel-by-pixel spatial prediction method uses only two neighboring pixels for prediction, based on the angular projection idea borrowed from block-based intra prediction in lossy coding. This paper explores a method which uses three neighboring pixels for prediction according to a two-dimensional correlation model; the neighboring pixels used and the prediction weights depend on the intra mode. To find the best prediction weights for each intra mode, a two-stage offline optimization algorithm is used, and a number of implementation aspects are discussed to simplify the proposed prediction method. The proposed method is implemented in the HEVC reference software, and experimental results show that the explored 3-tap filtering method can achieve an average 11.34% bitrate reduction over the default lossless intra coding in HEVC. The proposed method also decreases average decoding time by 12.7% while it increases average encoding time by 9.7%.
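
A pixel-by-pixel 3-tap predictor of the kind described here can be sketched as below. The abstract does not list the neighbors or weights per intra mode, so this sketch makes several assumptions: the three neighbors are taken to be left, upper, and upper-left; the weights `(0.75, 0.75, -0.5)` are illustrative placeholders, not the paper's optimized values; and missing neighbors at the block boundary are treated as 0, whereas a real codec would use pixels from neighboring blocks.

```python
import math

# Hypothetical weights for (left, up, up-left); not taken from the paper.
DEFAULT_WEIGHTS = (0.75, 0.75, -0.5)

def predict_3tap(img, r, c, weights=DEFAULT_WEIGHTS):
    """Predict pixel (r, c) as a rounded weighted sum of three causal neighbors."""
    left = img[r][c - 1] if c > 0 else 0
    up = img[r - 1][c] if r > 0 else 0
    up_left = img[r - 1][c - 1] if r > 0 and c > 0 else 0
    wl, wu, wd = weights
    # Rounding keeps the prediction an integer, so residuals are integers too.
    return math.floor(wl * left + wu * up + wd * up_left + 0.5)

def residuals(img, weights=DEFAULT_WEIGHTS):
    """Encoder side: residual = original pixel minus prediction from original neighbors."""
    return [
        [img[r][c] - predict_3tap(img, r, c, weights) for c in range(len(img[0]))]
        for r in range(len(img))
    ]

def reconstruct(res, weights=DEFAULT_WEIGHTS):
    """Decoder side: rebuild pixels in raster order from residuals."""
    h, w = len(res), len(res[0])
    img = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            img[r][c] = res[r][c] + predict_3tap(img, r, c, weights)
    return img

block = [[10, 12, 13], [11, 12, 14], [11, 13, 15]]
assert reconstruct(residuals(block)) == block
```

Because coding is lossless, the decoder's reconstructed neighbors equal the encoder's original neighbors, so both sides compute identical predictions and the round trip is exact. Prediction must run pixel by pixel in raster order at the decoder, since each prediction depends on pixels reconstructed just before it.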