Generalized Rectifier Wavelet Covariance Models For Texture Synthesis
This work addresses texture synthesis for computer vision applications, offering an incremental improvement that combines the interpretability of wavelet representations with visual quality close to that of neural-network-based models.
The authors tackle texture synthesis by proposing a family of statistics based on non-linear wavelet representations. These statistics improve visual quality over previous wavelet-based models and reach quality similar to state-of-the-art models on gray-scale and color textures.
State-of-the-art maximum entropy models for texture synthesis are built from statistics that rely on image representations defined by convolutional neural networks (CNNs). Such representations capture rich structures in texture images, outperforming wavelet-based representations in this regard. Unlike neural networks, however, wavelets offer meaningful representations, as they are known to detect structures at multiple scales (e.g. edges) in images. In this work, we propose a family of statistics built upon non-linear wavelet-based representations, which can be viewed as a particular instance of a one-layer CNN using a generalized rectifier non-linearity. These statistics significantly improve the visual quality of previous classical wavelet-based models, and allow one to produce syntheses of similar quality to state-of-the-art models, on both gray-scale and color textures.
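To make the "one-layer CNN with a generalized rectifier" view concrete, here is a minimal NumPy sketch of how such covariance statistics could be computed for a square gray-scale image. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the Morlet-like filter and its parameters, the helper names (`morlet_like_filter`, `rectified_wavelet_channels`, `covariance_statistics`), the choice of scales, orientations and phases, and in particular the reading of the generalized rectifier as a ReLU applied to the real part of a phase-shifted complex wavelet coefficient.

```python
import numpy as np


def morlet_like_filter(size, scale, theta, xi=3 * np.pi / 4):
    """Hypothetical oriented complex band-pass filter (zero-mean Gabor)."""
    half = size // 2
    y, x = np.mgrid[-half:size - half, -half:size - half].astype(float)
    # Coordinate along the filter orientation, normalized by the scale.
    u = (x * np.cos(theta) + y * np.sin(theta)) / scale
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * (0.8 * scale) ** 2))
    wave = np.exp(1j * xi * u)
    # Subtract a multiple of the envelope so the filter sums to zero.
    beta = (envelope * wave).sum() / envelope.sum()
    psi = envelope * (wave - beta)
    return psi / np.abs(psi).sum()


def rectified_wavelet_channels(image, scales=(1, 2), n_orient=4,
                               phases=(0.0, np.pi / 2, np.pi, 3 * np.pi / 2)):
    """One convolutional layer: wavelet filtering, then a phase-shifted ReLU.

    Assumes a square 2-D array. Convolutions are circular (computed by FFT).
    """
    n = image.shape[0]
    image_hat = np.fft.fft2(image - image.mean())
    channels = []
    for j in scales:
        for k in range(n_orient):
            psi = morlet_like_filter(n, 2.0 ** j, np.pi * k / n_orient)
            # ifftshift moves the filter center to index (0, 0).
            psi_hat = np.fft.fft2(np.fft.ifftshift(psi))
            z = np.fft.ifft2(image_hat * psi_hat)  # complex coefficients
            for alpha in phases:
                # Assumed rectifier: ReLU(Re(exp(i * alpha) * z)).
                rotated = np.real(np.exp(1j * alpha) * z)
                channels.append(np.maximum(rotated, 0.0))
    return np.stack(channels)


def covariance_statistics(image, **kwargs):
    """Spatially averaged covariance between all rectified channels."""
    feats = rectified_wavelet_channels(image, **kwargs)
    flat = feats.reshape(feats.shape[0], -1)
    flat = flat - flat.mean(axis=1, keepdims=True)
    return flat @ flat.T / flat.shape[1]
```

With the default settings above (2 scales, 4 orientations, 4 phases), `covariance_statistics(image)` returns a symmetric 32 x 32 matrix. In a maximum entropy model of the kind described, a synthesis would be obtained by adjusting a noise image until its statistics match those of the reference texture; that optimization step is not shown here.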