R. J. Cintra

MM
h-index28
46papers
1,009citations
Novelty45%
AI Score30

46 Papers

MEJun 3, 2022
The Gamma Generalized Normal Distribution: A Descriptor of SAR Imagery

G. M. Cordeiro, R. J. Cintra, L. C. Rêgo et al.

We propose a new four-parameter distribution for modeling synthetic aperture radar (SAR) imagery named the gamma generalized normal (GGN) by combining the gamma and generalized normal distributions. A mathematical characterization of the new distribution is provided by identifying the limit behavior and by calculating the density and moment expansions. The GGN model performance is evaluated on both synthetic and actual data and, for that, maximum likelihood estimation and random number generation are discussed. The proposed distribution is compared with the beta generalized normal distribution (BGN), which has already shown to appropriately represent SAR imagery. The performance of these two distributions are measured by means of statistics which provide evidence that the GGN can outperform the BGN distribution in some contexts.

LGJul 29, 2022
Low-complexity Approximate Convolutional Neural Networks

R. J. Cintra, S. Duffner, C. Garcia et al.

In this paper, we present an approach for minimizing the computational complexity of trained Convolutional Neural Networks (ConvNet). The idea is to approximate all elements of a given ConvNet and replace the original convolutional filters and parameters (pooling and bias coefficients; and activation function) with efficient approximations capable of extreme reductions in computational complexity. Low-complexity convolution filters are obtained through a binary (zero-one) linear programming scheme based on the Frobenius norm over sets of dyadic rationals. The resulting matrices allow for multiplication-free computations requiring only addition and bit-shifting operations. Such low-complexity structures pave the way for low-power, efficient hardware designs. We applied our approach on three use cases of different complexity: (i) a "light" but efficient ConvNet for face detection (with around 1000 parameters); (ii) another one for hand-written digit classification (with more than 180000 parameters); and (iii) a significantly larger ConvNet: AlexNet with $\approx$1.2 million matrices. We evaluated the overall performance on the respective tasks for different levels of approximations. In all considered applications, very low-complexity approximations have been derived maintaining an almost equal classification performance.

MEJan 8, 2015
Multi-Beam RF Aperture Using Multiplierless FFT Approximation

D. Suarez, R. J. Cintra, F. M. Bayer et al.

Multiple independent radio frequency (RF) beams find applications in communications, radio astronomy, radar, and microwave imaging. An $N$-point FFT applied spatially across an array of receiver antennas provides $N$-independent RF beams at $\frac{N}{2}\log_2N$ multiplier complexity. Here, a low-complexity multiplierless approximation for the 8-point FFT is presented for RF beamforming, using only 26 additions. The algorithm provides eight beams that closely resemble the antenna array patterns of the traditional FFT-based beamformer albeit without using multipliers. The proposed FFT-like algorithm is useful for low-power RF multi-beam receivers; being synthesized in 45 nm CMOS technology at 1.1 V supply, and verified on-chip using a Xilinx Virtex-6 Lx240T FPGA device. The CMOS simulation and FPGA implementation indicate bandwidths of 588 MHz and 369 MHz, respectively, for each of the independent receive-mode RF beams.

ITMay 27, 2016
An Orthogonal 16-point Approximate DCT for Image and Video Compression

T. L. T. da Silveira, F. M. Bayer, R. J. Cintra et al.

A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed requiring only 60~additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image and video compression. Digital VLSI hardware implementations were also proposed being physically realized in FPGA technology and implemented in 45 nm up to synthesis and place-route levels. Additionally, the proposed method was embedded into a high efficiency video coding (HEVC) reference software for actual proof-of-concept. Obtained results show negligible video degradation when compared to Chen DCT algorithm in HEVC.

NAJul 21, 2018
Fast Matrix Inversion and Determinant Computation for Polarimetric Synthetic Aperture Radar

D. F. G. Coelho, R. J. Cintra, A. C. Frery et al.

This paper introduces a fast algorithm for simultaneous inversion and determinant computation of small sized matrices in the context of fully Polarimetric Synthetic Aperture Radar (PolSAR) image processing and analysis. The proposed fast algorithm is based on the computation of the adjoint matrix and the symmetry of the input matrix. The algorithm is implemented in a general purpose graphical processing unit (GPGPU) and compared to the usual approach based on Cholesky factorization. The assessment with simulated observations and data from an actual PolSAR sensor show a speedup factor of about two when compared to the usual Cholesky factorization. Moreover, the expressions provided here can be implemented in any platform.

IVJun 20, 2023
Low-complexity Multidimensional DCT Approximations

V. A. Coutinho, R. J. Cintra, F. M. Bayer

In this paper, we introduce low-complexity multidimensional discrete cosine transform (DCT) approximations. Three dimensional DCT (3D DCT) approximations are formalized in terms of high-order tensor theory. The formulation is extended to higher dimensions with arbitrary lengths. Several multiplierless $8\times 8\times 8$ approximate methods are proposed and the computational complexity is discussed for the general multidimensional case. The proposed methods complexity cost was assessed, presenting considerably lower arithmetic operations when compared with the exact 3D DCT. The proposed approximations were embedded into 3D DCT-based video coding scheme and a modified quantization step was introduced. The simulation results showed that the approximate 3D DCT coding methods offer almost identical output visual quality when compared with exact 3D DCT scheme. The proposed 3D approximations were also employed as a tool for visual tracking. The approximate 3D DCT-based proposed system performs similarly to the original exact 3D DCT-based method. In general, the suggested methods showed competitive performance at a considerably lower computational cost.

NAFeb 4, 2015
The Arithmetic Cosine Transform: Exact and Approximate Algorithms

R. J. Cintra, V. S. Dimitrov

In this paper, we introduce a new class of transform method --- the arithmetic cosine transform (ACT). We provide the central mathematical properties of the ACT, necessary in designing efficient and accurate implementations of the new transform method. The key mathematical tools used in the paper come from analytic number theory, in particular the properties of the Riemann zeta function. Additionally, we demonstrate that an exact signal interpolation is achievable for any block-length. Approximate calculations were also considered. The numerical examples provided show the potential of the ACT for various digital signal processing applications.

IVJul 29, 2022
Low-Complexity Loeffler DCT Approximations for Image and Video Coding

D. F. G. Coelho, R. J. Cintra, F. M. Bayer et al.

This paper introduced a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations archived in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where computational complexity, proximity, and coding performance are considered. Efficient approximations and their scaled 16- and 32-point versions are embedded into image and video encoders, including a JPEG-like codec and H.264/AVC and H.265/HEVC standards. Results are compared to the unmodified standard codecs. Efficient approximations are mapped and implemented on a Xilinx VLX240T FPGA and evaluated for area, speed, and power consumption.

SPJul 24, 2022
DCT Approximations Based on Chen's Factorization

C. J. Tablada, T. L. T. da Silveira, R. J. Cintra et al.

In this paper, two 8-point multiplication-free DCT approximations based on the Chen's factorization are proposed and their fast algorithms are also derived. Both transformations are assessed in terms of computational cost, error energy, and coding gain. Experiments with a JPEG-like image compression scheme are performed and results are compared with competing methods. The proposed low-complexity transforms are scaled according to Jridi-Alfalou-Meher algorithm to effect 16- and 32-point approximations. The new sets of transformations are embedded into an HEVC reference software to provide a fully HEVC-compliant video coding scheme. We show that approximate transforms can outperform traditional transforms and state-of-the-art methods at a very low complexity cost.

NAFeb 1, 2015
A Factorization Scheme for Some Discrete Hartley Transform Matrices

H. M. de Oliveira, R. J. Cintra, R. M. Campello de Souza

Discrete transforms such as the discrete Fourier transform (DFT) and the discrete Hartley transform (DHT) are important tools in numerical analysis. The successful application of transform techniques relies on the existence of efficient fast transforms. In this paper some fast algorithms are derived. The theoretical lower bound on the multiplicative complexity for the DFT/DHT are achieved. The approach is based on the factorization of DHT matrices. Algorithms for short blocklengths such as $N \in \{3, 5, 6, 12, 24 \}$ are presented.

MEAug 7, 2022
Improved Point Estimation for the Rayleigh Regression Model

B. G. Palm, F. M. Bayer, R. J. Cintra

The Rayleigh regression model was recently proposed for modeling amplitude values of synthetic aperture radar (SAR) image pixels. However, inferences from such model are based on the maximum likelihood estimators, which can be biased for small signal lengths. The Rayleigh regression model for SAR images often takes into account small pixel windows, which may lead to inaccurate results. In this letter, we introduce bias-adjusted estimators tailored for the Rayleigh regression model based on: (i) the Cox and Snell's method; (ii) the Firth's scheme; and (iii) the parametric bootstrap method. We present numerical experiments considering synthetic and actual SAR data sets. The bias-adjusted estimators yield nearly unbiased estimates and accurate modeling results.

IVJun 5, 2022
Autoregressive Model for Multi-Pass SAR Change Detection Based on Image Stacks

B. G. Palm, D. I. Alves, V. T. Vu et al.

Change detection is an important synthetic aperture radar (SAR) application, usually used to detect changes on the ground scene measurements in different moments in time. Traditionally, change detection algorithm (CDA) is mainly designed for two synthetic aperture radar (SAR) images retrieved at different instants. However, more images can be used to improve the algorithms performance, witch emerges as a research topic on SAR change detection. Image stack information can be treated as a data series over time and can be modeled by autoregressive (AR) models. Thus, we present some initial findings on SAR change detection based on image stack considering AR models. Applying AR model for each pixel position in the image stack, we obtained an estimated image of the ground scene which can be used as a reference image for CDA. The experimental results reveal that ground scene estimates by the AR models is accurate and can be used for change detection applications.

APJul 29, 2022
Robust Rayleigh Regression Method for SAR Image Processing in Presence of Outliers

B. G. Palm, F. M. Bayer, R. Machado et al.

The presence of outliers (anomalous values) in synthetic aperture radar (SAR) data and the misspecification in statistical image models may result in inaccurate inferences. To avoid such issues, the Rayleigh regression model based on a robust estimation process is proposed as a more realistic approach to model this type of data. This paper aims at obtaining Rayleigh regression model parameter estimators robust to the presence of outliers. The proposed approach considered the weighted maximum likelihood method and was submitted to numerical experiments using simulated and measured SAR images. Monte Carlo simulations were employed for the numerical assessment of the proposed robust estimator performance in finite signal lengths, their sensitivity to outliers, and the breakdown point. For instance, the non-robust estimators show a relative bias value $65$-fold larger than the results provided by the robust approach in corrupted signals. In terms of sensitivity analysis and break down point, the robust scheme resulted in a reduction of about $96\%$ and $10\%$, respectively, in the mean absolute value of both measures, in compassion to the non-robust estimators. Moreover, two SAR data sets were used to compare the ground type and anomaly detection results of the proposed robust scheme with competing methods in the literature.

NAFeb 1, 2015
Fast Finite Field Hartley Transforms Based on Hadamard Decomposition

H. M. de Oliveira, R. G. F. Távora, R. J. Cintra et al.

A new transform over finite fields, the finite field Hartley transform (FFHT), was recently introduced and a number of promising applications on the design of efficient multiple access systems and multilevel spread spectrum sequences were proposed. The FFHT exhibits interesting symmetries, which are exploited to derive tailored fast transform algorithms. The proposed fast algorithms are based on successive decompositions of the FFHT by means of Hadamard-Walsh transforms (HWT). The introduced decompositions meet the lower bound on the multiplicative complexity for all the cases investigated. The complexity of the new algorithms is compared with that of traditional algorithms.

CAJan 17, 2018
A Kotel'nikov Representation for Wavelets

H. M. de Oliveira, R. J. Cintra, R. C. de Oliveira

This paper presents a wavelet representation using baseband signals, by exploiting Kotel'nikov results. Details of how to obtain the processes of envelope and phase at low frequency are shown. The archetypal interpretation of wavelets as an analysis with a filter bank of constant quality factor is revisited on these bases. It is shown that if the wavelet spectral support is limited into the band $[f_m,f_M]$, then an orthogonal analysis is guaranteed provided that $f_M \leq 3f_m$, a quite simple result, but that invokes some parallel with the Nyquist rate. Nevertheless, in cases of orthogonal wavelets whose spectrum does not verify this condition, it is shown how to construct an "equivalent" filter bank with no spectral overlapping.

MEJul 24, 2022
Prediction Intervals in the Beta Autoregressive Moving Average Model

B. G. Palm, F. M. Bayer, R. J. Cintra

In this paper, we propose five prediction intervals for the beta autoregressive moving average model. This model is suitable for modeling and forecasting variables that assume values in the interval $(0,1)$. Two of the proposed prediction intervals are based on approximations considering the normal distribution and the quantile function of the beta distribution. We also consider bootstrap-based prediction intervals, namely: (i) bootstrap prediction errors (BPE) interval; (ii) bias-corrected and acceleration (BCa) prediction interval; and (iii) percentile prediction interval based on the quantiles of the bootstrap-predicted values for two different bootstrapping schemes. The proposed prediction intervals were evaluated according to Monte Carlo simulations. The BCa prediction interval offered the best performance among the evaluated intervals, showing lower coverage rate distortion and small average length. We applied our methodology for predicting the water level of the Cantareira water supply system in São Paulo, Brazil.

IVOct 20, 2024
Extensions on low-complexity DCT approximations for larger blocklengths based on minimal angle similarity

A. P. Radünz, L. Portella, R. S. Oliveira et al.

The discrete cosine transform (DCT) is a central tool for image and video coding because it can be related to the Karhunen-Loève transform (KLT), which is the optimal transform in terms of retained transform coefficients and data decorrelation. In this paper, we introduce 16-, 32-, and 64-point low-complexity DCT approximations by minimizing individually the angle between the rows of the exact DCT matrix and the matrix induced by the approximate transforms. According to some classical figures of merit, the proposed transforms outperformed the approximations for the DCT already known in the literature. Fast algorithms were also developed for the low-complexity transforms, asserting a good balance between the performance and its computational cost. Practical applications in image encoding showed the relevance of the transforms in this context. In fact, the experiments showed that the proposed transforms had better results than the known approximations in the literature for the cases of 16, 32, and 64 blocklength.

SPOct 11, 2024
Fast Data-independent KLT Approximations Based on Integer Functions

A. P. Radünz, D. F. G. Coelho, F. M. Bayer et al.

The Karhunen-Loève transform (KLT) stands as a well-established discrete transform, demonstrating optimal characteristics in data decorrelation and dimensionality reduction. Its ability to condense energy compression into a select few main components has rendered it instrumental in various applications within image compression frameworks. However, computing the KLT depends on the covariance matrix of the input data, which makes it difficult to develop fast algorithms for its implementation. Approximations for the KLT, utilizing specific rounding functions, have been introduced to reduce its computational complexity. Therefore, our paper introduces a category of low-complexity, data-independent KLT approximations, employing a range of round-off functions. The design methodology of the approximate transform is defined for any block-length $N$, but emphasis is given to transforms of $N = 8$ due to its wide use in image and video compression. The proposed transforms perform well when compared to the exact KLT and approximations considering classical performance measures. For particular scenarios, our proposed transforms demonstrated superior performance when compared to KLT approximations documented in the literature. We also developed fast algorithms for the proposed transforms, further reducing the arithmetic cost associated with their implementation. Evaluation of field programmable gate array (FPGA) hardware implementation metrics was conducted. Practical applications in image encoding showed the relevance of the proposed transforms. In fact, we showed that one of the proposed transforms outperformed the exact KLT given certain compression ratios.

IVNov 28, 2021
Low-complexity Rounded KLT Approximation for Image Compression

A. P. Radünz, F. M. Bayer, R. J. Cintra

The Karhunen-Loève transform (KLT) is often used for data decorrelation and dimensionality reduction. Because its computation depends on the matrix of covariances of the input signal, the use of the KLT in real-time applications is severely constrained by the difficulty in developing fast algorithms to implement it. In this context, this paper proposes a new class of low-complexity transforms that are obtained through the application of the round function to the elements of the KLT matrix. The proposed transforms are evaluated considering figures of merit that measure the coding power and distance of the proposed approximations to the exact KLT and are also explored in image compression experiments. Fast algorithms are introduced for the proposed approximate transforms. It was shown that the proposed transforms perform well in image compression and require a low implementation cost.

IVNov 28, 2021
Data-independent Low-complexity KLT Approximations for Image and Video Coding

A. P. Radünz, T. L. T. da Silveira, F. M. Bayer et al.

The Karhunen-Loève transform (KLT) is often used for data decorrelation and dimensionality reduction. The KLT is able to optimally retain the signal energy in only few transform components, being mathematically suitable for image and video compression. However, in practice, because of its high computational cost and dependence on the input signal, its application in real-time scenarios is precluded. This work proposes low-computational cost approximations for the KLT. We focus on the blocklengths $N \in \{4, 8, 16, 32 \}$ because they are widely employed in image and video coding standards such as JPEG and high efficiency video coding (HEVC). Extensive computational experiments demonstrate the suitability of the proposed low-complexity transforms for image and video compression.

SPAug 4, 2021
Low-complexity Scaling Methods for DCT-II Approximations

D. F. G. Coelho, R. J. Cintra, A. Madanayake et al.

This paper introduces a collection of scaling methods for generating $2N$-point DCT-II approximations based on $N$-point low-complexity transformations. Such scaling is based on the Hou recursive matrix factorization of the exact $2N$-point DCT-II matrix. Encompassing the widely employed Jridi-Alfalou-Meher scaling method, the proposed techniques are shown to produce DCT-II approximations that outperform the transforms resulting from the JAM scaling method according to total error energy and mean squared error. Orthogonality conditions are derived and an extensive error analysis based on statistical simulation demonstrates the good performance of the introduced scaling methods. A hardware implementation is also provided demonstrating the competitiveness of the proposed methods when compared to the JAM scaling method.

SPAug 7, 2020
Rounded Hartley Transform: A Quasi-involution

R. J. Cintra, H. M. de Oliveira, C. O. Cintra

A new multiplication-free transform derived from DHT is introduced: the RHT. Investigations on the properties of the RHT led us to the concept of weak-inversion. Using new constructs, we show that RHT is not involutional like the DHT, but exhibits quasi-involutional property, a new definition derived from the periodicity of matrices. Thus instead of using the actual inverse transform, the RHT is viewed as an involutional transform, allowing the use of direct (multiplication-free) to evaluate the inverse. A fast algorithm to compute RHT is presented. This algorithm show embedded properties. We also extended RHT to the two-dimensional case. This permitted us to perform a preliminary analysis on the effects of RHT on images. Despite of some SNR loss, RHT can be very interesting for applications involving image monitoring associated to decision making, such as military applications or medical imaging.

SPJul 5, 2020
An Integer Approximation Method for Discrete Sinusoidal Transforms

R. J. Cintra

Approximate methods have been considered as a means to the evaluation of discrete transforms. In this work, we propose and analyze a class of integer transforms for the discrete Fourier, Hartley, and cosine transforms (DFT, DHT, and DCT), based on simple dyadic rational approximation methods. The introduced method is general, applicable to several block-lengths, whereas existing approaches are usually dedicated to specific transform sizes. The suggested approximate transforms enjoy low multiplicative complexity and the orthogonality property is achievable via matrix polar decomposition. We show that the obtained transforms are competitive with archived methods in literature. New 8-point square wave approximate transforms for the DFT, DHT, and DCT are also introduced as particular cases of the introduced methodology.

SPJun 19, 2020
A Multiparametric Class of Low-complexity Transforms for Image and Video Coding

D. R. Canterle, T. L. T. da Silveira, F. M. Bayer et al.

Discrete transforms play an important role in many signal processing applications, and low-complexity alternatives for classical transforms became popular in recent years. Particularly, the discrete cosine transform (DCT) has proven to be convenient for data compression, being employed in well-known image and video coding standards such as JPEG, H.264, and the recent high efficiency video coding (HEVC). In this paper, we introduce a new class of low-complexity 8-point DCT approximations based on a series of works published by Bouguezel, Ahmed and Swamy. Also, a multiparametric fast algorithm that encompasses both known and novel transforms is derived. We select the best-performing DCT approximations after solving a multicriteria optimization problem, and submit them to a scaling method for obtaining larger size transforms. We assess these DCT approximations in both JPEG-like image compression and video coding experiments. We show that the optimal DCT approximations present compelling results in terms of coding efficiency and image quality metrics, and require only few addition or bit-shifting operations, being suitable for low-complexity and low-power systems.

SPJun 14, 2020
Multidimensional Wavelets for Scalable Image Decomposition: Orbital Wavelets

H. M. de Oliveira, V. V. Vermehren, R. J. Cintra

Wavelets are closely related to the Schrödinger's wave functions and the interpretation of Born. Similarly to the appearance of atomic orbital, it is proposed to combine anti-symmetric wavelets into orbital wavelets. The proposed approach allows the increase of the dimension of wavelets through this process. New orbital 2D-wavelets are introduced for the decomposition of still images, showing that it is possible to perform an analysis simultaneous in two distinct scales. An example of such an image analysis is shown.

IVAug 8, 2018
Low-complexity 8-point DCT Approximation Based on Angle Similarity for Image and Video Coding

R. S. Oliveira, R. J. Cintra, F. M. Bayer et al.

The principal component analysis (PCA) is widely used for data decorrelation and dimensionality reduction. However, the use of PCA may be impractical in real-time applications, or in situations were energy and computing constraints are severe. In this context, the discrete cosine transform (DCT) becomes a low-cost alternative to data decorrelation. This paper presents a method to derive computationally efficient approximations to the DCT. The proposed method aims at the minimization of the angle between the rows of the exact DCT matrix and the rows of the approximated transformation matrix. The resulting transformations matrices are orthogonal and have extremely low arithmetic complexity. Considering popular performance measures, one of the proposed transformation matrices outperforms the best competitors in both matrix error and coding capabilities. Practical applications in image and video coding demonstrate the relevance of the proposed transformation. In fact, we show that the proposed approximate DCT can outperform the exact DCT for image encoding under certain compression ratios. The proposed transform and its direct competitors are also physically realized as digital prototype circuits using FPGA technology.

CRJan 25, 2018
A New Algorithm for Double Scalar Multiplication over Koblitz Curves

J. Adikari, V. S. Dimitrov, R. J. Cintra

Koblitz curves are a special set of elliptic curves and have improved performance in computing scalar multiplication in elliptic curve cryptography due to the Frobenius endomorphism. Double-base number system approach for Frobenius expansion has improved the performance in single scalar multiplication. In this paper, we present a new algorithm to generate a sparse and joint $τ$-adic representation for a pair of scalars and its application in double scalar multiplication. The new algorithm is inspired from double-base number system. We achieve 12% improvement in speed against state-of-the-art $τ$-adic joint sparse form.

AROct 30, 2017
VLSI Computational Architectures for the Arithmetic Cosine Transform

N. Rajapaksha, A. Madanayake, R. J. Cintra et al.

The discrete cosine transform (DCT) is a widely-used and important signal processing tool employed in a plethora of applications. Typical fast algorithms for nearly-exact computation of DCT require floating point arithmetic, are multiplier intensive, and accumulate round-off errors. Recently proposed fast algorithm arithmetic cosine transform (ACT) calculates the DCT exactly using only additions and integer constant multiplications, with very low area complexity, for null mean input sequences. The ACT can also be computed non-exactly for any input sequence, with low area complexity and low power consumption, utilizing the novel architecture described. However, as a trade-off, the ACT algorithm requires 10 non-uniformly sampled data points to calculate the 8-point DCT. This requirement can easily be satisfied for applications dealing with spatial signals such as image sensors and biomedical sensor arrays, by placing sensor elements in a non-uniform grid. In this work, a hardware architecture for the computation of the null mean ACT is proposed, followed by a novel architectures that extend the ACT for non-null mean signals. All circuits are physically implemented and tested using the Xilinx XC6VLX240T FPGA device and synthesized for 45 nm TSMC standard-cell library for performance assessment.

AROct 27, 2017
A Single-Channel Architecture for Algebraic Integer Based 8$\times$8 2-D DCT Computation

A. Edirisuriya, A. Madanayake, R. J. Cintra et al.

An area efficient row-parallel architecture is proposed for the real-time implementation of bivariate algebraic integer (AI) encoded 2-D discrete cosine transform (DCT) for image and video processing. The proposed architecture computes 8$\times$8 2-D DCT transform based on the Arai DCT algorithm. An improved fast algorithm for AI based 1-D DCT computation is proposed along with a single channel 2-D DCT architecture. The design improves on the 4-channel AI DCT architecture that was published recently by reducing the number of integer channels to one and the number of 8-point 1-D DCT cores from 5 down to 2. The architecture offers exact computation of 8$\times$8 blocks of the 2-D DCT coefficients up to the FRS, which converts the coefficients from the AI representation to fixed-point format using the method of expansion factors. Prototype circuits corresponding to FRS blocks based on two expansion factors are realized, tested, and verified on FPGA-chip, using a Xilinx Virtex-6 XC6VLX240T device. Post place-and-route results show a 20% reduction in terms of area compared to the 2-D DCT architecture requiring five 1-D AI cores. The area-time and area-time${}^2$ complexity metrics are also reduced by 23% and 22% respectively for designs with 8-bit input word length. The digital realizations are simulated up to place and route for ASICs using 45 nm CMOS standard cells. The maximum estimated clock rate is 951 MHz for the CMOS realizations indicating 7.608$\cdot$10$^9$ pixels/seconds and a 8$\times$8 block rate of 118.875 MHz.

MMFeb 6, 2017
A Digital Hardware Fast Algorithm and FPGA-based Prototype for a Novel 16-point Approximate DCT for Image Compression Applications

F. M. Bayer, R. J. Cintra, A. Edirisuriya et al.

The discrete cosine transform (DCT) is the key step in many image and video coding standards. The 8-point DCT is an important special case, possessing several low-complexity approximations widely investigated. However, 16-point DCT transform has energy compaction advantages. In this sense, this paper presents a new 16-point DCT approximation with null multiplicative complexity. The proposed transform matrix is orthogonal and contains only zeros and ones. The proposed transform outperforms the well-know Walsh-Hadamard transform and the current state-of-the-art 16-point approximation. A fast algorithm for the proposed transform is also introduced. This fast algorithm is experimentally validated using hardware implementations that are physically realized and verified on a 40 nm CMOS Xilinx Virtex-6 XC6VLX240T FPGA chip for a maximum clock rate of 342 MHz. Rapid prototypes on FPGA for 8-bit input word size shows significant improvement in compressed image quality by up to 1-2 dB at the cost of only eight adders compared to the state-of-art 16-point DCT approximation algorithm in the literature [S. Bouguezel, M. O. Ahmad, and M. N. S. Swamy. A novel transform for image compression. In {\em Proceedings of the 53rd IEEE International Midwest Symposium on Circuits and Systems (MWSCAS)}, 2010].

MMFeb 2, 2017
DCT-like Transform for Image Compression Requires 14 Additions Only

F. M. Bayer, R. J. Cintra

A low-complexity 8-point orthogonal approximate DCT is introduced. The proposed transform requires no multiplications or bit-shift operations. The derived fast algorithm requires only 14 additions, less than any existing DCT approximation. Moreover, in several image compression scenarios, the proposed transform could outperform the well-known signed DCT, as well as state-of-the-art algorithms.

MMDec 11, 2016
Low-complexity Pruned 8-point DCT Approximations for Image Encoding

V. A. Coutinho, R. J. Cintra, F. M. Bayer et al.

Two multiplierless pruned 8-point discrete cosine transform (DCT) approximation are presented. Both transforms present lower arithmetic complexity than state-of-the-art methods. The performance of such new methods was assessed in the image compression context. A JPEG-like simulation was performed, demonstrating the adequateness and competitiveness of the introduced methods. Digital VLSI implementation in CMOS technology was also considered. Both presented methods were realized in Berkeley Emulation Engine (BEE3).

MMDec 2, 2016
Energy-efficient 8-point DCT Approximations: Theory and Hardware Architectures

R. J. Cintra, F. M. Bayer, V. A. Coutinho et al.

Due to its remarkable energy compaction properties, the discrete cosine transform (DCT) is employed in a multitude of compression standards, such as JPEG and H.265/HEVC. Several low-complexity integer approximations for the DCT have been proposed for both 1-D and 2-D signal analysis. The increasing demand for low-complexity, energy efficient methods require algorithms with even lower computational costs. In this paper, new 8-point DCT approximations with very low arithmetic complexity are presented. The new transforms are proposed based on pruning state-of-the-art DCT approximations. The proposed algorithms were assessed in terms of arithmetic complexity, energy retention capability, and image compression performance. In addition, a metric combining performance and computational complexity measures was proposed. Results showed good performance and extremely low computational complexity. Introduced algorithms were mapped into systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45nm CMOS technology. All hardware-related metrics showed low resource consumption of the proposed pruned approximate transforms. The best proposed transform according to the introduced metric presents a reduction in power consumption of 21--25%.

MMSep 24, 2016
Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform

P. A. M. Oliveira, R. J. Cintra, F. M. Bayer et al.

The usage of linear transformations has great relevance for data decorrelation applications, like image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data and fast algorithms can be developed to real time applications. However, the DTT fast algorithm presented in literature possess high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards shows good performance of the proposed transform. Regarding hardware resource consumption for FPGA shows 43.1% reduction of configurable logic blocks and ASIC place and route realization shows 57.7% reduction in the area-time figure when compared with the 2-D version of the exact DTT.

CVJun 23, 2016
Multiplierless 16-point DCT Approximation for Low-complexity Image and Video Coding

T. L. T. Silveira, R. S. Oliveira, F. M. Bayer et al.

An orthogonal 16-point approximate discrete cosine transform (DCT) is introduced. The proposed transform requires neither multiplications nor bit-shifting operations. A fast algorithm based on matrix factorization is introduced, requiring only 44 additions---the lowest arithmetic cost in literature. To assess the introduced transform, computational complexity, similarity with the exact DCT, and coding performance measures are computed. Classical and state-of-the-art 16-point low-complexity transforms were used in a comparative analysis. In the context of image compression, the proposed approximation was evaluated via PSNR and SSIM measurements, attaining the best cost-benefit ratio among the competitors. For video encoding, the proposed approximation was embedded into a HEVC reference software for direct comparison with the original HEVC standard. Physically realized and tested using FPGA hardware, the proposed transform showed 35% and 37% improvements of area-time and area-time-squared VLSI metrics when compared to the best competing transform in the literature.

MMJan 27, 2016
Robust Image Watermarking Using Non-Regular Wavelets

R. J. Cintra, T. V. Cooklev

An approach to watermarking digital images using non-regular wavelets is advanced. Non-regular transforms spread the energy in the transform domain. The proposed method leads at the same time to increased image quality and increased robustness with respect to lossy compression. The approach provides robust watermarking by suitably creating watermarked messages that have energy compaction and frequency spreading. Our experimental results show that the application of non-regular wavelets, instead of regular ones, can furnish a superior robust watermarking scheme. The generated watermarked data is more immune against non-intentional JPEG and JPEG2000 attacks.

MEFeb 2, 2015
A Class of DCT Approximations Based on the Feig-Winograd Algorithm

C. J. Tablada, F. M. Bayer, R. J. Cintra

A new class of matrices based on a parametrization of the Feig-Winograd factorization of 8-point DCT is proposed. Such parametrization induces a matrix subspace, which unifies a number of existing methods for DCT approximation. By solving a comprehensive multicriteria optimization problem, we identified several new DCT approximations. Obtained solutions were sought to possess the following properties: (i) low multiplierless computational complexity, (ii) orthogonality or near orthogonality, (iii) low complexity invertibility, and (iv) close proximity and performance to the exact DCT. Proposed approximations were submitted to assessment in terms of proximity to the DCT, coding performance, and suitability for image compression. Considering Pareto efficiency, particular new proposed approximations could outperform various existing methods archived in literature.

MMFeb 1, 2015
Fragile Watermarking Using Finite Field Trigonometrical Transforms

R. J. Cintra, V. S. Dimitrov, H. M. de Oliveira et al.

Fragile digital watermarking has been applied for authentication and alteration detection in images. Utilizing the cosine and Hartley transforms over finite fields, a new transform domain fragile watermarking scheme is introduced. A watermark is embedded into a host image via a blockwise application of two-dimensional finite field cosine or Hartley transforms. Additionally, the considered finite field transforms are adjusted to be number theoretic transforms, appropriate for error-free calculation. The employed technique can provide invisible fragile watermarking for authentication systems with tamper location capability. It is shown that the choice of the finite field characteristic is pivotal to obtain perceptually invisible watermarked images. It is also shown that the generated watermarked images can be used as publicly available signature data for authentication purposes.

MEJan 28, 2015
A Discrete Tchebichef Transform Approximation for Image and Video Coding

P. A. M. Oliveira, R. J. Cintra, F. M. Bayer et al.

In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, Xilinx Virtex-6 FPGA based hardware realization shows 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.

CAApr 23, 2015
Wavelets for Elliptical Waveguide Problems

M. M. S. Lira, H. M. de Oliveira, R. J. Cintra et al.

New elliptic cylindrical wavelets are introduced, which exploit the relationship between analysing filters and Floquet's solution of Mathieu differential equations. It is shown that the transfer function of both multiresolution filters is related to the solution of a Mathieu equation of odd characteristic exponent. The number of notches of these analysing filters can be easily designed. Wavelets derived by this method have potential application in the fields of optics, microwaves and electromagnetism.

CAApr 23, 2015
A Short Survey on Arithmetic Transforms and the Arithmetic Hartley Transform

R. J. Cintra, H. M. de Oliveira

Arithmetic complexity has a main role in the performance of algorithms for spectrum evaluation. Arithmetic transform theory offers a method for computing trigonometrical transforms with minimal number of multiplications. In this paper, the proposed algorithms for the arithmetic Fourier transform are surveyed. A new arithmetic transform for computing the discrete Hartley transform is introduced: the Arithmetic Hartley transform. The interpolation process is shown to be the key element of the arithmetic transform theory.

MMJan 13, 2015
Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions

U. S. Potluri, A. Madanayake, R. J. Cintra et al.

Video processing systems such as HEVC requiring low energy consumption needed for the multimedia market has lead to extensive development in fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology.

MENov 10, 2014
On Filter Banks and Wavelets Based on Chebyshev Polynomials

R. J. Cintra, H. M. de Oliveira, L. R. Soares

In this paper we introduce a new family of wavelets, named Chebyshev wavelets, which are derived from conventional first and second kind Chebyshev polynomials. Properties of Chebyshev filter banks are investigated, including orthogonality and perfect reconstruction conditions. Chebyshev wavelets have compact support, their filters possess good selectivity, but they are not orthogonal. The convergence of the cascade algorithm of Chebyshev wavelets is proved by using properties of Markov chains. Computational implementation of these wavelets and some clear-cut applications are presented. Proposed wavelets are suitable for signal denoising.

ARMay 2, 2014
Multiplierless Approximate 4-point DCT VLSI Architectures for Transform Block Coding

F. M. Bayer, R. J. Cintra, A. Madanayake et al.

Two multiplierless algorithms are proposed for 4x4 approximate-DCT for transform coding in digital video. Computational architectures for 1-D/2-D realisations are implemented using Xilinx FPGA devices. CMOS synthesis at the 45 nm node indicate real-time operation at 1 GHz yielding 4x4 block rates of 125 MHz at less than 120 mW of dynamic power consumption.

MMFeb 25, 2014
A DCT Approximation for Image Compression

R. J. Cintra, F. M. Bayer

An orthogonal approximation for the 8-point discrete cosine transform (DCT) is introduced. The proposed transformation matrix contains only zeros and ones; multiplications and bit-shift operations are absent. Close spectral behavior relative to the DCT was adopted as design criterion. The proposed algorithm is superior to the signed discrete cosine transform. It could also outperform state-of-the-art algorithms in low and high image compression scenarios, exhibiting at the same time a comparable computational complexity.

MMFeb 24, 2014
A Multiplierless Pruned DCT-like Transformation for Image and Video Compression that Requires 10 Additions Only

V. A. Coutinho, R. J. Cintra, F. M. Bayer et al.

A multiplierless pruned approximate 8-point discrete cosine transform (DCT) requiring only 10 additions is introduced. The proposed algorithm was assessed in image and video compression, showing competitive performance with state-of-the-art methods. Digital implementation in 45 nm CMOS technology up to place-and-route level indicates clock speed of 288 MHz at a 1.1 V supply. The 8x8 block rate is 36 MHz.The DCT approximation was embedded into HEVC reference software; resulting video frames, at up to 327 Hz for 8-bit RGB HEVC, presented negligible image degradation.