Robert Calderbank

CV
19papers
371citations
Novelty47%
AI Score47

19 Papers

13.0ITApr 18
Zak-OTFS: A Predictable Physical Layer for Communications and Sensing

Sandesh Rao Mattu, Nishant Mehrotra, Venkatesh Khammammetti et al.

This tutorial derives the mathematical foundations of what it means for a carrier waveform to be predictable and non-selective. We focus on Zak-OTFS, where each carrier waveform is a pulse in the delay-Doppler (DD) domain, formally a quasi-periodic localized function with specific periods along delay and Doppler. Viewed in the time domain, the Zak-OTFS carrier is realized as a pulse train modulated by a tone (termed a pulsone). We start by providing physical intuition, describing what it means for the Zak-OTFS carrier waveforms to be geometric modes of the Heisenberg-Weyl (HW) group of discrete delay and Doppler shifts that define the discrete-time communication model. In fact, we show that these geometric modes are common eigenvectors of a maximal commutative subgroup of our discrete HW group. When the channel delay spread is less than the delay period, and the channel Doppler spread is less than the Doppler period, we show that the Zak-OTFS input-output (I/O) relation is predictable and non-selective. Given the I/O response at one DD point in a frame, it is possible to predict the I/O response at all other points, without recourse to some mathematical model of the channel. While it may be intuitive that geometric modes of the HW group are predictable and non-selective wireless carriers, this is not a requirement. We provide a necessary and sufficient condition that depends on the ambiguity properties of the basis of carrier waveforms. In fact, we show that the structure of a pulse train modulated by a Hadamard matrix is common to several families of waveforms proposed for 6G, including Zak-OTFS, AFDM, OTSM and ODDM.

66.0SPApr 8
Delay-Doppler Channel Estimation using Arbitrarily Modulated Data Transmissions

Nishant Mehrotra, Sandesh Rao Mattu, Robert Calderbank

Conventional delay-Doppler (DD) communication and sensing systems require transmitting pilot frames at every channel coherence time interval in order to keep track of channel variations at the cost of spectral efficiency. In this paper, we propose an approach to utilize data transmissions modulated using arbitrary waveforms for DD channel estimation without requiring pilot transmissions in every coherence time interval. Numerical evaluation over practical doubly-selective channel models demonstrate $\sim 1.8 \times$ improvement in spectral efficiency with our proposed data-based approach over conventional pilot-based approaches across various 6G modulation schemes.

72.9SPApr 2
Real-Time and Scalable Zak-OTFS Receiver Processing on GPUs

Junyao Zheng, Chung-Hsuan Tung, Yuncheng Yao et al.

Orthogonal time frequency space (OTFS) modulation offers superior robustness to high-mobility channels compared to conventional orthogonal frequency-division multiplexing (OFDM) waveforms. However, its explicit delay-Doppler (DD) domain representation incurs substantial signal processing complexity, especially with increased DD domain grid sizes. To address this challenge, we present a scalable, real-time Zak-OTFS receiver architecture on GPUs through hardware--algorithm co-design that exploits DD-domain channel sparsity. Our design leverages compact matrix operations for key processing stages, a branchless iterative equalizer, and a structured sparse channel matrix of the DD domain channel matrix to significantly reduce computational and memory overhead. These optimizations enable low-latency processing that consistently meets the 99.9-th percentile real-time processing deadline. The proposed system achieves up to 906.52 Mbps throughput with a DD grid size of (16384,32) using 16QAM modulation over 245.76 MHz bandwidth. Extensive evaluations under a Vehicular-A channel model demonstrate strong scalability and robust performance across CPU (Intel Xeon) and multiple GPU platforms (NVIDIA Jetson Orin, RTX 6000 Ada, A100, and H200), highlighting the effectiveness of compute-aware Zak-OTFS receiver design for next-generation (NextG) high-mobility communication systems.

LGSep 24, 2019
Scaling-Translation-Equivariant Networks with Decomposed Convolutional Filters

Wei Zhu, Qiang Qiu, Robert Calderbank et al.

Encoding the scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many computer vision tasks especially when dealing with multiscale inputs. We study, in this paper, a scaling-translation-equivariant (ST-equivariant) CNN with joint convolutions across the space and the scaling group, which is shown to be both sufficient and necessary to achieve equivariance for the regular representation of the scaling-translation group ST . To reduce the model complexity and computational burden, we decompose the convolutional filters under two pre-fixed separable bases and truncate the expansion to low-frequency components. A further benefit of the truncated filter expansion is the improved deformation robustness of the equivariant representation, a property which is theoretically analyzed and empirically verified. Numerical experiments demonstrate that the proposed scaling-translation-equivariant network with decomposed convolutional filters (ScDCFNet) achieves significantly improved performance in multiscale image classification and better interpretability than regular CNNs at a reduced model size.

LGJan 29, 2019
Geometric Matrix Completion with Deep Conditional Random Fields

Duc Minh Nguyen, Robert Calderbank, Nikos Deligiannis

The problem of completing high-dimensional matrices from a limited set of observations arises in many big data applications, especially, recommender systems. Existing matrix completion models generally follow either a memory- or a model-based approach, whereas, geometric matrix completion models combine the best from both approaches. Existing deep-learning-based geometric models yield good performance, but, in order to operate, they require a fixed structure graph capturing the relationships among the users and items. This graph is typically constructed by evaluating a pre-defined similarity metric on the available observations or by using side information, e.g., user profiles. In contrast, Markov-random-fields-based models do not require a fixed structure graph but rely on handcrafted features to make predictions. When no side information is available and the number of available observations becomes very low, existing solutions are pushed to their limits. In this paper, we propose a geometric matrix completion approach that addresses these challenges. We consider matrix completion as a structured prediction problem in a conditional random field (CRF), which is characterized by a maximum a posterior (MAP) inference, and we propose a deep model that predicts the missing entries by solving the MAP inference problem. The proposed model simultaneously learns the similarities among matrix entries, computes the CRF potentials, and solves the inference problem. Its training is performed in an end-to-end manner, with a method to supervise the learning of entry similarities. Comprehensive experiments demonstrate the superior performance of the proposed model compared to various state-of-the-art models on popular benchmark datasets and underline its superior capacity to deal with highly incomplete matrices.

LGJul 4, 2018
Regularizing Autoencoder-Based Matrix Completion Models via Manifold Learning

Duc Minh Nguyen, Evaggelia Tsiligianni, Robert Calderbank et al.

Autoencoders are popular among neural-network-based matrix completion models due to their ability to retrieve potential latent factors from the partially observed matrices. Nevertheless, when training data is scarce their performance is significantly degraded due to overfitting. In this paper, we mit- igate overfitting with a data-dependent regularization technique that relies on the principles of multi-task learning. Specifically, we propose an autoencoder-based matrix completion model that performs prediction of the unknown matrix values as a main task, and manifold learning as an auxiliary task. The latter acts as an inductive bias, leading to solutions that generalize better. The proposed model outperforms the existing autoencoder-based models designed for matrix completion, achieving high reconstruction accuracy in well-known datasets.

CVMay 17, 2018
RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks

Xiuyuan Cheng, Qiang Qiu, Robert Calderbank et al.

Explicit encoding of group actions in deep features makes it possible for convolutional neural networks (CNNs) to handle global deformations of images, which is critical to success in many vision tasks. This paper proposes to decompose the convolutional filters over joint steerable bases across the space and the group geometry simultaneously, namely a rotation-equivariant CNN with decomposed convolutional filters (RotDCF). This decomposition facilitates computing the joint convolution, which is proved to be necessary for the group equivariance. It significantly reduces the model size and computational complexity while preserving performance, and truncation of the bases expansion serves implicitly to regularize the filters. On datasets involving in-plane and out-of-plane object rotations, RotDCF deep features demonstrate greater robustness and interpretability than regular CNNs. The stability of the equivariant representation to input variations is also proved theoretically under generic assumptions on the filters in the decomposed form. The RotDCF framework can be extended to groups other than rotations, providing a general approach which achieves both group equivariance and representation stability at a reduced model size.

MLFeb 12, 2018
DCFNet: Deep Neural Network with Decomposed Convolutional Filters

Qiang Qiu, Xiuyuan Cheng, Robert Calderbank et al.

Filters in a Convolutional Neural Network (CNN) contain model parameters learned from enormous amounts of data. In this paper, we suggest to decompose convolutional filters in CNN as a truncated expansion with pre-fixed bases, namely the Decomposed Convolutional Filters network (DCFNet), where the expansion coefficients remain learned from data. Such a structure not only reduces the number of trainable parameters and computation, but also imposes filter regularity by bases truncation. Through extensive experiments, we consistently observe that DCFNet maintains accuracy for image classification tasks with a significant reduction of model parameters, particularly with Fourier-Bessel (FB) bases, and even with random bases. Theoretically, we analyze the representation stability of DCFNet with respect to input variations, and prove representation stability under generic assumptions on the expansion coefficients. The analysis is consistent with the empirical observations.

CVNov 16, 2017
LDMNet: Low Dimensional Manifold Regularized Neural Networks

Wei Zhu, Qiang Qiu, Jiaji Huang et al.

Deep neural networks have proved very successful on archetypal tasks for which large training sets are available, but when the training data are scarce, their performance suffers from overfitting. Many existing methods of reducing overfitting are data-independent, and their efficacy is often limited when the training set is very small. Data-dependent regularizations are mostly motivated by the observation that data of interest lie close to a manifold, which is typically hard to parametrize explicitly and often requires human input of tangent vectors. These methods typically only focus on the geometry of the input data, and do not necessarily encourage the networks to produce geometrically meaningful features. To resolve this, we propose a new framework, the Low-Dimensional-Manifold-regularized neural Network (LDMNet), which incorporates a feature regularization method that focuses on the geometry of both the input data and the output features. In LDMNet, we regularize the network by encouraging the combination of the input data and the output features to sample a collection of low dimensional manifolds, which are searched efficiently without explicit parametrization. To achieve this, we directly use the manifold dimension as a regularization term in a variational functional. The resulting Euler-Lagrange equation is a Laplace-Beltrami equation over a point cloud, which is solved by the point integral method without increasing the computational complexity. We demonstrate two benefits of LDMNet in the experiments. First, we show that LDMNet significantly outperforms widely-used network regularizers such as weight decay and DropOut. Second, we show that LDMNet can be designed to extract common features of an object imaged via different modalities, which proves to be very useful in real-world applications such as cross-spectral face recognition.

ITJul 11, 2016
Bounds on the Number of Measurements for Reliable Compressive Classification

Hugo Reboredo, Francesco Renna, Robert Calderbank et al.

This paper studies the classification of high-dimensional Gaussian signals from low-dimensional noisy, linear measurements. In particular, it provides upper bounds (sufficient conditions) on the number of measurements required to drive the probability of misclassification to zero in the low-noise regime, both for random measurements and designed ones. Such bounds reveal two important operational regimes that are a function of the characteristics of the source: i) when the number of classes is less than or equal to the dimension of the space spanned by signals in each class, reliable classification is possible in the low-noise regime by using a one-vs-all measurement design; ii) when the dimension of the spaces spanned by signals in each class is lower than the number of classes, reliable classification is guaranteed in the low-noise regime by using a simple random measurement design. Simulation results both with synthetic and real data show that our analysis is sharp, in the sense that it is able to gauge the number of measurements required to drive the misclassification probability to zero in the low-noise regime.

ITMay 8, 2016
Rate-Distortion Bounds on Bayes Risk in Supervised Learning

Matthew Nokleby, Ahmad Beirami, Robert Calderbank

We present an information-theoretic framework for bounding the number of labeled samples needed to train a classifier in a parametric Bayesian setting. We derive bounds on the average $L_p$ distance between the learned classifier and the true maximum a posteriori classifier, which are well-established surrogates for the excess classification error due to imperfect learning. We provide lower and upper bounds on the rate-distortion function, using $L_p$ loss as the distortion measure, of a maximum a priori classifier in terms of the differential entropy of the posterior distribution and a quantity called the interpolation dimension, which characterizes the complexity of the parametric distribution family. In addition to expressing the information content of a classifier in terms of lossy compression, the rate-distortion function also expresses the minimum number of bits a learning machine needs to extract from training data to learn a classifier to within a specified $L_p$ tolerance. We use results from universal source coding to express the information content in the training data in terms of the Fisher information of the parametric family and the number of training samples available. The result is a framework for computing lower bounds on the Bayes $L_p$ risk. This framework complements the well-known probably approximately correct (PAC) framework, which provides minimax risk bounds involving the Vapnik-Chervonenkis dimension or Rademacher complexity. Whereas the PAC framework provides upper bounds the risk for the worst-case data distribution, the proposed rate-distortion framework lower bounds the risk averaged over the data distribution. We evaluate the bounds for a variety of data models, including categorical, multinomial, and Gaussian models. In each case the bounds are provably tight orderwise, and in two cases we prove that the bounds are tight up to multiplicative constants.

CVDec 21, 2015
GraphConnect: A Regularization Framework for Neural Networks

Jiaji Huang, Qiang Qiu, Robert Calderbank et al.

Deep neural networks have proved very successful in domains where large training sets are available, but when the number of training samples is small, their performance suffers from overfitting. Prior methods of reducing overfitting such as weight decay, Dropout and DropConnect are data-independent. This paper proposes a new method, GraphConnect, that is data-dependent, and is motivated by the observation that data of interest lie close to a manifold. The new method encourages the relationships between the learned decisions to resemble a graph representing the manifold structure. Essentially GraphConnect is designed to learn attributes that are present in data samples in contrast to weight decay, Dropout and DropConnect which are simply designed to make it more difficult to fit to random error or noise. Empirical Rademacher complexity is used to connect the generalization error of the neural network to spectral properties of the graph learned from the input data. This framework is used to show that GraphConnect is superior to weight decay. Experimental results on several benchmark datasets validate the theoretical analysis, and show that when the number of training samples is small, GraphConnect is able to significantly improve performance over weight decay.

CVSep 17, 2015
Geometry-aware Deep Transform

Jiaji Huang, Qiang Qiu, Robert Calderbank et al.

Many recent efforts have been devoted to designing sophisticated deep learning structures, obtaining revolutionary results on benchmark datasets. The success of these deep learning methods mostly relies on an enormous volume of labeled training samples to learn a huge number of parameters in a network; therefore, understanding the generalization ability of a learned deep network cannot be overlooked, especially when restricted to a small training set, which is the case for many applications. In this paper, we propose a novel deep learning objective formulation that unifies both the classification and metric learning criteria. We then introduce a geometry-aware deep transform to enable a non-linear discriminative and robust feature transform, which shows competitive performance on small training sets for both synthetic and real-world data. We further support the proposed framework with a formal $(K,ε)$-robustness analysis.

ITAug 7, 2015
Mismatch in the Classification of Linear Subspaces: Sufficient Conditions for Reliable Classification

Jure Sokolic, Francesco Renna, Robert Calderbank et al.

This paper considers the classification of linear subspaces with mismatched classifiers. In particular, we assume a model where one observes signals in the presence of isotropic Gaussian noise and the distribution of the signals conditioned on a given class is Gaussian with a zero mean and a low-rank covariance matrix. We also assume that the classifier knows only a mismatched version of the parameters of input distribution in lieu of the true parameters. By constructing an asymptotic low-noise expansion of an upper bound to the error probability of such a mismatched classifier, we provide sufficient conditions for reliable classification in the low-noise regime that are able to sharply predict the absence of a classification error floor. Such conditions are a function of the geometry of the true signal distribution, the geometry of the mismatched signal distributions as well as the interplay between such geometries, namely, the principal angles and the overlap between the true and the mismatched signal subspaces. Numerical results demonstrate that our conditions for reliable classification can sharply predict the behavior of a mismatched classifier both with synthetic data and in a motion segmentation and a hand-written digit classification applications.

MLJul 15, 2015
The Role of Principal Angles in Subspace Classification

Jiaji Huang, Qiang Qiu, Robert Calderbank

Subspace models play an important role in a wide range of signal processing tasks, and this paper explores how the pairwise geometry of subspaces influences the probability of misclassification. When the mismatch between the signal and the model is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. The transform presented here (TRAIT) preserves some specific characteristic of each individual class, and this approach is shown to be complementary to a previously developed transform (LRT) that enlarges inter-class distance while suppressing intra-class dispersion. Theoretical results are supported by demonstration of superior classification accuracy on synthetic and measured data even in the presence of significant model mismatch.

CVDec 18, 2014
Data Representation using the Weyl Transform

Qiang Qiu, Andrew Thompson, Robert Calderbank et al.

The Weyl transform is introduced as a rich framework for data representation. Transform coefficients are connected to the Walsh-Hadamard transform of multiscale autocorrelations, and different forms of dyadic periodicity in a signal are shown to appear as different features in its Weyl coefficients. The Weyl transform has a high degree of symmetry with respect to a large group of multiscale transformations, which allows compact yet discriminative representations to be obtained by pooling coefficients. The effectiveness of the Weyl transform is demonstrated through the example of textured image classification.

ITDec 1, 2014
Classification and Reconstruction of High-Dimensional Signals from Low-Dimensional Features in the Presence of Side Information

Francesco Renna, Liming Wang, Xin Yuan et al.

This paper offers a characterization of fundamental limits on the classification and reconstruction of high-dimensional signals from low-dimensional features, in the presence of side information. We consider a scenario where a decoder has access both to linear features of the signal of interest and to linear features of the side information signal; while the side information may be in a compressed form, the objective is recovery or classification of the primary signal, not the side information. The signal of interest and the side information are each assumed to have (distinct) latent discrete labels; conditioned on these two labels, the signal of interest and side information are drawn from a multivariate Gaussian distribution. With joint probabilities on the latent labels, the overall signal-(side information) representation is defined by a Gaussian mixture model. We then provide sharp sufficient and/or necessary conditions for these quantities to approach zero when the covariance matrices of the Gaussians are nearly low-rank. These conditions, which are reminiscent of the well-known Slepian-Wolf and Wyner-Ziv conditions, are a function of the number of linear features extracted from the signal of interest, the number of linear features extracted from the side information signal, and the geometry of these signals and their interplay. Moreover, on assuming that the signal of interest and the side information obey such an approximately low-rank model, we derive expansions of the reconstruction error as a function of the deviation from an exactly low-rank model; such expansions also allow identification of operational regimes where the impact of side information on signal reconstruction is most relevant. Our framework, which offers a principled mechanism to integrate side information in high-dimensional data problems, is also tested in the context of imaging applications.

CVJan 26, 2014
Painting Analysis Using Wavelets and Probabilistic Topic Models

Tong Wu, Gungor Polatkan, David Steel et al.

In this paper, computer-based techniques for stylistic analysis of paintings are applied to the five panels of the 14th century Peruzzi Altarpiece by Giotto di Bondone. Features are extracted by combining a dual-tree complex wavelet transform with a hidden Markov tree (HMT) model. Hierarchical clustering is used to identify stylistic keywords in image patches, and keyword frequencies are calculated for sub-images that each contains many patches. A generative hierarchical Bayesian model learns stylistic patterns of keywords; these patterns are then used to characterize the styles of the sub-images; this in turn, permits to discriminate between paintings. Results suggest that such unsupervised probabilistic topic models can be useful to distill characteristic elements of style.

LGJun 27, 2012
Communications Inspired Linear Discriminant Analysis

Minhua Chen, William Carson, Miguel Rodrigues et al.

We study the problem of supervised linear dimensionality reduction, taking an information-theoretic viewpoint. The linear projection matrix is designed by maximizing the mutual information between the projected signal and the class label (based on a Shannon entropy measure). By harnessing a recent theoretical result on the gradient of mutual information, the above optimization problem can be solved directly using gradient descent, without requiring simplification of the objective function. Theoretical analysis and empirical comparison are made between the proposed method and two closely related methods (Linear Discriminant Analysis and Information Discriminant Analysis), and comparisons are also made with a method in which Renyi entropy is used to define the mutual information (in this case the gradient may be computed simply, under a special parameter setting). Relative to these alternative approaches, the proposed method achieves promising results on real datasets.