SY Mar 1, 2013
Designing Unimodular Codes via Quadratic Optimization is not Always Hard
Mojtaba Soltanalian, Petre Stoica
The NP-hard problem of optimizing a quadratic form over the unimodular vector set arises in radar code design scenarios as well as other active sensing and communication applications. To tackle this problem (which we call unimodular quadratic programming (UQP)), several computational approaches are devised and studied. A specialized local optimization scheme for UQP is introduced and shown to yield superior results compared to general local optimization methods. Furthermore, a monotonically error-bound improving technique (MERIT) is proposed to obtain the global optimum or a local optimum of UQP with good sub-optimality guarantees. The provided sub-optimality guarantees are case-dependent and generally outperform the $\pi/4$ approximation guarantee of semi-definite relaxation. Several numerical examples are presented to illustrate the performance of the proposed method. The examples show that for cases including several matrix structures used in radar code design, MERIT can solve UQP efficiently in the sense of sub-optimality guarantee and computational time.
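A minimal sketch of one simple local scheme for UQP, assuming a Hermitian positive semidefinite matrix R and the power-method-like update s ← exp(j·arg(Rs)). The abstract does not spell out the specialized local scheme or MERIT, so the update rule, the function names, and the small test matrix below are illustrative assumptions, not the authors' implementation.

```python
import cmath


def quad_form(R, s):
    """Return the real value of s^H R s for a Hermitian matrix R."""
    n = len(s)
    total = sum(s[i].conjugate() * R[i][j] * s[j]
                for i in range(n) for j in range(n))
    return total.real


def power_method_like(R, s0, iters=50):
    """Assumed local scheme: repeatedly project R s onto the unimodular set."""
    s = list(s0)
    history = [quad_form(R, s)]
    n = len(s)
    for _ in range(iters):
        Rs = [sum(R[i][j] * s[j] for j in range(n)) for i in range(n)]
        # Keep only the phase of each entry, so every |s_i| stays equal to 1.
        s = [cmath.exp(1j * cmath.phase(v)) for v in Rs]
        history.append(quad_form(R, s))
    return s, history


# Hypothetical test matrix: R = A^H A is Hermitian positive semidefinite.
A = [[1 + 1j, 2, 0.5j],
     [0.3, -1j, 1 - 2j]]
n = 3
R = [[sum(A[k][i].conjugate() * A[k][j] for k in range(len(A)))
      for j in range(n)] for i in range(n)]

s, history = power_method_like(R, [1 + 0j] * n)
```

For positive semidefinite R, each such update cannot decrease the quadratic form, so `history` is non-decreasing; the iteration only finds a local optimum and carries none of the sub-optimality guarantees described in the abstract.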
OPTICS Mar 3, 2022
Unfolding-Aided Bootstrapped Phase Retrieval in Optical Imaging
Samuel Pinilla, Kumar Vijay Mishra, Igor Shevkunov et al.
Phase retrieval in optical imaging refers to the recovery of a complex signal from phaseless data acquired in the form of its diffraction patterns. These patterns are acquired through a system with a coherent light source that employs a diffractive optical element (DOE) to modulate the scene resulting in coded diffraction patterns at the sensor. Recently, the hybrid approach of model-driven network or deep unfolding has emerged as an effective alternative to conventional model-based and learning-based phase retrieval techniques because it allows for bounding the complexity of algorithms while also retaining their efficacy. Additionally, such hybrid approaches have shown promise in improving the design of DOEs that follow theoretical uniqueness conditions. There are opportunities to exploit novel experimental setups and resolve even more complex DOE phase retrieval applications. This paper presents an overview of algorithms and applications of deep unfolding for bootstrapped - regardless of near, middle, and far zones - phase retrieval.
IT Dec 10, 2018
Signal Recovery From 1-Bit Quantized Noisy Samples via Adaptive Thresholding
Shahin Khobahi, Mojtaba Soltanalian
In this paper, we consider the problem of signal recovery from 1-bit noisy measurements. We present an efficient method to obtain an estimate of the signal of interest when the measurements are corrupted by white or colored noise. To the best of our knowledge, the proposed framework is the first effort in the area of 1-bit sampling and signal recovery to provide a unified framework for dealing with noise that has an arbitrary covariance matrix, including colored noise. The proposed method is based on a constrained quadratic program (CQP) formulation utilizing an adaptive quantization thresholding approach, which further enables us to accurately recover the signal of interest from its 1-bit noisy measurements. In addition, due to the adaptive nature of the proposed method, it can recover both fixed and time-varying parameters from their quantized 1-bit samples.
SY Jul 31, 2018
Optimized Transmission for Consensus in Wireless Sensor Networks
Shahin Khobahi, Mojtaba Soltanalian
In this paper, we present a consensus-based framework for decentralized estimation of deterministic parameters in wireless sensor networks (WSNs). In particular, we propose an optimization algorithm to design (possibly complex) sensor gains in order to achieve an estimate of the parameter of interest that is as accurate as possible. The proposed design algorithm employs a cyclic approach capable of handling various sensor gain constraints. In addition, each iteration of the proposed design framework comprises the Gram-Schmidt process and power-method-like iterations, and as a result, enjoys a low computational cost.
SP Mar 13, 2022
One-Bit Compressive Sensing: Can We Go Deep and Blind?
Yiming Zeng, Shahin Khobahi, Mojtaba Soltanalian
One-bit compressive sensing is concerned with the accurate recovery of an underlying sparse signal of interest from its one-bit noisy measurements. The conventional signal recovery approaches for this problem are mainly developed based on the assumption that exact knowledge of the sensing matrix is available. In this work, however, we present a novel data-driven and model-based methodology that achieves blind recovery; i.e., signal recovery without requiring knowledge of the sensing matrix. To this end, we make use of the deep unfolding technique and develop a model-driven deep neural architecture which is designed for this specific task. The proposed deep architecture is able to learn an alternative sensing matrix by taking advantage of the underlying unfolded algorithm such that the resulting learned recovery algorithm can accurately and quickly (in terms of the number of iterations) recover the underlying compressed signal of interest from its one-bit noisy measurements. In addition, due to the incorporation of the domain knowledge and the mathematical model of the system into the proposed deep architecture, the resulting network benefits from enhanced interpretability, has a very small number of trainable parameters, and requires a very small number of training samples, as compared to the commonly used black-box deep neural network alternatives for the problem at hand.
SP Jul 31, 2023
Deep Learning Meets Adaptive Filtering: A Stein's Unbiased Risk Estimator Approach
Zahra Esmaeilbeig, Mojtaba Soltanalian
This paper revisits two prominent adaptive filtering algorithms, namely recursive least squares (RLS) and equivariant adaptive source separation (EASI), through the lens of algorithm unrolling. Building upon the unrolling methodology, we introduce novel task-based deep learning frameworks, denoted as Deep RLS and Deep EASI. These architectures transform the iterations of the original algorithms into layers of a deep neural network, enabling efficient source signal estimation by leveraging a training process. To further enhance performance, we propose training these deep unrolled networks utilizing a surrogate loss function grounded on Stein's unbiased risk estimator (SURE). Our empirical evaluations demonstrate that the Deep RLS and Deep EASI networks outperform their underlying algorithms. Moreover, the efficacy of SURE-based training in comparison to conventional mean squared error loss is highlighted by numerical experiments. The unleashed potential of SURE-based training in this paper sets a benchmark for future employment of SURE either for training purposes or as an evaluation metric for generalization performance of neural networks.
LG Feb 9
Linearization Explains Fine-Tuning in Large Language Models
Zahra Rahimi Afzal, Tara Esmaeilbeig, Mojtaba Soltanalian et al.
Parameter-Efficient Fine-Tuning (PEFT) is a popular class of techniques that strive to adapt large models in a scalable and resource-efficient manner. Yet, the mechanisms underlying their training performance and generalization remain underexplored. In this paper, we provide several insights into such fine-tuning through the lens of linearization. Fine-tuned models are often implicitly encouraged to remain close to the pretrained model. By making this explicit, using a Euclidean distance inductive bias in parameter space, we show that fine-tuning dynamics become equivalent to learning with the positive-definite neural tangent kernel (NTK). We specifically analyze how close the fully linear and the linearized fine-tuning optimizations are, based on the strength of the regularization. This allows us to be pragmatic about how good a model linearization is when fine-tuning large language models (LLMs). When linearization is a good model, our findings reveal a strong correlation between the eigenvalue spectrum of the NTK and the performance of model adaptation. Motivated by this, we give spectral perturbation bounds on the NTK induced by the choice of layers selected for fine-tuning. We empirically validate our theory on Low Rank Adaptation (LoRA) on LLMs. These insights not only characterize fine-tuning but also have the potential to enhance PEFT techniques, paving the way to better informed and more nimble adaptation in LLMs.
LG May 4
Trust, but Verify: Peeling Low-Bit Transformer Networks for Training Monitoring
Arian Eamaz, Farhang Yeganegi, Mojtaba Soltanalian
Understanding whether deep neural networks are effectively optimized remains challenging, as training occurs in highly nonconvex landscapes and standard metrics provide limited visibility into layer-wise learning quality. This challenge is particularly acute for transformer-based language models, where training is expensive, models are often reused in frozen form, and poorly optimized layers can silently degrade performance. We propose a layer-wise peeling framework for monitoring training dynamics, in which each transformer layer is locally optimized against intermediate representations of the trained model. By constructing lightweight, layer-specific reference solutions and projecting layers onto multiple intermediate outputs via different permutations, we obtain achievable baselines that enable fine-grained diagnosis of under-optimized layers. Experiments on decoder-only transformer models show that these layer-wise reference bounds can match or even surpass the trained model at various stages of training, exposing inefficiencies that remain hidden in aggregate loss curves. We further demonstrate that this analysis remains effective under binarization and quantized settings, where training dynamics are particularly fragile. Across all numerical results, the proposed bounds consistently separate apparent convergence from effective optimality, highlighting optimization opportunities that are invisible when relying on training loss alone.
CL Oct 14, 2024
RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates
Md Kowsher, Tara Esmaeilbeig, Chun-Nam Yu et al.
We propose RoCoFT, a parameter-efficient fine-tuning method for large-scale language models (LMs) based on updating only a few rows and columns of the weight matrices in transformers. Through extensive experiments with medium-size LMs like BERT and RoBERTa, and larger LMs like Bloom-7B, Llama2-7B, and Llama2-13B, we show that our method gives comparable or better accuracies than state-of-the-art PEFT methods while also being more memory and computation-efficient. We also study the reason behind the effectiveness of our method with tools from neural tangent kernel theory. We empirically demonstrate that our kernel, constructed using a restricted set of row and column parameters, is numerically close to the full-parameter kernel and gives comparable classification performance. Ablation studies are conducted to investigate the impact of different algorithmic choices, including the selection strategy for rows and columns as well as the optimal rank for effective implementation of our method.
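The idea of updating only a few rows and columns can be sketched as a masked gradient step. This is an illustrative assumption about how such an update might look on a plain list-of-lists matrix; the function name `row_column_update`, the learning rate, and the toy gradient are hypothetical and not taken from the RoCoFT implementation.

```python
def row_column_update(W, grad, rows, cols, lr=0.1):
    """Apply a gradient step only to entries in the chosen rows or columns.

    All other entries of W are treated as frozen and returned unchanged.
    """
    rows, cols = set(rows), set(cols)
    return [
        [
            w - lr * g if (i in rows or j in cols) else w
            for j, (w, g) in enumerate(zip(w_row, g_row))
        ]
        for i, (w_row, g_row) in enumerate(zip(W, grad))
    ]


# Toy 3x3 weight matrix with a constant (hypothetical) gradient.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
grad = [[1.0] * 3 for _ in range(3)]

# Train only row 0 and column 2; the remaining four entries stay frozen.
W_new = row_column_update(W, grad, rows=[0], cols=[2], lr=0.5)
```

In this toy case five of the nine entries are trainable, which illustrates why restricting updates to a few rows and columns keeps the number of trainable parameters small.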
CL Feb 25, 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Md Kowsher, Nusrat Jahan Prottasha, Prakash Bhat et al.
This paper argues that generating output tokens is more effective than using pooled representations for prediction tasks because token-level generation retains more mutual information. Since LLMs are trained on massive text corpora using next-token prediction, generation aligns naturally with their learned behavior. Using the Data Processing Inequality (DPI), we provide both theoretical and empirical evidence supporting this claim. However, autoregressive models face two key challenges when used for prediction: (1) exposure bias, where the model sees ground truth tokens during training but relies on its own predictions during inference, leading to errors, and (2) format mismatch, where discrete tokens do not always align with the task's required output structure. To address these challenges, we introduce PredGen (Predicting Through Generating), an end-to-end framework that (i) uses scheduled sampling to reduce exposure bias, and (ii) introduces a task adapter to convert the generated tokens into structured outputs. Additionally, we introduce Writer-Director Alignment Loss (WDAL), which ensures consistency between token generation and final task predictions, improving both text coherence and numerical accuracy. We evaluate PredGen on multiple classification and regression benchmarks. Our results show that PredGen consistently outperforms standard baselines, demonstrating its effectiveness in structured prediction tasks.
LG Feb 4, 2025
Physics-Inspired Binary Neural Networks: Interpretable Compression with Theoretical Guarantees
Arian Eamaz, Farhang Yeganegi, Mojtaba Soltanalian
Why rely on dense neural networks and then blindly sparsify them when prior knowledge about the problem structure is already available? Many inverse problems admit algorithm-unrolled networks that naturally encode physics and sparsity. In this work, we propose a Physics-Inspired Binary Neural Network (PIBiNN) that combines two key components: (i) data-driven one-bit quantization with a single global scale, and (ii) problem-driven sparsity predefined by physics and requiring no updates during training. This design yields compression rates below one bit per weight by exploiting structural zeros, while preserving essential operator geometry. Unlike ternary or pruning-based schemes, our approach avoids ad-hoc sparsification, reduces metadata overhead, and aligns directly with the underlying task. Experiments suggest that PIBiNN achieves advantages in both memory efficiency and generalization compared to competitive baselines such as ternary and channel-wise quantization.
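The two components named above can be illustrated with a small sketch: sign quantization with one global scale, applied only on a fixed sparsity mask. The choice of the scale as the mean absolute weight on the mask support is an assumption borrowed from common binary-network practice, and the names `binarize_with_mask`, `W`, and `mask` are hypothetical; the abstract does not specify how PIBiNN computes its scale.

```python
def binarize_with_mask(W, mask):
    """Quantize W to {-alpha, 0, +alpha} using a fixed sparsity mask.

    alpha is a single global scale, here assumed to be the mean absolute
    value of the weights that lie on the mask support.
    """
    support = [abs(w)
               for w_row, m_row in zip(W, mask)
               for w, m in zip(w_row, m_row) if m]
    alpha = sum(support) / len(support)
    Wq = [
        [(alpha if w >= 0 else -alpha) if m else 0.0
         for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(W, mask)
    ]
    return alpha, Wq


# Toy weights and a mask standing in for physics-defined structural zeros.
W = [[0.5, -1.5, 0.2],
     [0.0, 2.0, -0.3]]
mask = [[1, 1, 0],
        [0, 1, 1]]

alpha, Wq = binarize_with_mask(W, mask)

# One sign bit is stored per unmasked weight; masked entries cost nothing
# if the mask is known in advance, so the average falls below one bit.
bits_per_weight = sum(map(sum, mask)) / sum(len(row) for row in mask)
```

In this toy case four of six weights carry a sign bit, giving two thirds of a bit per weight, which is the sense in which structural zeros push the rate below one bit.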
LG Oct 14, 2024
Data-Aware Training Quality Monitoring and Certification for Reliable Deep Learning
Farhang Yeganegi, Arian Eamaz, Mojtaba Soltanalian
Deep learning models excel at capturing complex representations through sequential layers of linear and non-linear transformations, yet their inherent black-box nature and multi-modal training landscape raise critical concerns about reliability, robustness, and safety, particularly in high-stakes applications. To address these challenges, we introduce YES training bounds, a novel framework for real-time, data-aware certification and monitoring of neural network training. The YES bounds evaluate the efficiency of data utilization and optimization dynamics, providing an effective tool for assessing progress and detecting suboptimal behavior during training. Our experiments show that the YES bounds offer insights beyond conventional local optimization perspectives, such as identifying when training losses plateau in suboptimal regions. Validated on both synthetic and real data, including image denoising tasks, the bounds prove effective in certifying training quality and guiding adjustments to enhance model performance. By integrating these bounds into a color-coded cloud-based monitoring system, we offer a powerful tool for real-time evaluation, setting a new standard for training quality assurance in deep learning.
SP Feb 5, 2021
LoRD-Net: Unfolded Deep Detection Network with Low-Resolution Receivers
Shahin Khobahi, Nir Shlezinger, Mojtaba Soltanalian et al.
The need to recover high-dimensional signals from their noisy low-resolution quantized measurements is widely encountered in communications and sensing. In this paper, we focus on the extreme case of one-bit quantizers, and propose a deep detector entitled LoRD-Net for recovering information symbols from one-bit measurements. Our method is a model-aware data-driven architecture based on deep unfolding of first-order optimization iterations. LoRD-Net has a task-based architecture dedicated to recovering the underlying signal of interest from the one-bit noisy measurements without requiring prior knowledge of the channel matrix through which the one-bit measurements are obtained. The proposed deep detector has far fewer parameters compared to black-box deep networks due to the incorporation of domain knowledge in the design of its architecture, allowing it to operate in a data-driven fashion while benefiting from the flexibility, versatility, and reliability of model-based optimization methods. LoRD-Net operates in a blind fashion, which requires both addressing the non-linear nature of the data-acquisition system and identifying a proper optimization objective for signal recovery. Accordingly, we propose a two-stage training method for LoRD-Net, in which the first stage is dedicated to identifying the proper form of the optimization process to unfold, while the second trains the resulting model in an end-to-end manner. We numerically evaluate the proposed receiver architecture for one-bit signal recovery in wireless communications and demonstrate that the proposed hybrid methodology outperforms both data-driven and model-based state-of-the-art methods, while utilizing small datasets, on the order of merely $\sim 500$ samples, for training.
ML Dec 21, 2020
Unfolded Algorithms for Deep Phase Retrieval
Naveed Naimipour, Shahin Khobahi, Mojtaba Soltanalian
Phase retrieval has intrigued researchers for decades, due to its appearance in a wide range of applications. The task of a phase retrieval algorithm is typically to recover a signal from linear phaseless measurements. In this paper, we approach the problem by proposing a hybrid model-based data-driven deep architecture, referred to as Unfolded Phase Retrieval (UPR), that exhibits significant potential in improving the performance of state-of-the-art data-driven and model-based phase retrieval algorithms. The proposed method benefits from the versatility and interpretability of well-established model-based algorithms, while simultaneously benefiting from the expressive power of deep neural networks. In particular, our proposed model-based deep architecture is applied to the conventional phase retrieval problem (via the incremental reshaped Wirtinger flow algorithm) and the sparse phase retrieval problem (via the sparse truncated amplitude flow algorithm), showing immense promise in both cases. Furthermore, we consider a joint design of the sensing matrix and the signal processing algorithm and utilize the deep unfolding technique in the process. Our numerical results illustrate the effectiveness of such hybrid model-based and data-driven frameworks and showcase the untapped potential of data-aided methodologies to enhance the existing phase retrieval algorithms.
SP Nov 15, 2020
Deep-RLS: A Model-Inspired Deep Learning Approach to Nonlinear PCA
Zahra Esmaeilbeig, Shahin Khobahi, Mojtaba Soltanalian
In this work, we consider the application of model-based deep learning in nonlinear principal component analysis (PCA). Inspired by the deep unfolding methodology, we propose a task-based deep learning approach, referred to as Deep-RLS, that unfolds the iterations of the well-known recursive least squares (RLS) algorithm into the layers of a deep neural network in order to perform nonlinear PCA. In particular, we formulate the nonlinear PCA for the blind source separation (BSS) problem and show through numerical analysis that Deep-RLS results in a significant improvement in the accuracy of recovering the source signals in BSS when compared to the traditional RLS algorithm.
SP Mar 9, 2020
UPR: A Model-Driven Architecture for Deep Phase Retrieval
Naveed Naimipour, Shahin Khobahi, Mojtaba Soltanalian
The problem of phase retrieval has been intriguing researchers for decades due to its appearance in a wide range of applications. The task of a phase retrieval algorithm is typically to recover a signal from linear phase-less measurements. In this paper, we approach the problem by proposing a hybrid model-based data-driven deep architecture, referred to as the Unfolded Phase Retrieval (UPR), that shows potential in improving the performance of the state-of-the-art phase retrieval algorithms. Specifically, the proposed method benefits from versatility and interpretability of well established model-based algorithms, while simultaneously benefiting from the expressive power of deep neural networks. Our numerical results illustrate the effectiveness of such hybrid deep architectures and showcase the untapped potential of data-aided methodologies to enhance the existing phase retrieval algorithms.
CV Feb 3, 2020
Deep-URL: A Model-Aware Approach To Blind Deconvolution Based On Deep Unfolded Richardson-Lucy Network
Chirag Agarwal, Shahin Khobahi, Arindam Bose et al.
The lack of interpretability in current deep learning models causes serious concerns as they are extensively used for various life-critical applications. Hence, it is of paramount importance to develop interpretable deep learning models. In this paper, we consider the problem of blind deconvolution and propose a novel model-aware deep architecture that allows for the recovery of both the blur kernel and the sharp image from the blurred image. In particular, we propose the Deep Unfolded Richardson-Lucy (Deep-URL) framework -- an interpretable deep-learning architecture that can be seen as an amalgamation of a classical estimation technique and a deep neural network, and consequently leads to improved performance. Our numerical investigations demonstrate significant improvement compared to state-of-the-art algorithms.
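For orientation, the classical Richardson-Lucy iteration that an unfolded network of this kind builds on can be sketched in one dimension. This is the standard non-blind update with a known kernel and circular boundary handling, both simplifying assumptions; it is not the Deep-URL architecture, and the signal, kernel, and function names are hypothetical.

```python
def circ_conv(x, k):
    """Circular convolution: (Kx)_i = sum_j k[j] * x[i - j]."""
    n = len(x)
    return [sum(k[j] * x[(i - j) % n] for j in range(len(k)))
            for i in range(n)]


def circ_corr(r, k):
    """Adjoint of circ_conv: (K^T r)_i = sum_j k[j] * r[i + j]."""
    n = len(r)
    return [sum(k[j] * r[(i + j) % n] for j in range(len(k)))
            for i in range(n)]


def richardson_lucy(y, k, iters=100):
    """Classical multiplicative update x <- x * K^T(y / Kx), k summing to 1."""
    x = [sum(y) / len(y)] * len(y)  # flat, strictly positive start
    for _ in range(iters):
        blurred = circ_conv(x, k)
        ratio = [yi / bi for yi, bi in zip(y, blurred)]
        correction = circ_corr(ratio, k)
        x = [xi * ci for xi, ci in zip(x, correction)]
    return x


# Hypothetical strictly positive signal and a normalized blur kernel.
truth = [1.0, 1.0, 5.0, 1.0, 1.0, 2.0, 1.0, 1.0]
kernel = [0.25, 0.5, 0.25]
y = circ_conv(truth, kernel)  # noiseless blurred observation

x_hat = richardson_lucy(y, kernel)
```

Each update is multiplicative, so a positive start stays positive and the total intensity of the estimate matches that of the observation. In an unfolded design, a fixed number of such iterations would become network layers with learnable components.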
SP Dec 17, 2019
Deep Radar Waveform Design for Efficient Automotive Radar Sensing
Shahin Khobahi, Arindam Bose, Mojtaba Soltanalian
In radar systems, unimodular (or constant-modulus) waveform design plays an important role in achieving better clutter/interference rejection, as well as a more accurate estimation of the target parameters. The design of such sequences has been studied widely in the last few decades, with most design algorithms requiring sophisticated a priori knowledge of environmental parameters which may be difficult to obtain in real-time scenarios. In this paper, we propose a novel hybrid model-driven and data-driven architecture that adapts to the ever changing environment and allows for adaptive unimodular waveform design. In particular, the approach lays the groundwork for developing extremely low-cost waveform design and processing frameworks for radar systems deployed in autonomous vehicles. The proposed model-based deep architecture imitates a well-known unimodular signal design algorithm in its structure, and can quickly infer statistical information from the environment using the observed data. Our numerical experiments portray the advantages of using the proposed method for efficient radar waveform design in time-varying environments.
LG Dec 10, 2019
Deep One-bit Compressive Autoencoding
Shahin Khobahi, Arindam Bose, Mojtaba Soltanalian
Parameterized mathematical models play a central role in the understanding and design of complex information systems. However, they often cannot take into account the intricate interactions innate to such systems. In contrast, purely data-driven approaches do not need explicit mathematical models for data generation and have a wider applicability at the cost of interpretability. In this paper, we consider the design of a one-bit compressive autoencoder, and propose a novel hybrid model-based and data-driven methodology that allows us not only to design the sensing matrix for one-bit data acquisition, but also to learn the latent parameters of an iterative optimization algorithm specifically designed for the problem of one-bit sparse signal recovery. Our results demonstrate a significant improvement compared to state-of-the-art model-based algorithms.
SP Nov 27, 2019
Model-Aware Deep Architectures for One-Bit Compressive Variational Autoencoding
Shahin Khobahi, Mojtaba Soltanalian
Parameterized mathematical models play a central role in the understanding and design of complex information systems. However, they often cannot take into account the intricate interactions innate to such systems. In contrast, purely data-driven approaches do not need explicit mathematical models for data generation and have a wider applicability at the cost of interpretability. In this paper, we consider the design of a one-bit compressive variational autoencoder, and propose a novel hybrid model-based and data-driven methodology that allows us not only to design the sensing matrix and the quantization thresholds for one-bit data acquisition, but also to learn the latent parameters of iterative optimization algorithms specifically designed for the problem of one-bit sparse signal recovery. In addition, the proposed method has the ability to adaptively learn the proper quantization thresholds, paving the way for amplitude recovery in one-bit compressive sensing. Our results demonstrate a significant improvement compared to state-of-the-art model-based algorithms.
IR Jun 6, 2019
Comprehensive Personalized Ranking Using One-Bit Comparison Data
Aria Ameri, Arindam Bose, Mojtaba Soltanalian
The task of a personalization system is to recommend items or a set of items according to the users' taste, and thus predicting their future needs. In this paper, we address such personalized recommendation problems for which one-bit comparison data of user preferences for different items as well as the different user inclinations toward an item are available. We devise a comprehensive personalized ranking (CPR) system by employing a Bayesian treatment. We also provide a connection to the learning method with respect to the CPR optimization criterion to learn the underlying low-rank structure of the rating matrix based on the well-established matrix factorization method. Numerical results are provided to verify the performance of our algorithm.
SP Nov 30, 2018
Deep Signal Recovery with One-Bit Quantization
Shahin Khobahi, Naveed Naimipour, Mojtaba Soltanalian et al.
Machine learning, and more specifically deep learning, have shown remarkable performance in sensing, communications, and inference. In this paper, we consider the application of the deep unfolding technique in the problem of signal reconstruction from its one-bit noisy measurements. Namely, we propose a model-based machine learning method and unfold the iterations of an inference optimization algorithm into the layers of a deep neural network for one-bit signal recovery. The resulting network, which we refer to as DeepRec, can efficiently handle the recovery of high-dimensional signals from acquired one-bit noisy measurements. The proposed method results in an improvement in accuracy and computational efficiency with respect to the original framework as shown through numerical analysis.