Aswin C. Sankaranarayanan

CV
h-index13
23papers
918citations
Novelty56%
AI Score43

23 Papers

CVApr 11, 2022
Single-Photon Structured Light

Varun Sundar, Sizhuo Ma, Aswin C. Sankaranarayanan et al. · cmu

We present a novel structured light technique that uses Single Photon Avalanche Diode (SPAD) arrays to enable 3D scanning at high-frame rates and low-light levels. This technique, called "Single-Photon Structured Light", works by sensing binary images that indicates the presence or absence of photon arrivals during each exposure; the SPAD array is used in conjunction with a high-speed binary projector, with both devices operated at speeds as high as 20~kHz. The binary images that we acquire are heavily influenced by photon noise and are easily corrupted by ambient sources of light. To address this, we develop novel temporal sequences using error correction codes that are designed to be robust to short-range effects like projector and camera defocus as well as resolution mismatch between the two devices. Our lab prototype is capable of 3D imaging in challenging scenarios involving objects with extremely low albedo or undergoing fast motion, as well as scenes under strong ambient illumination.

CVSep 18, 2024
Precise Forecasting of Sky Images Using Spatial Warping

Leron Julian, Aswin C. Sankaranarayanan · cmu

The intermittency of solar power, due to occlusion from cloud cover, is one of the key factors inhibiting its widespread use in both commercial and residential settings. Hence, real-time forecasting of solar irradiance for grid-connected photovoltaic systems is necessary to schedule and allocate resources across the grid. Ground-based imagers that capture wide field-of-view images of the sky are commonly used to monitor cloud movement around a particular site in an effort to forecast solar irradiance. However, these wide FOV imagers capture a distorted image of sky image, where regions near the horizon are heavily compressed. This hinders the ability to precisely predict cloud motion near the horizon which especially affects prediction over longer time horizons. In this work, we combat the aforementioned constraint by introducing a deep learning method to predict a future sky image frame with higher resolution than previous methods. Our main contribution is to derive an optimal warping method to counter the adverse affects of clouds at the horizon, and learn a framework for future sky image prediction which better determines cloud evolution for longer time horizons.

CVJun 28, 2023
Angle Sensitive Pixels for Lensless Imaging on Spherical Sensors

Yi Hua, Yongyi Zhao, Aswin C. Sankaranarayanan · cmu

We propose OrbCam, a lensless architecture for imaging with spherical sensors. Prior work in lensless imager techniques have focused largely on using planar sensors; for such designs, it is important to use a modulation element, e.g. amplitude or phase masks, to construct a invertible imaging system. In contrast, we show that the diversity of pixel orientations on a curved surface is sufficient to improve the conditioning of the mapping between the scene and the sensor. Hence, when imaging on a spherical sensor, all pixels can have the same angular response function such that the lensless imager is comprised of pixels that are identical to each other and differ only in their orientations. We provide the computational tools for the design of the angular response of the pixels in a spherical sensor that leads to well-conditioned and noise-robust measurements. We validate our design in both simulation and a lab prototype. The implications of our design is that the lensless imaging can be enabled easily for curved and flexible surfaces thereby opening up a new set of application domains.

CVSep 18, 2024
Computational Imaging for Long-Term Prediction of Solar Irradiance

Leron Julian, Haejoon Lee, Soummya Kar et al.

The occlusion of the sun by clouds is one of the primary sources of uncertainties in solar power generation, and is a factor that affects the wide-spread use of solar power as a primary energy source. Real-time forecasting of cloud movement and, as a result, solar irradiance is necessary to schedule and allocate energy across grid-connected photovoltaic systems. Previous works monitored cloud movement using wide-angle field of view imagery of the sky. However, such images have poor resolution for clouds that appear near the horizon, which reduces their effectiveness for long term prediction of solar occlusion. Specifically, to be able to predict occlusion of the sun over long time periods, clouds that are near the horizon need to be detected, and their velocities estimated precisely. To enable such a system, we design and deploy a catadioptric system that delivers wide-angle imagery with uniform spatial resolution of the sky over its field of view. To enable prediction over a longer time horizon, we design an algorithm that uses carefully selected spatio-temporal slices of the imagery using estimated wind direction and velocity as inputs. Using ray-tracing simulations as well as a real testbed deployed outdoors, we show that the system is capable of predicting solar occlusion as well as irradiance for tens of minutes in the future, which is an order of magnitude improvement over prior work.

CVSep 12, 2025
Ordinality of Visible-Thermal Image Intensities for Intrinsic Image Decomposition

Zeqing Leo Yuan, Mani Ramanagopal, Aswin C. Sankaranarayanan et al.

Decomposing an image into its intrinsic photometric factors--shading and reflectance--is a long-standing challenge due to the lack of extensive ground-truth data for real-world scenes. Recent methods rely on synthetic data or sparse annotations for limited indoor and even fewer outdoor scenes. We introduce a novel training-free approach for intrinsic image decomposition using only a pair of visible and thermal images. We leverage the principle that light not reflected from an opaque surface is absorbed and detected as heat by a thermal camera. This allows us to relate the ordinalities between visible and thermal image intensities to the ordinalities of shading and reflectance, which can densely self-supervise an optimizing neural network to recover shading and reflectance. We perform quantitative evaluations with known reflectance and shading under natural and artificial lighting, and qualitative experiments across diverse outdoor scenes. The results demonstrate superior performance over recent learning-based models and point toward a scalable path to curating real-world ordinal supervision, previously infeasible via manual labeling.

CVJul 5, 2025
Towards Spatially-Varying Gain and Binning

Anqi Yang, Eunhee Kang, Wei Chen et al.

Pixels in image sensors have progressively become smaller, driven by the goal of producing higher-resolution imagery. However, ceteris paribus, a smaller pixel accumulates less light, making image quality worse. This interplay of resolution, noise, and the dynamic range of the sensor and their impact on the eventual quality of acquired imagery is a fundamental concept in photography. In this paper, we propose spatially-varying gain and binning to enhance the noise performance and dynamic range of image sensors. First, we show that by varying gain spatially to local scene brightness, the read noise can be made negligible, and the dynamic range of a sensor is expanded by an order of magnitude. Second, we propose a simple analysis to find a binning size that best balances resolution and noise for a given light level; this analysis predicts a spatially-varying binning strategy, again based on local scene brightness, to effectively increase the overall signal-to-noise ratio. % without sacrificing resolution. We discuss analog and digital binning modes and, perhaps surprisingly, show that digital binning outperforms its analog counterparts when a larger gain is allowed. Finally, we demonstrate that combining spatially-varying gain and binning in various applications, including high dynamic range imaging, vignetting, and lens distortion.

CVSep 29, 2021
Programmable Spectral Filter Arrays using Phase Spatial Light Modulator

Vishwanath Saragadam, Vijay Rengarajan, Ryuichi Tadano et al.

Spatially varying spectral modulation can be implemented using a liquid crystal spatial light modulator (SLM) since it provides an array of liquid crystal cells, each of which can be purposed to act as a programmable spectral filter array. However, such an optical setup suffers from strong optical aberrations due to the unintended phase modulation, precluding spectral modulation at high spatial resolutions. In this work, we propose a novel computational approach for the practical implementation of phase SLMs for implementing spatially varying spectral filters. We provide a careful and systematic analysis of the aberrations arising out of phase SLMs for the purposes of spatially varying spectral modulation. The analysis naturally leads us to a set of "good patterns" that minimize the optical aberrations. We then train a deep network that overcomes any residual aberrations, thereby achieving ideal spectral modulation at high spatial resolution. We show a number of unique operating points with our prototype including dynamic spectral filtering, material classification, and single- and multi-image hyperspectral imaging.

IVAug 18, 2021
A Simple Framework for 3D Lensless Imaging with Programmable Masks

Yucheng Zheng, Yi Hua, Aswin C. Sankaranarayanan et al.

Lensless cameras provide a framework to build thin imaging systems by replacing the lens in a conventional camera with an amplitude or phase mask near the sensor. Existing methods for lensless imaging can recover the depth and intensity of the scene, but they require solving computationally-expensive inverse problems. Furthermore, existing methods struggle to recover dense scenes with large depth variations. In this paper, we propose a lensless imaging system that captures a small number of measurements using different patterns on a programmable mask. In this context, we make three contributions. First, we present a fast recovery algorithm to recover textures on a fixed number of depth planes in the scene. Second, we consider the mask design problem, for programmable lensless cameras, and provide a design template for optimizing the mask patterns with the goal of improving depth estimation. Third, we use a refinement network as a post-processing step to identify and remove artifacts in the reconstruction. These modifications are evaluated extensively with experimental results on a lensless camera prototype to showcase the performance benefits of the optimized masks and recovery algorithms over the state of the art.

IVMay 2, 2020
Towards Occlusion-Aware Multifocal Displays

Jen-Hao Rick Chang, Anat Levin, B. V. K. Vijaya Kumar et al.

The human visual system uses numerous cues for depth perception, including disparity, accommodation, motion parallax and occlusion. It is incumbent upon virtual-reality displays to satisfy these cues to provide an immersive user experience. Multifocal displays, one of the classic approaches to satisfy the accommodation cue, place virtual content at multiple focal planes, each at a di erent depth. However, the content on focal planes close to the eye do not occlude those farther away; this deteriorates the occlusion cue as well as reduces contrast at depth discontinuities due to leakage of the defocus blur. This paper enables occlusion-aware multifocal displays using a novel ConeTilt operator that provides an additional degree of freedom -- tilting the light cone emitted at each pixel of the display panel. We show that, for scenes with relatively simple occlusion con gurations, tilting the light cones provides the same e ect as physical occlusion. We demonstrate that ConeTilt can be easily implemented by a phase-only spatial light modulator. Using a lab prototype, we show results that demonstrate the presence of occlusion cues and the increased contrast of the display at depth edges.

CVDec 11, 2019
Photosequencing of Motion Blur using Short and Long Exposures

Vijay Rengarajan, Shuo Zhao, Ruiwen Zhen et al.

Photosequencing aims to transform a motion blurred image to a sequence of sharp images. This problem is challenging due to the inherent ambiguities in temporal ordering as well as the recovery of lost spatial textures due to blur. Adopting a computational photography approach, we propose to capture two short exposure images, along with the original blurred long exposure image to aid in the aforementioned challenges. Post-capture, we recover the sharp photosequence using a novel blur decomposition strategy that recursively splits the long exposure image into smaller exposure intervals. We validate the approach by capturing a variety of scenes with interesting motions using machine vision cameras programmed to capture short and long exposure sequences. Our experimental results show that the proposed method resolves both fast and fine motions better than prior works.

IVMay 13, 2019
Programmable Spectrometry -- Per-pixel Classification of Materials using Learned Spectral Filters

Vishwanath Saragadam, Aswin C. Sankaranarayanan

Many materials have distinct spectral profiles. This facilitates estimation of the material composition of a scene at each pixel by first acquiring its hyperspectral image, and subsequently filtering it using a bank of spectral profiles. This process is inherently wasteful since only a set of linear projections of the acquired measurements contribute to the classification task. We propose a novel programmable camera that is capable of producing images of a scene with an arbitrary spectral filter. We use this camera to optically implement the spectral filtering of the scene's hyperspectral image with the bank of spectral profiles needed to perform per-pixel material classification. This provides gains both in terms of acquisition speed --- since only the relevant measurements are acquired --- and in signal-to-noise ratio --- since we invariably avoid narrowband filters that are light inefficient. Given training data, we use a range of classical and modern techniques including SVMs and neural networks to identify the bank of spectral profiles that facilitate material classification. We verify the method in simulations on standard datasets as well as real data using a lab prototype of the camera.

CVNov 29, 2018
Learning to Separate Multiple Illuminants in a Single Image

Zhuo Hui, Ayan Chakrabarti, Kalyan Sunkavalli et al.

We present a method to separate a single image captured under two illuminants, with different spectra, into the two images corresponding to the appearance of the scene under each individual illuminant. We do this by training a deep neural network to predict the per-pixel reflectance chromaticity of the scene, which we use in conjunction with a previous flash/no-flash image-based separation algorithm to produce the final two output images. We design our reflectance chromaticity network and loss functions by incorporating intuitions from the physics of image formation. We show that this leads to significantly better performance than other single image techniques and even approaches the quality of the two image separation method.

CVMay 27, 2018
Towards Multifocal Displays with Dense Focal Stacks

Jen-Hao Rick Chang, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan

We present a virtual reality display that is capable of generating a dense collection of depth/focal planes. This is achieved by driving a focus-tunable lens to sweep a range of focal lengths at a high frequency and, subsequently, tracking the focal length precisely at microsecond time resolutions using an optical module. Precise tracking of the focal length, coupled with a high-speed display, enables our lab prototype to generate 1600 focal planes per second. This enables a novel first-of-its-kind virtual reality multifocal display that is capable of resolving the vergence-accommodation conflict endemic to today's displays.

IVJan 26, 2018
KRISM --- Krylov Subspace-based Optical Computing of Hyperspectral Images

Vishwanath Saragadam, Aswin C. Sankaranarayanan

We present an adaptive imaging technique that optically computes a low-rank approximation of a scene's hyperspectral image, conceptualized as a matrix. Central to the proposed technique is the optical implementation of two measurement operators: a spectrally-coded imager and a spatially-coded spectrometer. By iterating between the two operators, we show that the top singular vectors and singular values of a hyperspectral image can be adaptively and optically computed with only a few iterations. We present an optical design that uses pupil plane coding for implementing the two operations and show several compelling results using a lab prototype to demonstrate the effectiveness of the proposed hyperspectral imager.

CVApr 19, 2017
Illuminant Spectra-based Source Separation Using Flash Photography

Zhuo Hui, Kalyan Sunkavalli, Sunil Hadap et al.

Real-world lighting often consists of multiple illuminants with different spectra. Separating and manipulating these illuminants in post-process is a challenging problem that requires either significant manual input or calibrated scene geometry and lighting. In this work, we leverage a flash/no-flash image pair to analyze and edit scene illuminants based on their spectral differences. We derive a novel physics-based relationship between color variations in the observed flash/no-flash intensities and the spectra and surface shading corresponding to individual scene illuminants. Our technique uses this constraint to automatically separate an image into constituent images lit by each illuminant. This separation can be used to support applications like white balancing, lighting editing, and RGB photometric stereo, where we demonstrate results that outperform state-of-the-art techniques on a wide range of images.

CVMar 29, 2017
One Network to Solve Them All --- Solving Linear Inverse Problems using Deep Projection Models

J. H. Rick Chang, Chun-Liang Li, Barnabas Poczos et al.

While deep learning methods have achieved state-of-the-art performance in many challenging inverse problems like image inpainting and super-resolution, they invariably involve problem-specific training of the networks. Under this approach, different problems require different networks. In scenarios where we need to solve a wide variety of problems, e.g., on a mobile camera, it is inefficient and costly to use these specially-trained networks. On the other hand, traditional methods using signal priors can be used in all linear inverse problems but often have worse performance on challenging tasks. In this work, we provide a middle ground between the two kinds of methods --- we propose a general framework to train a single deep neural network that solves arbitrary linear inverse problems. The proposed network acts as a proximal operator for an optimization algorithm and projects non-image signals onto the set of natural images defined by the decision boundary of a classifier. In our experiments, the proposed framework demonstrates superior performance over traditional methods using a wavelet sparsity prior and achieves comparable performance of specially-trained networks on tasks including compressive sensing and pixel-wise inpainting.

CVApr 16, 2015
FPA-CS: Focal Plane Array-based Compressive Imaging in Short-wave Infrared

Huaijin Chen, M. Salman Asif, Aswin C. Sankaranarayanan et al.

Cameras for imaging in short and mid-wave infrared spectra are significantly more expensive than their counterparts in visible imaging. As a result, high-resolution imaging in those spectrum remains beyond the reach of most consumers. Over the last decade, compressive sensing (CS) has emerged as a potential means to realize inexpensive short-wave infrared cameras. One approach for doing this is the single-pixel camera (SPC) where a single detector acquires coded measurements of a high-resolution image. A computational reconstruction algorithm is then used to recover the image from these coded measurements. Unfortunately, the measurement rate of a SPC is insufficient to enable imaging at high spatial and temporal resolutions. We present a focal plane array-based compressive sensing (FPA-CS) architecture that achieves high spatial and temporal resolutions. The idea is to use an array of SPCs that sense in parallel to increase the measurement rate, and consequently, the achievable spatio-temporal resolution of the camera. We develop a proof-of-concept prototype in the short-wave infrared using a sensor with 64$\times$ 64 pixels; the prototype provides a 4096$\times$ increase in the measurement rate compared to the SPC and achieves a megapixel resolution at video rate using CS techniques.

CVMar 14, 2015
LiSens --- A Scalable Architecture for Video Compressive Sensing

Jian Wang, Mohit Gupta, Aswin C. Sankaranarayanan

The measurement rate of cameras that take spatially multiplexed measurements by using spatial light modulators (SLM) is often limited by the switching speed of the SLMs. This is especially true for single-pixel cameras where the photodetector operates at a rate that is many orders-of-magnitude greater than the SLM. We study the factors that determine the measurement rate for such spatial multiplexing cameras (SMC) and show that increasing the number of pixels in the device improves the measurement rate, but there is an optimum number of pixels (typically, few thousands) beyond which the measurement rate does not increase. This motivates the design of LiSens, a novel imaging architecture, that replaces the photodetector in the single-pixel camera with a 1D linear array or a line-sensor. We illustrate the optical architecture underlying LiSens, build a prototype, and demonstrate results of a range of indoor and outdoor scenes. LiSens delivers on the promise of SMCs: imaging at a megapixel resolution, at video rate, using an inexpensive low-resolution sensor.

CVMar 14, 2015
A Dictionary-based Approach for Estimating Shape and Spatially-Varying Reflectance

Zhuo Hui, Aswin C. Sankaranarayanan

We present a technique for estimating the shape and reflectance of an object in terms of its surface normals and spatially-varying BRDF. We assume that multiple images of the object are obtained under fixed view-point and varying illumination, i.e, the setting of photometric stereo. Assuming that the BRDF at each pixel lies in the non-negative span of a known BRDF dictionary, we derive a per-pixel surface normal and BRDF estimation framework that requires neither iterative optimization techniques nor careful initialization, both of which are endemic to most state-of-the-art techniques. We showcase the performance of our technique on a wide range of simulated and real scenes where we outperform competing methods.

OCMar 11, 2015
Adaptive-Rate Sparse Signal Reconstruction With Application in Compressive Background Subtraction

Joao F. C. Mota, Nikos Deligiannis, Aswin C. Sankaranarayanan et al.

We propose and analyze an online algorithm for reconstructing a sequence of signals from a limited number of linear measurements. The signals are assumed sparse, with unknown support, and evolve over time according to a generic nonlinear dynamical model. Our algorithm, based on recent theoretical results for $\ell_1$-$\ell_1$ minimization, is recursive and computes the number of measurements to be taken at each time on-the-fly. As an example, we apply the algorithm to compressive video background subtraction, a problem that can be stated as follows: given a set of measurements of a sequence of images with a static background, simultaneously reconstruct each image while separating its foreground from the background. The performance of our method is illustrated on sequences of real images: we observe that it allows a dramatic reduction in the number of measurements with respect to state-of-the-art compressive background subtraction schemes.

CVMar 9, 2015
Video Compressive Sensing for Spatial Multiplexing Cameras using Motion-Flow Models

Aswin C. Sankaranarayanan, Lina Xu, Christoph Studer et al.

Spatial multiplexing cameras (SMCs) acquire a (typically static) scene through a series of coded projections using a spatial light modulator (e.g., a digital micro-mirror device) and a few optical sensors. This approach finds use in imaging applications where full-frame sensors are either too expensive (e.g., for short-wave infrared wavelengths) or unavailable. Existing SMC systems reconstruct static scenes using techniques from compressive sensing (CS). For videos, however, existing acquisition and recovery methods deliver poor quality. In this paper, we propose the CS multi-scale video (CS-MUVI) sensing and recovery framework for high-quality video acquisition and recovery using SMCs. Our framework features novel sensing matrices that enable the efficient computation of a low-resolution video preview, while enabling high-resolution video recovery using convex optimization. To further improve the quality of the reconstructed videos, we extract optical-flow estimates from the low-resolution previews and impose them as constraints in the recovery procedure. We demonstrate the efficacy of our CS-MUVI framework for a host of synthetic and real measured SMC video data, and we show that high-quality videos can be recovered at roughly $60\times$ compression.

CVJan 30, 2014
Video Compressive Sensing for Dynamic MRI

Jianing V. Shi, Wotao Yin, Aswin C. Sankaranarayanan et al.

We present a video compressive sensing framework, termed kt-CSLDS, to accelerate the image acquisition process of dynamic magnetic resonance imaging (MRI). We are inspired by a state-of-the-art model for video compressive sensing that utilizes a linear dynamical system (LDS) to model the motion manifold. Given compressive measurements, the state sequence of an LDS can be first estimated using system identification techniques. We then reconstruct the observation matrix using a joint structured sparsity assumption. In particular, we minimize an objective function with a mixture of wavelet sparsity and joint sparsity within the observation matrix. We derive an efficient convex optimization algorithm through alternating direction method of multipliers (ADMM), and provide a theoretical guarantee for global convergence. We demonstrate the performance of our approach for video compressive sensing, in terms of reconstruction accuracy. We also investigate the impact of various sampling strategies. We apply this framework to accelerate the acquisition process of dynamic MRI and show it achieves the best reconstruction accuracy with the least computational time compared with existing algorithms in the literature.

LGMar 19, 2013
Greedy Feature Selection for Subspace Clustering

Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

Unions of subspaces provide a powerful generalization to linear subspace models for collections of high-dimensional data. To learn a union of subspaces from a collection of data, sets of signals in the collection that belong to the same subspace must be identified in order to obtain accurate estimates of the subspace structures present in the data. Recently, sparse recovery methods have been shown to provide a provable and robust strategy for exact feature selection (EFS)--recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with L1-minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and the gap between the two approaches is particularly pronounced when the sampling of subspaces in the dataset is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble.