NE Oct 9, 2022
Boost Event-Driven Tactile Learning with Location Spiking Neurons
Peng Kang, Srutarshi Banerjee, Henry Chopp et al.
Tactile sensing is essential for a variety of daily tasks. Recent advances in event-driven tactile sensors and Spiking Neural Networks (SNNs) spur the research in related fields. However, SNN-enabled event-driven tactile learning is still in its infancy due to the limited representation abilities of existing spiking neurons and high spatio-temporal complexity in the event-driven tactile data. In this paper, to improve the representation capability of existing spiking neurons, we propose a novel neuron model called "location spiking neuron", which enables us to extract features of event-based data in a novel way. Specifically, based on the classical Time Spike Response Model (TSRM), we develop the Location Spike Response Model (LSRM). In addition, based on the most commonly used Time Leaky Integrate-and-Fire (TLIF) model, we develop the Location Leaky Integrate-and-Fire (LLIF) model. Moreover, to demonstrate the representation effectiveness of our proposed neurons and capture the complex spatio-temporal dependencies in the event-driven tactile data, we exploit the location spiking neurons to propose two hybrid models for event-driven tactile learning. Specifically, the first hybrid model combines a fully-connected SNN with TSRM neurons and a fully-connected SNN with LSRM neurons. The second hybrid model fuses the spatial spiking graph neural network with TLIF neurons and the temporal spiking graph neural network with LLIF neurons. Extensive experiments demonstrate the significant improvements of our models over the state-of-the-art methods on event-driven tactile learning. Moreover, compared to the counterpart artificial neural networks (ANNs), our SNN models are 10x to 100x more energy-efficient, which shows the superior energy efficiency of our models and may bring new opportunities to the spike-based learning community and neuromorphic engineering.
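As a rough illustration of the idea, the sketch below runs one leaky integrate-and-fire recurrence along either axis of a (location x time) input array. Treating a "location" neuron as the same recurrence swept across locations instead of time steps is our reading of the abstract, not the paper's definition. The function name `lif_along_axis`, the `decay` and `threshold` values, and the hard reset to zero are all illustrative assumptions.

```python
import numpy as np

def lif_along_axis(inputs, axis, decay=0.9, threshold=1.0):
    """Leaky integrate-and-fire recurrence swept along `axis` of a 2-D array.

    Hypothetical helper: axis=1 sweeps over time (a time-style neuron),
    axis=0 sweeps over locations (our assumed location-style neuron).
    """
    # Put the recurrence axis first so the loop below steps along it.
    x = np.moveaxis(np.asarray(inputs, dtype=float), axis, 0)
    v = np.zeros(x.shape[1:])          # one membrane potential per remaining index
    spikes = np.zeros_like(x)
    for i in range(x.shape[0]):
        v = decay * v + x[i]           # leak, then integrate the new input
        fired = v >= threshold
        spikes[i] = fired              # record a spike where the threshold is crossed
        v = np.where(fired, 0.0, v)    # hard reset after a spike (assumption)
    return np.moveaxis(spikes, 0, axis)

# Rows are locations, columns are time steps.
events = [[0.6, 0.6, 0.0],
          [0.6, 0.0, 0.0]]
time_spikes = lif_along_axis(events, axis=1)
location_spikes = lif_along_axis(events, axis=0)
```

With this toy input, the time-wise sweep fires at location 0 on the second time step, while the location-wise sweep fires at location 1 on the first time step, so the two sweeps pick up different structure from the same events.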
NE Jul 23, 2022
Event-Driven Tactile Learning with Location Spiking Neurons
Peng Kang, Srutarshi Banerjee, Henry Chopp et al.
The sense of touch is essential for a variety of daily tasks. New advances in event-based tactile sensors and Spiking Neural Networks (SNNs) spur the research in event-driven tactile learning. However, SNN-enabled event-driven tactile learning is still in its infancy due to the limited representation abilities of existing spiking neurons and high spatio-temporal complexity in the data. In this paper, to improve the representation capabilities of existing spiking neurons, we propose a novel neuron model called "location spiking neuron", which enables us to extract features of event-based data in a novel way. Moreover, based on the classical Time Spike Response Model (TSRM), we develop a specific location spiking neuron model, the Location Spike Response Model (LSRM), which serves as a new building block of SNNs. Furthermore, we propose a hybrid model which combines an SNN with TSRM neurons and an SNN with LSRM neurons to capture the complex spatio-temporal dependencies in the data. Extensive experiments demonstrate the significant improvements of our models over other works on event-driven tactile learning and show the superior energy efficiency of our models and location spiking neurons, which may unlock their potential on neuromorphic hardware.
IV Aug 26, 2024
FCDM: A Physics-Guided Bidirectional Frequency Aware Convolution and Diffusion-Based Model for Sinogram Inpainting
Jiaze E, Srutarshi Banerjee, Tekin Bicer et al.
Computed tomography (CT) is widely used in scientific imaging systems such as synchrotron and laboratory-based nano-CT, but acquiring full-view sinograms requires high radiation dose and long scan times. Sparse-view CT reduces this burden but produces incomplete sinograms with structured signal loss, degrading reconstruction quality. Unlike RGB images, sinograms encode globally coupled projections and exhibit directional spectral patterns, making conventional RGB-oriented inpainting methods, including diffusion models, ineffective because they ignore angular dependencies and physical constraints inherent to tomographic data. We propose FCDM, a diffusion-based framework for sinogram restoration that incorporates bidirectional frequency reasoning, angular-aware masking, and physics-guided regularization to preserve global structure and physical plausibility. Experiments on real-world datasets show that FCDM consistently outperforms existing baselines, achieving over 0.93 SSIM and 31 dB PSNR across diverse sparse-view settings.
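For readers unfamiliar with the PSNR figure quoted above, the following minimal sketch computes it from the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not the authors' evaluation code; the function name `psnr` and the default `data_range=1.0` (sinograms normalized to [0, 1]) are assumptions made for illustration.

```python
import numpy as np

def psnr(reference, restored, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape.

    `data_range` is the assumed maximum possible signal value (MAX).
    """
    reference = np.asarray(reference, dtype=float)
    restored = np.asarray(restored, dtype=float)
    mse = np.mean((reference - restored) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")                      # identical inputs
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] signal gives MSE = 0.01, i.e. 20 dB.
example_db = psnr(np.zeros((4, 4)), np.full((4, 4), 0.1))
```

On this scale, 31 dB corresponds to a root-mean-square error of roughly 2.8% of the data range.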
AI Feb 17
EAA: Automating materials characterization with vision language model agents
Ming Du, Yanqi Luo, Srutarshi Banerjee et al.
We present Experiment Automation Agents (EAA), a vision-language-model-driven agentic system designed to automate complex experimental microscopy workflows. EAA integrates multimodal reasoning, tool-augmented action, and optional long-term memory to support both autonomous procedures and interactive user-guided measurements. Built on a flexible task-manager architecture, the system enables workflows ranging from fully agent-driven automation to logic-defined routines that embed localized LLM queries. EAA further provides a modern tool ecosystem with two-way compatibility for Model Context Protocol (MCP), allowing instrument-control tools to be consumed or served across applications. We demonstrate EAA at an imaging beamline at the Advanced Photon Source, including automated zone plate focusing, natural language-described feature search, and interactive data acquisition. These results illustrate how vision-capable agents can enhance beamline efficiency, reduce operational burden, and lower the expertise barrier for users.
NE Dec 26, 2023
Event-based Shape from Polarization with Spiking Neural Networks
Peng Kang, Srutarshi Banerjee, Henry Chopp et al.
Recent advances in event-based shape determination from polarization offer a transformative approach that tackles the trade-off between speed and accuracy in capturing surface geometries. In this paper, we investigate event-based shape from polarization using Spiking Neural Networks (SNNs), introducing the Single-Timestep and Multi-Timestep Spiking UNets for effective and efficient surface normal estimation. Specifically, the Single-Timestep model processes event-based shape as a non-temporal task, updating the membrane potential of each spiking neuron only once, thereby reducing computational and energy demands. In contrast, the Multi-Timestep model exploits temporal dynamics for enhanced data extraction. Extensive evaluations on synthetic and real-world datasets demonstrate that our models match the performance of state-of-the-art Artificial Neural Networks (ANNs) in estimating surface normals, with the added advantage of superior energy efficiency. Our work not only contributes to the advancement of SNNs in event-based sensing but also sets the stage for future explorations in optimizing SNN architectures, integrating multi-modal data, and scaling for applications on neuromorphic hardware.
CV Jun 10, 2025
HiSin: A Sinogram-Aware Framework for Efficient High-Resolution Inpainting
Jiaze E, Srutarshi Banerjee, Tekin Bicer et al.
High-resolution sinogram inpainting is essential for computed tomography reconstruction, as missing high-frequency projections can lead to visible artifacts and diagnostic errors. Diffusion models are well-suited for this task due to their robustness and detail-preserving capabilities, but their application to high-resolution inputs is limited by excessive memory and computational demands. To address this limitation, we propose HiSin, a novel diffusion-based framework for efficient sinogram inpainting that exploits spectral sparsity and structural heterogeneity of projection data. It progressively extracts global structure at low resolution and defers high-resolution inference to small patches, enabling memory-efficient inpainting. Considering the structural features of sinograms, we incorporate frequency-aware patch skipping and structure-adaptive step allocation to reduce redundant computation. Experimental results show that HiSin reduces peak memory usage by up to 30.81% and inference time by up to 17.58% compared with the state-of-the-art framework, and maintains inpainting accuracy across.
IV May 12, 2021
Removing Blocking Artifacts in Video Streams Using Event Cameras
Henry H. Chopp, Srutarshi Banerjee, Oliver Cossairt et al.
In this paper, we propose EveRestNet, a convolutional neural network designed to remove blocking artifacts in video streams using events from neuromorphic sensors. We first degrade the video frame using a quadtree structure to produce the blocking artifacts, simulating the transmission of a video under heavily constrained bandwidth. Events from the neuromorphic sensor are also simulated, but are transmitted in full. Using the distorted frames and the event stream, EveRestNet is able to improve the image quality.
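To make the quadtree degradation step concrete, here is a minimal sketch of one way such blocking artifacts could be produced: a block is split into four quadrants while its pixel variance is high, and each leaf block is replaced by its mean. The abstract does not state the split criterion, so the variance test, the `var_threshold` and `min_size` parameters, and the function name `quadtree_degrade` are all assumptions.

```python
import numpy as np

def quadtree_degrade(frame, var_threshold=100.0, min_size=2):
    """Flatten a grayscale frame into quadtree leaf blocks of constant value.

    Hypothetical degradation: split on high variance, fill leaves with the mean.
    """
    frame = np.asarray(frame, dtype=float)
    out = np.empty_like(frame)
    min_size = max(min_size, 1)        # guard against endless splitting

    def fill(r0, r1, c0, c1):
        block = frame[r0:r1, c0:c1]
        h, w = r1 - r0, c1 - c0
        # Stop splitting when the block is smooth enough or too small.
        if block.var() <= var_threshold or h <= min_size or w <= min_size:
            out[r0:r1, c0:c1] = block.mean()
            return
        rm, cm = r0 + h // 2, c0 + w // 2
        fill(r0, rm, c0, cm)
        fill(r0, rm, cm, c1)
        fill(rm, r1, c0, cm)
        fill(rm, r1, cm, c1)

    fill(0, frame.shape[0], 0, frame.shape[1])
    return out

# A sharp vertical edge survives; a gentle gradient collapses to one flat block.
edge = np.zeros((4, 4))
edge[:, 2:] = 100.0
edge_out = quadtree_degrade(edge)
gradient_out = quadtree_degrade(np.arange(16.0).reshape(4, 4))
```

Raising `var_threshold` yields fewer, larger blocks, which mimics a tighter bandwidth budget at the cost of stronger blocking.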
CV May 3, 2020
Lossy Event Compression based on Image-derived Quad Trees and Poisson Disk Sampling
Srutarshi Banerjee, Zihao W. Wang, Henry H. Chopp et al.
With several advantages over conventional RGB cameras, event cameras have provided new opportunities for tackling visual tasks under challenging scenarios with fast motion, high dynamic range, and/or power constraints. Yet unlike image/video compression, the performance of event compression algorithms is far from satisfactory or practical. The main challenge for compressing events is the unique event data form, i.e., a stream of asynchronously fired event tuples each encoding the 2D spatial location, timestamp, and polarity (denoting an increase or decrease in brightness). Since events only encode temporal variations, they lack spatial structure which is crucial for compression. To address this problem, we propose a novel event compression algorithm based on a quad tree (QT) segmentation map derived from the adjacent intensity images. The QT informs 2D spatial priority within the 3D space-time volume. In the event encoding step, events are first aggregated over time to form polarity-based event histograms. The histograms are then variably sampled via Poisson Disk Sampling prioritized by the QT-based segmentation map. Next, differential encoding and run length encoding are employed for encoding the spatial and polarity information of the sampled events, respectively, followed by Huffman encoding to produce the final encoded events. Our Poisson Disk Sampling based Lossy Event Compression (PDS-LEC) algorithm performs rate-distortion based optimal allocation. On average, our algorithm achieves greater than 6x compression compared to the state of the art.
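The two lossless steps named above, differential encoding for spatial information and run length encoding for polarity, can be sketched in a few lines. This is a generic illustration rather than the PDS-LEC implementation: the flattening of sampled event locations into sorted integer indices, and the helper names `delta_encode`, `delta_decode`, and `run_length_encode`, are assumptions. The Huffman stage is omitted.

```python
def delta_encode(values):
    """Differential encoding: store each value as its offset from the previous one."""
    prev = 0
    deltas = []
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas

def delta_decode(deltas):
    """Invert delta_encode by accumulating the offsets."""
    total = 0
    values = []
    for d in deltas:
        total += d
        values.append(total)
    return values

def run_length_encode(symbols):
    """Run length encoding: collapse repeats into (symbol, count) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1           # extend the current run
        else:
            runs.append([s, 1])        # start a new run
    return [tuple(r) for r in runs]

# Hypothetical sampled events: sorted pixel indices and their polarities.
pixel_indices = [3, 4, 7, 7, 12]
polarities = [1, 1, -1, -1, -1, 1]
spatial_code = delta_encode(pixel_indices)
polarity_code = run_length_encode(polarities)
```

Sorted indices produce small non-negative deltas and clustered polarities produce long runs, and both give a skewed symbol distribution that a subsequent entropy coder such as Huffman coding can exploit.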