Tao Sun

h-index11

9papers

234citations

Novelty58%

AI Score32

Ranked #126,112 of 194,257 authors (top 65%)#41,797 in CV (top 71%)

9 Papers

10.8NEApr 20, 2023

Efficient Uncertainty Estimation in Spiking Neural Networks via MC-dropout

Tao Sun, Bojian Yin, Sander Bohte

Spiking neural networks (SNNs) have gained attention as models of sparse and event-driven communication of biological neurons, and as such have shown increasing promise for energy-efficient applications in neuromorphic hardware. As with classical artificial neural networks (ANNs), predictive uncertainties are important for decision making in high-stakes applications, such as autonomous vehicles, medical diagnosis, and high frequency trading. Yet, discussion of uncertainty estimation in SNNs is limited, and approaches for uncertainty estimation in artificial neural networks (ANNs) are not directly applicable to SNNs. Here, we propose an efficient Monte Carlo(MC)-dropout based approach for uncertainty estimation in SNNs. Our approach exploits the time-step mechanism of SNNs to enable MC-dropout in a computationally efficient manner, without introducing significant overheads during training and inference while demonstrating high accuracy and uncertainty quality.

9.6NEFeb 14, 2023

Hybrid Spiking Neural Network Fine-tuning for Hippocampus Segmentation

Ye Yue, Marc Baltes, Nidal Abujahar et al.

Over the past decade, artificial neural networks (ANNs) have made tremendous advances, in part due to the increased availability of annotated data. However, ANNs typically require significant power and memory consumptions to reach their full potential. Spiking neural networks (SNNs) have recently emerged as a low-power alternative to ANNs due to their sparsity nature. SNN, however, are not as easy to train as ANNs. In this work, we propose a hybrid SNN training scheme and apply it to segment human hippocampi from magnetic resonance images. Our approach takes ANN-SNN conversion as an initialization step and relies on spike-based backpropagation to fine-tune the network. Compared with the conversion and direct training solutions, our method has advantages in both segmentation accuracy and training efficiency. Experiments demonstrate the effectiveness of our model in achieving the design goals.

9.7SDAug 14, 2024

DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement

Tao Sun, Sander Bohté

Speech enhancement (SE) improves communication in noisy environments, affecting areas such as automatic speech recognition, hearing aids, and telecommunications. With these domains typically being power-constrained and event-based while requiring low latency, neuromorphic algorithms in the form of spiking neural networks (SNNs) have great potential. Yet, current effective SNN solutions require a contextual sampling window imposing substantial latency, typically around 32ms, too long for many applications. Inspired by Dual-Path Spiking Neural Networks (DPSNNs) in classical neural networks, we develop a two-phase time-domain streaming SNN framework -- the Dual-Path Spiking Neural Network (DPSNN). In the DPSNN, the first phase uses Spiking Convolutional Neural Networks (SCNNs) to capture global contextual information, while the second phase uses Spiking Recurrent Neural Networks (SRNNs) to focus on frequency-related features. In addition, the regularizer suppresses activation to further enhance energy efficiency of our DPSNNs. Evaluating on the VCTK and Intel DNS Datasets, we demonstrate that our approach achieves the very low latency (approximately 5ms) required for applications like hearing aids, while demonstrating excellent signal-to-noise ratio (SNR), perceptual quality, and energy efficiency.

4.6LGNov 29, 2024

Average-Over-Time Spiking Neural Networks for Uncertainty Estimation in Regression

Tao Sun, Sander Bohté

Uncertainty estimation is a standard tool to quantify the reliability of modern deep learning models, and crucial for many real-world applications. However, efficient uncertainty estimation methods for spiking neural networks, particularly for regression models, have been lacking. Here, we introduce two methods that adapt the Average-Over-Time Spiking Neural Network (AOT-SNN) framework to regression tasks, enhancing uncertainty estimation in event-driven models. The first method uses the heteroscedastic Gaussian approach, where SNNs predict both the mean and variance at each time step, thereby generating a conditional probability distribution of the target variable. The second method leverages the Regression-as-Classification (RAC) approach, reformulating regression as a classification problem to facilitate uncertainty estimation. We evaluate our approaches on both a toy dataset and several benchmark datasets, demonstrating that the proposed AOT-SNN models achieve performance comparable to or better than state-of-the-art deep neural network methods, particularly in uncertainty estimation. Our findings highlight the potential of SNNs for uncertainty estimation in regression tasks, providing an efficient and biologically inspired alternative for applications requiring both accuracy and energy efficiency.

3.9CVMay 23, 2023

Temporal Contrastive Learning for Spiking Neural Networks

Haonan Qiu, Zeyin Song, Yanqi Chen et al.

Biologically inspired spiking neural networks (SNNs) have garnered considerable attention due to their low-energy consumption and spatio-temporal information processing capabilities. Most existing SNNs training methods first integrate output information across time steps, then adopt the cross-entropy (CE) loss to supervise the prediction of the average representations. However, in this work, we find the method above is not ideal for the SNNs training as it omits the temporal dynamics of SNNs and degrades the performance quickly with the decrease of inference time steps. One tempting method to model temporal correlations is to apply the same label supervision at each time step and treat them identically. Although it can acquire relatively consistent performance across various time steps, it still faces challenges in obtaining SNNs with high performance. Inspired by these observations, we propose Temporal-domain supervised Contrastive Learning (TCL) framework, a novel method to obtain SNNs with low latency and high performance by incorporating contrastive supervision with temporal domain information. Contrastive learning (CL) prompts the network to discern both consistency and variability in the representation space, enabling it to better learn discriminative and generalizable features. We extend this concept to the temporal domain of SNNs, allowing us to flexibly and fully leverage the correlation between representations at different time steps. Furthermore, we propose a Siamese Temporal-domain supervised Contrastive Learning (STCL) framework to enhance the SNNs via augmentation, temporal and class constraints simultaneously. Extensive experimental results demonstrate that SNNs trained by our TCL and STCL can achieve both high performance and low latency, achieving state-of-the-art performance on a variety of datasets (e.g., CIFAR-10, CIFAR-100, and DVS-CIFAR10).

25.2LGJun 12, 2020

FedGAN: Federated Generative Adversarial Networks for Distributed Data

Mohammad Rasouli, Tao Sun, Ram Rajagopal

We propose Federated Generative Adversarial Network (FedGAN) for training a GAN across distributed sources of non-independent-and-identically-distributed data sources subject to communication and privacy constraints. Our algorithm uses local generators and discriminators which are periodically synced via an intermediary that averages and broadcasts the generator and discriminator parameters. We theoretically prove the convergence of FedGAN with both equal and two time-scale updates of generator and discriminator, under standard assumptions, using stochastic approximations and communication efficient stochastic gradient descents. We experiment FedGAN on toy examples (2D system, mixed Gaussian, and Swiss role), image datasets (MNIST, CIFAR-10, and CelebA), and time series datasets (household electricity consumption and electric vehicle charging sessions). We show FedGAN converges and has similar performance to general distributed GAN, while reduces communication complexity. We also show its robustness to reduced communications.

6.5SDJul 27, 2019

Dilated FCN: Listening Longer to Hear Better

Shuyu Gong, Zhewei Wang, Tao Sun et al.

Deep neural network solutions have emerged as a new and powerful paradigm for speech enhancement (SE). The capabilities to capture long context and extract multi-scale patterns are crucial to design effective SE networks. Such capabilities, however, are often in conflict with the goal of maintaining compact networks to ensure good system generalization. In this paper, we explore dilation operations and apply them to fully convolutional networks (FCNs) to address this issue. Dilations equip the networks with greatly expanded receptive fields, without increasing the number of parameters. Different strategies to fuse multi-scale dilations, as well as to install the dilation modules are explored in this work. Using Noisy VCTK and AzBio sentences datasets, we demonstrate that the proposed dilation models significantly improve over the baseline FCN and outperform the state-of-the-art SE solutions.

1.8CVJan 9, 2019

TraceCaps: A Capsule-based Neural Network for Semantic Segmentation

Tao Sun, Zhewei Wang, C. D. Smith et al.

In this paper, we propose a capsule-based neural network model to solve the semantic segmentation problem. By taking advantage of the extractable part-whole dependencies available in capsule layers, we derive the probabilities of the class labels for individual capsules through a recursive, layer-by-layer procedure. We model this procedure as a traceback pipeline and take it as a central piece to build an end-to-end segmentation network. Under the proposed framework, image-level class labels and object boundaries are jointly sought in an explicit manner, which poses a significant advantage over the state-of-the-art fully convolutional network (FCN) solutions. With the capability to extracted part-whole information, our traceback pipeline can potentially be utilized as the building blocks to design interpretable neural networks. Experiments conducted on modified MNIST and neuroimages demonstrate that our model considerably enhance the segmentation performance compared to the leading FCN variants.

0.9CVNov 27, 2018

Quality-Aware Multimodal Saliency Detection via Deep Reinforcement Learning

Xiao Wang, Tao Sun, Rui Yang et al.

Incorporating various modes of information into the machine learning procedure is becoming a new trend. And data from various source can provide more information than single one no matter they are heterogeneous or homogeneous. Existing deep learning based algorithms usually directly concatenate features from each domain to represent the input data. Seldom of them take the quality of data into consideration which is a key issue in related multimodal problems. In this paper, we propose an efficient quality-aware deep neural network to model the weight of data from each domain using deep reinforcement learning (DRL). Specifically, we take the weighting of each domain as a decision-making problem and teach an agent learn to interact with the environment. The agent can tune the weight of each domain through discrete action selection and obtain a positive reward if the saliency results are improved. The target of the agent is to achieve maximum rewards after finished its sequential action selection. We validate the proposed algorithms on multimodal saliency detection in a coarse-to-fine way. The coarse saliency maps are generated from an encoder-decoder framework which is trained with content loss and adversarial loss. The final results can be obtained via adaptive weighting of maps from each domain. Experiments conducted on two kinds of salient object detection benchmarks validated the effectiveness of our proposed quality-aware deep neural network.