Ayça Özçelikkale

LG
h-index32
15papers
63citations
Novelty42%
AI Score47

15 Papers

SPMar 7, 2022
Estimation under Model Misspecification with Fake Features

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

We consider estimation under model misspecification where there is a model mismatch between the underlying system, which generates the data, and the model used during estimation. We propose a model misspecification framework which enables a joint treatment of the model misspecification types of having fake features as well as incorrect covariance assumptions on the unknowns and the noise. We present a decomposition of the output error into components that relate to different subsets of the model parameters corresponding to underlying, fake and missing features. Here, fake features are features which are included in the model but are not present in the underlying system. Under this framework, we characterize the estimation performance and reveal trade-offs between the number of samples, number of fake features, and the possibly incorrect noise level assumption. In contrast to existing work focusing on incorrect covariance assumptions or missing features, fake features is a central component of our framework. Our results show that fake features can significantly improve the estimation performance, even though they are not correlated with the features in the underlying system. In particular, we show that the estimation error can be decreased by including more fake features in the model, even to the point where the model is overparametrized, i.e., the model contains more unknowns than observations.

MLNov 30, 2022
Continual Learning with Distributed Optimization: Does CoCoA Forget?

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

We focus on the continual learning problem where the tasks arrive sequentially and the aim is to perform well on the newly arrived task without performance degradation on the previously seen tasks. In contrast to the continual learning literature focusing on the centralized setting, we investigate the distributed estimation framework. We consider the well-established distributed learning algorithm COCOA. We derive closed form expressions for the iterations for the overparametrized case. We illustrate the convergence and the error performance of the algorithm based on the over/under-parameterization of the problem. Our results show that depending on the problem dimensions and data generation assumptions, COCOA can perform continual learning over a sequence of tasks, i.e., it can learn a new task without forgetting previously learned tasks, with access only to one task at a time.

LGDec 1, 2025
Delays in Spiking Neural Networks: A State Space Model Approach

Sanja Karilanova, Subhrakanti Dey, Ayça Özçelikkale

Spiking neural networks (SNNs) are biologically inspired, event-driven models that are suitable for processing temporal data and offer energy-efficient computation when implemented on neuromorphic hardware. In SNNs, richer neuronal dynamic allows capturing more complex temporal dependencies, with delays playing a crucial role by allowing past inputs to directly influence present spiking behavior. We propose a general framework for incorporating delays into SNNs through additional state variables. The proposed mechanism enables each neuron to access a finite temporal input history. The framework is agnostic to neuron models and hence can be seamlessly integrated into standard spiking neuron models such as LIF and adLIF. We analyze how the duration of the delays and the learnable parameters associated with them affect the performance. We investigate the trade-offs in the network architecture due to additional state variables introduced by the delay mechanism. Experiments on the Spiking Heidelberg Digits (SHD) dataset show that the proposed mechanism matches the performance of existing delay-based SNNs while remaining computationally efficient. Moreover, the results illustrate that the incorporation of delays may substantially improve performance in smaller networks.

LGDec 1, 2022
Regularization Trade-offs with Fake Features

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Recent successes of massively overparameterized models have inspired a new line of work investigating the underlying conditions that enable overparameterized models to generalize well. This paper considers a framework where the possibly overparametrized model includes fake features, i.e., features that are present in the model but not in the data. We present a non-asymptotic high-probability bound on the generalization error of the ridge regression problem under the model misspecification of having fake features. Our highprobability results provide insights into the interplay between the implicit regularization provided by the fake features and the explicit regularization provided by the ridge parameter. Numerical results illustrate the trade-off between the number of fake features and how the optimal ridge parameter may heavily depend on the number of fake features.

LGMay 14
Time-Varying Deep State Space Models for Sequences with Switching Dynamics

Sanja Karilanova, Subhrakanti Dey, Ayça Özçelikkale

The identification and modeling of time-varying systems is a fundamental challenge in signal processing and system identification. To address this challenge, we propose a class of time-varying state-space model (SSM) based neural networks in which the neurons' states are governed by time-varying dynamics. The proposed model provides the learnable time-varying dynamics through a dictionary of basis functions, where each basis function evolves differently over time. We evaluate the proposed approach on both synthetic data from switching systems and a speech denoising task where real audio is corrupted with switching dynamics noise. The results show that the proposed time-varying model consistently outperforms its time-invariant counterparts while maintaining comparable computational complexity. Our investigations also reveal which aspects of the time-varying dynamics of the data most need to be captured by the proposed time-invariant models, how the additional freedom provided by time-varying basis functions should be allocated across model components, and to what extent larger models can compensate for time-invariant limitations.

LGMay 14
Federated Learning of Spiking Neural Networks under Heterogeneous Temporal Resolutions

Sanja Karilanova, Subhrakanti Dey, Ayça Özçelikkale

Spiking neural networks (SNNs) are biologically inspired energy-efficient models that use sparse binary spike-based communication between neurons, making them attractive for resource-constrained edge devices. Federated learning enables such devices to train collaboratively without sharing raw data. In time-series applications, edge devices often collect data at different time resolutions due to hardware and energy constraints. This temporal heterogeneity poses a fundamental challenge for federated learning: parameters learned at one temporal resolution do not necessarily transfer directly to another, which might result in the naive federated averaging being ineffective. Targeting SNNs and, more broadly, deep networks with stateful neurons, we propose a federated learning framework that addresses this temporal resolution mismatch. We investigate how neuron parameters learned from data at different temporal resolutions and model aggregation should be integrated. We evaluate the proposed framework across two SNN-native benchmark datasets (SHD and DVS-Gesture) under a range of resolution heterogeneity scenarios. Our results show that the proposed adaptation methods can substantially recover accuracy lost due to temporal mismatch, hence enabling each client to train at their local temporal resolution while remaining compatible with the global model.

LGNov 7, 2024
Zero-Shot Temporal Resolution Domain Adaptation for Spiking Neural Networks

Sanja Karilanova, Maxime Fabre, Emre Neftci et al.

Spiking Neural Networks (SNNs) are biologically-inspired deep neural networks that efficiently extract temporal information while offering promising gains in terms of energy efficiency and latency when deployed on neuromorphic devices. However, SNN model parameters are sensitive to temporal resolution, leading to significant performance drops when the temporal resolution of target data at the edge is not the same with that of the pre-deployment source data used for training, especially when fine-tuning is not possible at the edge. To address this challenge, we propose three novel domain adaptation methods for adapting neuron parameters to account for the change in time resolution without re-training on target time-resolution. The proposed methods are based on a mapping between neuron dynamics in SNNs and State Space Models (SSMs); and are applicable to general neuron models. We evaluate the proposed methods under spatio-temporal data tasks, namely the audio keyword spotting datasets SHD and MSWC as well as the image classification NMINST dataset. Our methods provide an alternative to - and in majority of the cases significantly outperform - the existing reference method that simply scales the time constant. Moreover, our results show that high accuracy on high temporal resolution data can be obtained by time efficient training on lower temporal resolution data and model adaptation.

SPMay 3, 2025
Rate-Limited Closed-Loop Distributed ISAC Systems: An Autoencoder Approach

Guangjin Pan, Zhixing Li, Ayça Özçelikkale et al.

In closed-loop distributed multi-sensor integrated sensing and communication (ISAC) systems, performance often hinges on transmitting high-dimensional sensor observations over rate-limited networks. In this paper, we first present a general framework for rate-limited closed-loop distributed ISAC systems, and then propose an autoencoder-based observation compression method to overcome the constraints imposed by limited transmission capacity. Building on this framework, we conduct a case study using a closed-loop linear quadratic regulator (LQR) system to analyze how the interplay among observation, compression, and state dimensions affects reconstruction accuracy, state estimation error, and control performance. In multi-sensor scenarios, our results further show that optimal resource allocation initially prioritizes low-noise sensors until the compression becomes lossless, after which resources are reallocated to high-noise sensors.

LGApr 3, 2025
State-Space Model Inspired Multiple-Input Multiple-Output Spiking Neurons

Sanja Karilanova, Subhrakanti Dey, Ayça Özçelikkale

In spiking neural networks (SNNs), the main unit of information processing is the neuron with an internal state. The internal state generates an output spike based on its component associated with the membrane potential. This spike is then communicated to other neurons in the network. Here, we propose a general multiple-input multiple-output (MIMO) spiking neuron model that goes beyond this traditional single-input single-output (SISO) model in the SNN literature. Our proposed framework is based on interpreting the neurons as state-space models (SSMs) with linear state evolutions and non-linear spiking activation functions. We illustrate the trade-offs among various parameters of the proposed SSM-inspired neuron model, such as the number of hidden neuron states, the number of input and output channels, including single-input multiple-output (SIMO) and multiple-input single-output (MISO) models. We show that for SNNs with a small number of neurons with large internal state spaces, significant performance gains may be obtained by increasing the number of output channels of a neuron. In particular, a network with spiking neurons with multiple-output channels may achieve the same level of accuracy with the baseline with the continuous-valued communications on the same reference network architecture.

LGDec 4, 2023
Distributed Continual Learning with CoCoA in High-dimensional Linear Regression

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

We consider estimation under scenarios where the signals of interest exhibit change of characteristics over time. In particular, we consider the continual learning problem where different tasks, e.g., data with different distributions, arrive sequentially and the aim is to perform well on the newly arrived task without performance degradation on the previously seen tasks. In contrast to the continual learning literature focusing on the centralized setting, we investigate the problem from a distributed estimation perspective. We consider the well-established distributed learning algorithm COCOA, which distributes the model parameters and the corresponding features over the network. We provide exact analytical characterization for the generalization error of COCOA under continual learning for linear regression in a range of scenarios, where overparameterization is of particular interest. These analytical results characterize how the generalization error depends on the network structure, the task similarity and the number of tasks, and show how these dependencies are intertwined. In particular, our results show that the generalization error can be significantly reduced by adjusting the network size, where the most favorable network size depends on task similarity and the number of tasks. We present numerical results verifying the theoretical analysis and illustrate the continual learning performance of COCOA with a digit classification task.

LGAug 8, 2025
Low-Bit Data Processing Using Multiple-Output Spiking Neurons with Non-linear Reset Feedback

Sanja Karilanova, Subhrakanti Dey, Ayça Özçelikkale

Neuromorphic computing is an emerging technology enabling low-latency and energy-efficient signal processing. A key algorithmic tool in neuromorphic computing is spiking neural networks (SNNs). SNNs are biologically inspired neural networks which utilize stateful neurons, and provide low-bit data processing by encoding and decoding information using spikes. Similar to SNNs, deep state-space models (SSMs) utilize stateful building blocks. However, deep SSMs, which recently achieved competitive performance in various temporal modeling tasks, are typically designed with high-precision activation functions and no reset mechanisms. To bridge the gains offered by SNNs and the recent deep SSM models, we propose a novel multiple-output spiking neuron model that combines a linear, general SSM state transition with a non-linear feedback mechanism through reset. Compared to the existing neuron models for SNNs, our proposed model clearly conceptualizes the differences between the spiking function, the reset condition and the reset action. The experimental results on various tasks, i.e., a keyword spotting task, an event-based vision task and a sequential pattern recognition task, show that our proposed model achieves performance comparable to existing benchmarks in the SNN literature. Our results illustrate how the proposed reset mechanism can overcome instability and enable learning even when the linear part of neuron dynamics is unstable, allowing us to go beyond the strictly enforced stability of linear dynamics in recent deep SSM models.

SPMay 25, 2021
Model Mismatch Trade-offs in LMMSE Estimation

Martin Hellkvist, Ayça Özçelikkale

We consider a linear minimum mean squared error (LMMSE) estimation framework with model mismatch where the assumed model order is smaller than that of the underlying linear system which generates the data used in the estimation process. By modelling the regressors of the underlying system as random variables, we analyze the average behaviour of the mean squared error (MSE). Our results quantify how the MSE depends on the interplay between the number of samples and the number of parameters in the underlying system and in the assumed model. In particular, if the number of samples is not sufficiently large, neither increasing the number of samples nor the assumed model complexity is sufficient to guarantee a performance improvement.

MLFeb 17, 2021
Chance-Constrained Active Inference

Thijs van de Laar, Ismail Senoz, Ayça Özçelikkale et al.

Active Inference (ActInf) is an emerging theory that explains perception and action in biological agents, in terms of minimizing a free energy bound on Bayesian surprise. Goal-directed behavior is elicited by introducing prior beliefs on the underlying generative model. In contrast to prior beliefs, which constrain all realizations of a random variable, we propose an alternative approach through chance constraints, which allow for a (typically small) probability of constraint violation, and demonstrate how such constraints can be used as intrinsic drivers for goal-directed behavior in ActInf. We illustrate how chance-constrained ActInf weights all imposed (prior) constraints on the generative model, allowing e.g., for a trade-off between robust control and empirical chance constraint violation. Secondly, we interpret the proposed solution within a message passing framework. Interestingly, the message passing interpretation is not only relevant to the context of ActInf, but also provides a general purpose approach that can account for chance constraints on graphical models. The chance constraint message updates can then be readily combined with other pre-derived message update rules, without the need for custom derivations. The proposed chance-constrained message passing framework thus accelerates the search for workable models in general, and can be used to complement message-passing formulations on generative neural models.

MLJan 22, 2021
Linear Regression with Distributed Learning: A Generalization Error Perspective

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear regression where the model parameters, i.e., the unknowns, are distributed over the network. We adopt a statistical learning approach. In contrast to works that focus on the performance on the training data, we focus on the generalization error, i.e., the performance on unseen data. We provide high-probability bounds on the generalization error for both isotropic and correlated Gaussian data as well as sub-gaussian data. These results reveal the dependence of the generalization performance on the partitioning of the model over the network. In particular, our results show that the generalization error of the distributed solution can be substantially higher than that of the centralized solution even when the error on the training data is at the same level for both the centralized and distributed approaches. Our numerical results illustrate the performance with both real-world image data as well as synthetic data.

MLApr 30, 2020
Generalization Error for Linear Regression under Distributed Learning

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Distributed learning facilitates the scaling-up of data processing by distributing the computational burden over several nodes. Despite the vast interest in distributed learning, generalization performance of such approaches is not well understood. We address this gap by focusing on a linear regression setting. We consider the setting where the unknowns are distributed over a network of nodes. We present an analytical characterization of the dependence of the generalization error on the partitioning of the unknowns over nodes. In particular, for the overparameterized case, our results show that while the error on training data remains in the same range as that of the centralized solution, the generalization error of the distributed solution increases dramatically compared to that of the centralized solution when the number of unknowns estimated at any node is close to the number of observations. We further provide numerical examples to verify our analytical expressions.