Anna Sepliarskaia

LG
h-index: 6
5 papers
70 citations
Novelty: 40%
AI Score: 26

5 Papers

LG · Oct 9, 2022
LieGG: Studying Learned Lie Group Generators

Artem Moskalev, Anna Sepliarskaia, Ivan Sosnovik et al.

Symmetries built into a neural network have proven very beneficial for a wide range of tasks, because the network does not have to spend data on learning them. We start from the position that when symmetries are not built into a model a priori, it is advantageous for robust networks to learn symmetries directly from the data to fit a task function. In this paper, we present a method to extract symmetries learned by a neural network and to evaluate the degree to which a network is invariant to them. With our method, we are able to explicitly retrieve learned invariances in the form of the generators of the corresponding Lie groups, without prior knowledge of symmetries in the data. We use the proposed method to study how symmetry properties depend on a neural network's parameterization and configuration. We find that the ability of a network to learn symmetries generalizes over a range of architectures. However, the quality of the learned symmetries depends on the depth and the number of parameters.
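As a rough illustration of what "the degree to which a network is invariant" to a candidate generator can mean: a differentiable function f is invariant to the one-parameter group generated by a matrix G when the directional derivative ∇f(x)·(Gx) vanishes at every x. The sketch below checks that condition numerically on a toy function. It is not the paper's procedure; the function `f`, the helper `symmetry_defect`, and the two example generators are hypothetical names chosen for this illustration.

```python
import random


def f(x):
    # Toy stand-in for a trained network (assumption): the squared norm,
    # which is invariant to rotations of the plane but not to scaling.
    return x[0] ** 2 + x[1] ** 2


def grad(func, x, eps=1e-6):
    # Central finite-difference gradient of func at x.
    g = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        g.append((func(up) - func(down)) / (2 * eps))
    return g


def matvec(G, x):
    # Matrix-vector product G x for nested-list matrices.
    return [sum(G[i][j] * x[j] for j in range(len(x))) for i in range(len(G))]


def symmetry_defect(func, G, points):
    # Mean of |grad f(x) . (G x)| over the sample points.
    # A value near zero suggests func is invariant along the flow of G.
    total = 0.0
    for x in points:
        direction = matvec(G, x)
        total += abs(sum(a * b for a, b in zip(grad(func, x), direction)))
    return total / len(points)


random.seed(0)
points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(100)]

ROTATION = [[0.0, -1.0], [1.0, 0.0]]  # generator of planar rotations
SCALING = [[1.0, 0.0], [0.0, 1.0]]    # generator of uniform scaling

rot_defect = symmetry_defect(f, ROTATION, points)
scale_defect = symmetry_defect(f, SCALING, points)
```

Here `rot_defect` comes out close to zero while `scale_defect` does not, matching the fact that the squared norm is rotation invariant but not scale invariant.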

LG · Aug 7, 2023
On genuine invariance learning without weight-tying

Artem Moskalev, Anna Sepliarskaia, Erik J. Bekkers et al.

In this paper, we investigate the properties and limitations of invariance learned by neural networks from data, compared to the genuine invariance achieved through invariant weight-tying. To do so, we adopt a group-theoretical perspective and analyze invariance learning in neural networks without weight-tying constraints. We demonstrate that even when a network learns to correctly classify samples on a group orbit, the underlying decision-making in such a model does not attain genuine invariance. Instead, learned invariance is strongly conditioned on the input data, rendering it unreliable if the input distribution shifts. We next demonstrate how to guide invariance learning toward genuine invariance by regularizing the invariance of a model during training. To this end, we propose several metrics to quantify learned invariance: (i) predictive distribution invariance, (ii) logit invariance, and (iii) saliency invariance similarity. We show that the invariance learned with the invariance error regularization closely resembles the genuine invariance of weight-tying models and reliably holds even under a severe input distribution shift. Closer analysis of the learned invariance also reveals a spectral decay phenomenon, in which a network achieves invariance to a specific transformation group by reducing its sensitivity to any input perturbation.
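To make the idea of a logit invariance measure concrete, the sketch below compares a model's logits on an input with its logits on the transformed copies of that input, using the group of 90-degree rotations. The abstract does not give the metrics' formulas, so the mean squared logit difference used here is an assumed reading rather than the paper's definition; `plain_logits`, `pooled_logits`, and `logit_invariance_error` are hypothetical names.

```python
def rot90(img):
    # Rotate a square image (list of rows) by 90 degrees clockwise.
    return [list(row) for row in zip(*img[::-1])]


def orbit(img):
    # The four rotations of img: 0, 90, 180 and 270 degrees.
    out = [img]
    for _ in range(3):
        out.append(rot90(out[-1]))
    return out


def plain_logits(img):
    # Toy two-class linear model (assumption): each logit reads one fixed
    # pixel position, so the output changes when the image is rotated.
    flat = [p for row in img for p in row]
    return [flat[0], flat[-1]]


def pooled_logits(img):
    # The same model averaged over the rotation orbit, which makes the
    # output invariant by construction.
    outs = [plain_logits(g_img) for g_img in orbit(img)]
    return [sum(o[k] for o in outs) / len(outs) for k in range(2)]


def logit_invariance_error(model, img):
    # Mean squared difference between the logits on img and the logits on
    # each non-trivial rotation of img (assumed form of the metric).
    base = model(img)
    errors = []
    for g_img in orbit(img)[1:]:
        out = model(g_img)
        errors.append(sum((a - b) ** 2 for a, b in zip(base, out)))
    return sum(errors) / len(errors)


img = [[1.0, 2.0], [3.0, 4.0]]
plain_error = logit_invariance_error(plain_logits, img)
pooled_error = logit_invariance_error(pooled_logits, img)
```

The position-dependent model yields a clearly nonzero error, while the orbit-averaged model yields an error of zero. An error of this kind could be added to a training loss as a regularizer, which is the general strategy the abstract describes.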

LG · Dec 19, 2022
VC dimensions of group convolutional neural networks

Philipp Christian Petersen, Anna Sepliarskaia

We study the generalization capacity of group convolutional neural networks. We identify precise estimates for the VC dimensions of simple sets of group convolutional neural networks. In particular, we find that for infinite groups and appropriately chosen convolutional kernels, even two-parameter families of convolutional neural networks have an infinite VC dimension, despite being invariant to the action of an infinite group.

LG · Oct 21, 2024
On the VC dimension of deep group convolutional neural networks

Anna Sepliarskaia, Sophie Langer, Johannes Schmidt-Hieber

We study the generalization capabilities of Group Convolutional Neural Networks (GCNNs) with ReLU activation function by deriving upper and lower bounds for their Vapnik-Chervonenkis (VC) dimension. Specifically, we analyze how factors such as the number of layers, weights, and input dimension affect the VC dimension. We further compare the derived bounds to those known for other types of neural networks. Our findings extend previous results on the VC dimension of continuous GCNNs with two layers, thereby providing new insights into the generalization properties of GCNNs, particularly regarding the dependence on the input resolution of the data.

LG · Oct 12, 2019
How to Not Measure Disentanglement

Anna Sepliarskaia, Julia Kiseleva, Maarten de Rijke

Several metrics have been proposed to evaluate disentangled representations. However, theoretical guarantees for conventional metrics of disentanglement are missing. Moreover, conventional metrics do not correlate consistently with the outcomes of qualitative studies. In this paper, we analyze metrics of disentanglement and their properties. We conclude that existing metrics of disentanglement were created to reflect different characteristics of disentanglement and do not satisfy two basic desirable properties: (1) assign a high score to representations that are disentangled according to the definition; and (2) assign a low score to representations that are entangled according to the definition. In addition, we propose a new metric of disentanglement and prove that it satisfies both properties.