ML · Nov 16, 2017 · Code
Robust Unsupervised Domain Adaptation for Neural Networks via Moment Alignment
Werner Zellinger, Bernhard A. Moser, Thomas Grubinger et al.
A novel approach for unsupervised domain adaptation for neural networks is proposed. It relies on metric-based regularization of the learning process. The metric-based regularization aims at domain-invariant latent feature representations by means of maximizing the similarity between domain-specific activation distributions. The proposed metric results from modifying an integral probability metric such that it becomes less translation-sensitive on a polynomial function space. The metric has an intuitive interpretation in the dual space as the sum of differences of higher order central moments of the corresponding activation distributions. Under appropriate assumptions on the input distributions, error minimization is proven for the continuous case. As demonstrated by an analysis of standard benchmark experiments for sentiment analysis, object recognition and digit recognition, the outlined approach is robust regarding parameter changes and achieves higher classification accuracies than comparable approaches. The source code is available at https://github.com/wzell/mann.
ML · Feb 5, 2025
Gradient Descent Algorithm in Hilbert Spaces under Stationary Markov Chains with $φ$- and $β$-Mixing
Priyanka Roy, Susanne Saminger-Platz
In this paper, we study a strictly stationary Markov chain gradient descent algorithm operating in general Hilbert spaces. Our analysis focuses on the mixing coefficients of the underlying process, specifically the $φ$- and $β$-mixing coefficients. Under these assumptions, we derive probabilistic upper bounds on the convergence behavior of the algorithm based on the exponential as well as the polynomial decay of the mixing coefficients.
ML · Jul 8, 2025
Online Regularized Learning Algorithms in RKHS with $β$- and $φ$-Mixing Sequences
Priyanka Roy, Susanne Saminger-Platz
In this paper, we study an online regularized learning algorithm in a reproducing kernel Hilbert space (RKHS) based on a class of dependent processes. We choose a process whose degree of dependence is measured by mixing coefficients. As a representative example, we analyze a strictly stationary Markov chain, where the dependence structure is characterized by the $φ$- and $β$-mixing coefficients. Under these assumptions, we derive probabilistic upper bounds as well as convergence rates for both the exponential and polynomial decay of the mixing coefficients.
ML · Feb 19, 2020
On generalization in moment-based domain adaptation
Werner Zellinger, Bernhard A Moser, Susanne Saminger-Platz
Domain adaptation algorithms are designed to minimize the misclassification risk of a discriminative model for a target domain with little training data by adapting a model from a source domain with a large amount of training data. Standard approaches measure the adaptation discrepancy based on distance measures between the empirical probability distributions in the source and target domain. In this setting, we address the problem of deriving generalization bounds under practice-oriented general conditions on the underlying probability distributions. As a result, we obtain generalization bounds for domain adaptation based on finitely many moments and smoothness conditions.
ML · Feb 28, 2017
Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning
Werner Zellinger, Thomas Grubinger, Edwin Lughofer et al.
The learning of domain-invariant representations in the context of domain adaptation with neural networks is considered. We propose a new regularization method that minimizes the discrepancy between domain-specific latent feature representations directly in the hidden activation space. Although some standard distribution matching approaches exist that can be interpreted as the matching of weighted sums of moments, e.g. Maximum Mean Discrepancy (MMD), an explicit order-wise matching of higher order moments has not been considered before. We propose to match the higher order central moments of probability distributions by means of order-wise moment differences. Our model does not require computationally expensive distance and kernel matrix computations. We utilize the equivalent representation of probability distributions by moment sequences to define a new distance function, called Central Moment Discrepancy (CMD). We prove that CMD is a metric on the set of probability distributions on a compact interval. We further prove that convergence of probability distributions on compact intervals w.r.t. the new metric implies convergence in distribution of the respective random variables. We test our approach on two different benchmark data sets for object recognition (Office) and sentiment analysis of product reviews (Amazon reviews). CMD achieves a new state-of-the-art performance on most domain adaptation tasks of Office and outperforms networks trained with MMD, Variational Fair Autoencoders and Domain Adversarial Neural Networks on Amazon reviews. In addition, a post-hoc parameter sensitivity analysis shows that the new approach is stable w.r.t. parameter changes in a certain interval. The source code of the experiments is publicly available.
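The order-wise moment matching described in this abstract can be illustrated with a small sketch. The abstract does not state the formula, so the code assumes the usual empirical form of CMD for activations bounded in an interval [a, b]: the Euclidean distance between the sample means scaled by 1/|b - a|, plus, for each order k from 2 to K, the distance between the k-th central moment vectors scaled by 1/|b - a|^k. The function names `central_moments` and `cmd`, the default `k_max=5`, and the default interval [0, 1] are illustrative assumptions, not taken from the authors' code.

```python
import math


def central_moments(samples, k_max):
    """Return [mean, c_2, ..., c_k_max], each a per-feature list.

    `samples` is a non-empty list of equal-length rows (one row per example).
    """
    n = len(samples)
    dim = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dim)]
    moments = [means]
    for k in range(2, k_max + 1):
        # k-th central moment of each feature, computed coordinate-wise
        moments.append(
            [sum((row[j] - means[j]) ** k for row in samples) / n
             for j in range(dim)]
        )
    return moments


def cmd(x, y, k_max=5, a=0.0, b=1.0):
    """Empirical central moment discrepancy (assumed form, see lead-in)."""
    span = abs(b - a)
    moments_x = central_moments(x, k_max)
    moments_y = central_moments(y, k_max)
    total = 0.0
    # Order 1 compares means; orders 2..k_max compare central moments.
    for k, (mx, my) in enumerate(zip(moments_x, moments_y), start=1):
        total += math.dist(mx, my) / span ** k
    return total


# Hypothetical source and target activations (3 examples, 2 features).
source = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.9]]
target = [[v + 0.1 for v in row] for row in source]
print(cmd(source, source))  # identical samples give zero discrepancy
print(cmd(source, target))  # a pure shift only changes the mean term
```

No kernel or pairwise distance matrix is formed here: the cost is linear in the number of examples for each moment order, which is consistent with the abstract's remark about avoiding expensive kernel matrix computations. In a training setup the same quantity would be computed on hidden-layer activations with a differentiable array library and added to the task loss as a regularizer.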