LGFeb 20, 2022
Personalized Federated Learning with Exact Stochastic Gradient DescentSotirios Nikoloutsopoulos, Iordanis Koutsopoulos, Michalis K. Titsias
We propose a Stochastic Gradient Descent (SGD)-type algorithm for Personalized Federated Learning which can be particularly attractive for mobile energy-limited regimes due to its low per-client computational cost. The model to be trained includes a set of common weights for all clients, and a set of personalized weights that are specific to each client. At each optimization round, randomly selected clients perform multiple full gradient-descent updates over their client-specific weights towards optimizing the loss function on their own datasets, without updating the common weights. This procedure is energy-efficient since it has low computational cost per client. At the final update of each round, each client computes the joint gradient over both the client-specific and the common weights and returns the gradient of common weights to the server, which allows to perform an exact SGD step over the full set of weights in a distributed manner. For the overall optimization scheme, we rigorously prove convergence, even in non-convex settings such as those encountered when training neural networks, with a rate of $\mathcal{O} \left (\frac{1}{\sqrt{T}} \right )$ with respect to communication rounds $T$. In practice, PFLEGO exhibits substantially lower per-round wall-clock time, used as a proxy for energy. Our theoretical guarantees translate to superior performance in practice against baselines such as FedAvg and FedPer, as evaluated in several multi-class classification datasets, in particular, Omniglot, CIFAR-10, MNIST, Fashion-MNIST, and EMNIST.
LGSep 7, 2020
Information Theoretic Meta Learning with Gaussian ProcessesMichalis K. Titsias, Francisco J. R. Ruiz, Sotirios Nikoloutsopoulos et al.
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck. The idea is to learn a stochastic representation or encoding of the task description, given by a training set, that is highly informative about predicting the validation set. By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning. This framework unifies existing gradient-based algorithms and also allows us to derive new algorithms. In particular, we develop a memory-based algorithm that uses Gaussian processes to obtain non-parametric encoding representations. We demonstrate our method on a few-shot regression problem and on four few-shot classification problems, obtaining competitive accuracy when compared to existing baselines.
LGSep 30, 2018
Bayesian Transfer Reinforcement Learning with Prior Knowledge RulesMichalis K. Titsias, Sotirios Nikoloutsopoulos
We propose a probabilistic framework to directly insert prior knowledge in reinforcement learning (RL) algorithms by defining the behaviour policy as a Bayesian posterior distribution. Such a posterior combines task specific information with prior knowledge, thus allowing to achieve transfer learning across tasks. The resulting method is flexible and it can be easily incorporated to any standard off-policy and on-policy algorithms, such as those based on temporal differences and policy gradients. We develop a specific instance of this Bayesian transfer RL framework by expressing prior knowledge as general deterministic rules that can be useful in a large variety of tasks, such as navigation tasks. Also, we elaborate more on recent probabilistic and entropy-regularised RL by developing a novel temporal learning algorithm and show how to combine it with Bayesian transfer RL. Finally, we demonstrate our method for solving mazes and show that significant speed ups can be obtained.