K. Lakshmanan

6.8 LG Mar 15

Learning in Function Spaces: An Unified Functional Analytic View of Supervised and Unsupervised Learning

K. Lakshmanan

Many machine learning algorithms can be interpreted as procedures for estimating functions defined on the data distribution. In this paper we present a conceptual framework that formulates a wide range of learning problems as variational optimization over function spaces induced by the data distribution. Within this framework the data distribution defines operators that capture structural properties of the data, such as similarity relations or statistical dependencies. Learning algorithms can then be viewed as estimating functions expressed in bases determined by these operators. This perspective provides a unified way to interpret several learning paradigms. In supervised learning the objective functional is defined using labeled data and typically corresponds to minimizing prediction risk, whereas unsupervised learning relies on structural properties of the input distribution and leads to objectives based on similarity or smoothness constraints. From this viewpoint, the distinction between learning paradigms arises primarily from the choice of the functional being optimized rather than from the underlying function space. We illustrate this framework by discussing connections with kernel methods, spectral clustering, and manifold learning, highlighting how operators induced by data distributions naturally define function representations used by learning algorithms. The goal of this work is not to introduce a new algorithm but to provide a conceptual framework that clarifies the role of function spaces and operators in modern machine learning.
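The abstract's central claim, that supervised and unsupervised learning can share one operator-induced basis and differ only in the functional being optimized, can be illustrated with a small numerical sketch. This is not code from the paper. The RBF kernel, the two-blob toy data, the regularization weight `lam`, and the function names `rbf_kernel`, `supervised_fit`, and `unsupervised_partition` are all assumptions made for illustration. The supervised part is kernel ridge regression written in the eigenbasis of the kernel matrix; the unsupervised part is a two-way spectral partition from the normalized graph Laplacian built from the same kernel matrix.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    """Gaussian (RBF) similarity matrix: the data-induced operator (assumed choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def supervised_fit(K, y, lam=1e-2):
    """Kernel ridge fitted values, computed in the eigenbasis of K.

    Each label coefficient along an eigenvector is shrunk by
    eigenvalue / (eigenvalue + lam), which equals K (K + lam I)^{-1} y.
    """
    evals, evecs = np.linalg.eigh(K)
    coeffs = evecs.T @ y
    return evecs @ ((evals / (evals + lam)) * coeffs)

def unsupervised_partition(K):
    """Two-way spectral partition: sign of the second-smallest eigenvector
    of the normalized Laplacian I - D^{-1/2} K D^{-1/2}. No labels are used."""
    d_inv_sqrt = 1.0 / np.sqrt(K.sum(axis=1))
    L = np.eye(len(K)) - d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]
    _, evecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return (evecs[:, 1] > 0).astype(int)

# Toy data (hypothetical): two Gaussian blobs in the plane, 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(0.0, 0.3, size=(20, 2)) + np.array([3.0, 0.0]),
])
y = np.concatenate([-np.ones(20), np.ones(20)])

K = rbf_kernel(X)
f_hat = supervised_fit(K, y)          # objective uses the labels y
clusters = unsupervised_partition(K)  # objective uses only the similarity structure
```

Both functions start from the same matrix `K`; only the objective changes, which is the distinction the abstract draws between the two paradigms.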

LG Oct 23, 2017

Accelerated Reinforcement Learning

K. Lakshmanan

Policy gradient methods are widely used in reinforcement learning to search for better policies in a parameterized policy space. They perform gradient search in the policy space and are known to converge very slowly. Nesterov developed an accelerated gradient search algorithm for convex optimization problems, which has recently been extended to non-convex and stochastic optimization. We use Nesterov's acceleration for policy gradient search in the well-known actor-critic algorithm and show convergence using the ODE method. We tested this algorithm on a scheduling problem in which an incoming job is scheduled into one of four queues based on the queue lengths. The experimental results show that the algorithm using Nesterov's acceleration performs significantly better than the algorithm that does not use acceleration. To the best of our knowledge, this is the first time Nesterov's acceleration has been used with the actor-critic algorithm.
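To make the acceleration idea concrete, the following minimal sketch compares plain gradient descent with Nesterov's accelerated gradient on a deterministic, ill-conditioned quadratic. It is not the paper's actor-critic algorithm: there is no policy, critic, or stochastic gradient here, and the objective, step size, momentum coefficient, and function names are all assumptions chosen for illustration. The momentum value is the standard constant choice for a strongly convex objective with condition number `kappa`.

```python
import numpy as np

# Toy objective (assumed): f(x) = 0.5 * x^T A x with condition number 100.
A = np.diag([1.0, 100.0])

def f(x):
    return 0.5 * x @ A @ x

def grad(x):
    return A @ x

def gradient_descent(x0, step, iters):
    """Plain gradient search: x <- x - step * grad(x)."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * grad(x)
    return x

def nesterov(x0, step, momentum, iters):
    """Nesterov's accelerated gradient with a constant momentum coefficient.

    The gradient is evaluated at a look-ahead point that extrapolates
    along the previous displacement, not at the current iterate.
    """
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        lookahead = x + momentum * (x - x_prev)
        x_prev, x = x, lookahead - step * grad(lookahead)
    return x

L_smooth, mu = 100.0, 1.0            # largest and smallest eigenvalues of A
kappa = L_smooth / mu
step = 1.0 / L_smooth
momentum = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)

x0 = np.array([1.0, 1.0])
x_gd = gradient_descent(x0, step, iters=100)
x_nag = nesterov(x0, step, momentum, iters=100)

print("plain gradient descent:", f(x_gd))
print("Nesterov acceleration: ", f(x_nag))
```

On this toy problem the accelerated iterate reaches a much lower objective value in the same number of steps. In the actor-critic setting described in the abstract, the same look-ahead update would be applied to the actor's policy parameters using a noisy gradient estimate, which is where the ODE-based convergence analysis comes in.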
