Dmitry Kopitkov

LG
4papers
46citations
Novelty60%
AI Score26

4 Papers

LGOct 19, 2019
Neural Spectrum Alignment: Empirical Study

Dmitry Kopitkov, Vadim Indelman

Expressiveness and generalization of deep models was recently addressed via the connection between neural networks (NNs) and kernel learning, where first-order dynamics of NN during a gradient-descent (GD) optimization were related to gradient similarity kernel, also known as Neural Tangent Kernel (NTK). In the majority of works this kernel is considered to be time-invariant, with its properties being defined entirely by NN architecture and independent of the learning task at hand. In contrast, in this paper we empirically explore these properties along the optimization and show that in practical applications the NTK changes in a very dramatic and meaningful way, with its top eigenfunctions aligning toward the target function learned by NN. Moreover, these top eigenfunctions serve as basis functions for NN output - a function represented by NN is spanned almost completely by them for the entire optimization process. Further, since the learning along top eigenfunctions is typically fast, their alignment with the target function improves the overall optimization performance. In addition, we study how the neural spectrum is affected by learning rate decay, typically done by practitioners, showing various trends in the kernel behavior. We argue that the presented phenomena may lead to a more complete theoretical understanding behind NN learning.

ROJun 5, 2019
General Purpose Incremental Covariance Update and Efficient Belief Space Planning via Factor-Graph Propagation Action Tree

Dmitry Kopitkov, Vadim Indelman

Fast covariance calculation is required both for SLAM (e.g.~in order to solve data association) and for evaluating the information-theoretic term for different candidate actions in belief space planning (BSP). In this paper we make two primary contributions. First, we develop a novel general-purpose incremental covariance update technique, which efficiently recovers specific covariance entries after any change in the inference problem, such as introduction of new observations/variables or re-linearization of the state vector. Our approach is shown to recover them faster than other state-of-the-art methods. Second, we present a computationally efficient approach for BSP in high-dimensional state spaces, leveraging our incremental covariance update method. State of the art BSP approaches perform belief propagation for each candidate action and then evaluate an objective function that typically includes an information-theoretic term, such as entropy or information gain. Yet, candidate actions often have similar parts (e.g. common trajectory parts), which are however evaluated separately for each candidate. Moreover, calculating the information-theoretic term involves a costly determinant computation of the entire information (covariance) matrix which is O(n^3) with n being dimension of the state or costly Schur complement operations if only marginal posterior covariance of certain variables is of interest. Our approach, rAMDL-Tree, extends our previous BSP method rAMDL, by exploiting incremental covariance calculation and performing calculation re-use between common parts of non-myopic candidate actions, such that these parts are evaluated only once, in contrast to existing approaches.

LGMar 25, 2019
General Probabilistic Surface Optimization and Log Density Estimation

Dmitry Kopitkov, Vadim Indelman

In this paper we contribute a novel algorithm family, which generalizes many unsupervised techniques including unnormalized and energy models, and allows us to infer different statistical modalities (e.g. data likelihood and ratio between densities) from data samples. The proposed unsupervised technique, named Probabilistic Surface Optimization (PSO), views a model as a flexible surface which can be pushed according to loss-specific virtual stochastic forces, where a dynamical equilibrium is achieved when the pointwise forces on the surface become equal. Concretely, the surface is pushed up and down at points sampled from two different distributions. The averaged up and down forces become functions of these two distribution densities and of force magnitudes defined by the loss of a particular PSO instance. Upon convergence, the force equilibrium imposes an optimized model to be equal to various statistical functions depending on the used magnitude functions. Furthermore, this dynamical-statistical equilibrium is extremely intuitive and useful, providing many implications and possible usages in probabilistic inference. We connect PSO to numerous existing statistical works which are also PSO instances, and derive new PSO-based inference methods as demonstration of PSO exceptional usability. Likewise, based on the insights coming from the virtual-force perspective we analyze PSO stability and propose new ways to improve it. Finally, we present new instances of PSO, termed PSO-LDE, for data log-density estimation and also provide a new NN block-diagonal architecture for increased surface flexibility, which significantly improves estimation accuracy. Both PSO-LDE and the new architecture are combined together as a new density estimation technique. In our experiments we demonstrate this technique to be superior over state-of-the-art baselines in density estimation task for multimodal 20D data.

LGJul 27, 2018
Deep PDF: Probabilistic Surface Optimization and Density Estimation

Dmitry Kopitkov, Vadim Indelman

A probability density function (pdf) encodes the entire stochastic knowledge about data distribution, where data may represent stochastic observations in robotics, transition state pairs in reinforcement learning or any other empirically acquired modality. Inferring data pdf is of prime importance, allowing to analyze various model hypotheses and perform smart decision making. However, most density estimation techniques are limited in their representation expressiveness to specific kernel type or predetermined distribution family, and have other restrictions. For example, kernel density estimation (KDE) methods require meticulous parameter search and are extremely slow at querying new points. In this paper we present a novel non-parametric density estimation approach, DeepPDF, that uses a neural network to approximate a target pdf given samples from thereof. Such a representation provides high inference accuracy for a wide range of target pdfs using a relatively simple network structure, making our method highly statistically robust. This is done via a new stochastic optimization algorithm, \emph{Probabilistic Surface Optimization} (PSO), that turns to advantage the stochastic nature of sample points in order to force network output to be identical to the output of a target pdf. Once trained, query point evaluation can be efficiently done in DeepPDF by a simple network forward pass, with linear complexity in the number of query points. Moreover, the PSO algorithm is capable of inferring the frequency of data samples and may also be used in other statistical tasks such as conditional estimation and distribution transformation. We compare the derived approach with KDE methods showing its superior performance and accuracy.