LG

May 2, 2017

Pointed subspace approach to incomplete data

Łukasz Struski, Marek Śmieja, Jacek Tabor

arXiv:1705.00840v1

0.71 citations

Originality: Incremental advance

AI Analysis

The approach provides a more flexible framework for processing incomplete data in machine learning applications, though it appears incremental because it builds on existing flag vector representations.

The paper tackles the problem of handling incomplete data by representing it as pointed affine subspaces, enabling affine transformations like whitening and dimensionality reduction, and embeds these subspaces into a vector space for use with standard classification methods.

Incomplete data are often represented as vectors in which missing attributes are filled in, joined with flag vectors indicating the missing components. In this paper we generalize this approach and represent incomplete data as pointed affine subspaces. This makes it possible to perform various affine transformations of the data, such as whitening or dimensionality reduction. We embed such generalized missing data into a vector space by mapping each pointed affine subspace (a generalized missing data point) to a vector containing the imputed values joined with the corresponding projection matrix. This operation preserves the scalar product of the embedding defined for flag vectors and allows transformed incomplete data to be passed to typical classification methods.
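The embedding described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code. It assumes that the subspace attached to a point is spanned by the coordinate axes of its missing attributes, that the projection matrix is the orthogonal projector onto that subspace, and that the matrix is flattened so the ordinary dot product acts as a Frobenius inner product. All function names are hypothetical.

```python
import numpy as np

def flag_embedding(x_imputed, missing):
    """Classical representation: imputed vector joined with a 0/1 flag vector."""
    return np.concatenate([x_imputed, missing.astype(float)])

def coordinate_basis(missing):
    """Assumed subspace: columns e_j for every missing attribute j."""
    return np.eye(missing.size)[:, missing]

def projection_matrix(basis):
    """Orthogonal projector onto the column space of `basis` (any rank)."""
    n = basis.shape[0]
    if basis.shape[1] == 0:          # nothing missing: trivial subspace
        return np.zeros((n, n))
    return basis @ np.linalg.pinv(basis)

def subspace_embedding(x, basis):
    """Pointed subspace (x, span(basis)) -> vector [x, flattened projector]."""
    return np.concatenate([x, projection_matrix(basis).ravel()])

def affine_transform(x, basis, A, b):
    """An affine map sends the point to A x + b and the subspace to A span(basis)."""
    return A @ x + b, A @ basis

# Two incomplete points; missing entries are already imputed (here with 0 or 5).
x, mx = np.array([1.0, 0.0, 3.0]), np.array([False, True, False])
y, my = np.array([2.0, 5.0, 0.0]), np.array([False, True, True])

flag_dot = flag_embedding(x, mx) @ flag_embedding(y, my)
sub_dot = subspace_embedding(x, coordinate_basis(mx)) @ subspace_embedding(y, coordinate_basis(my))
same_product = bool(np.isclose(flag_dot, sub_dot))

# After a (hypothetical) affine map the subspace is no longer axis-aligned,
# so flags cannot describe it, but the projector still can.
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
b = np.array([0.5, -1.0, 0.0])
x_t, basis_t = affine_transform(x, coordinate_basis(mx), A, b)
P_t = projection_matrix(basis_t)
```

For axis-aligned subspaces the projector is a diagonal 0/1 matrix, so the Frobenius inner product of two projectors equals the dot product of the corresponding flag vectors. Under these assumptions, that is why the two embeddings give the same scalar product before any transformation is applied.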
