ME · AP · ML · Sep 24, 2020

Parsimonious Feature Extraction Methods: Extending Robust Probabilistic Projections with Generalized Skew-t

arXiv:2009.11499v1
Originality: Incremental advance
AI Analysis

This work offers an incremental improvement for machine learning and finance researchers and practitioners who work with skewed, heavy-tailed data that contain missing values.

The authors address feature extraction for data with asymmetric distributions and missing values by extending robust probabilistic projections (Student-t probabilistic principal component analysis) with a generalized skew-t framework. The extension models tail dependence more flexibly through grouped, multiple-degree-of-freedom structures and separates the tail effects of the error terms from those of the factors. The framework is illustrated on a data set of the cryptocurrencies with the highest market capitalization.

We propose a novel generalisation to the Student-t Probabilistic Principal Component methodology which: (1) accounts for an asymmetric distribution of the observation data; (2) is a framework for grouped and generalised multiple-degree-of-freedom structures, which provides a more flexible approach to modelling groups of marginal tail dependence in the observation data; and (3) separates the tail effect of the error terms and factors. The new feature extraction methods are derived in an incomplete data setting to efficiently handle the presence of missing values in the observation vector. We discuss various special cases of the algorithm that result from simplifying assumptions on the process generating the data. The applicability of the new framework is illustrated on a data set that consists of the cryptocurrencies with the highest market capitalisation.
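For orientation, the symmetric Student-t probabilistic PCA model that the abstract takes as its starting point can be written as a Gaussian scale mixture. The sketch below simulates from that baseline only. It is not the authors' skew-t generalisation, it uses a single shared degree-of-freedom parameter for factors and errors (the coupling the paper relaxes), and it adds no skewness. The function name `sample_student_t_ppca`, its parameters, and the missing-completely-at-random masking are assumptions made for illustration.

```python
import numpy as np


def sample_student_t_ppca(n, d, k, nu, sigma=0.1, missing_rate=0.0, seed=0):
    """Draw n observations from a baseline (symmetric) Student-t PPCA model.

    Gaussian scale-mixture form, one shared scale u per observation:
        u      ~ Gamma(shape=nu/2, rate=nu/2)
        z | u  ~ N(0, I_k / u)            (latent factors)
        e | u  ~ N(0, sigma^2 I_d / u)    (error terms)
        x      = W z + mu + e
    Illustrative assumption: entries are masked completely at random.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(d, k))   # hypothetical loading matrix
    mu = np.zeros(d)              # location vector, zero for simplicity

    # NumPy parameterises the gamma by scale, so rate nu/2 becomes scale 2/nu.
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)

    # Dividing by sqrt(u) gives both factors and errors Student-t tails.
    z = rng.normal(size=(n, k)) / np.sqrt(u)[:, None]
    e = sigma * rng.normal(size=(n, d)) / np.sqrt(u)[:, None]
    x = z @ W.T + mu + e

    # Mark missing entries with NaN to mimic an incomplete data setting.
    mask = rng.random(size=(n, d)) < missing_rate
    x_obs = np.where(mask, np.nan, x)
    return x_obs, W, z, u


x_obs, W, z, u = sample_student_t_ppca(n=200, d=5, k=2, nu=4.0, missing_rate=0.2)
```

Because the same `u` scales both `z` and `e`, factors and errors share one tail behaviour in this baseline; giving them separate scale variables, and grouping the degrees of freedom across coordinates, is the kind of flexibility the abstract describes.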
