MLOct 22, 2017

Elliptical modeling and pattern analysis for perturbation models and classfication

arXiv:1710.07939v1
Originality Incremental advance
AI Analysis

This work addresses data privacy and security issues in applications like network intrusion detection and biological data analysis by providing a more secure and effective transformation method compared to linear techniques like PCA.

The paper tackled performance degradation in machine learning when using transformed features from perturbation models by proposing a nonlinear parametric perturbation model that converts input features to elliptical patterns. The method outperformed PCA in classification accuracy and data privacy protection on network intrusion and biological datasets, with specific improvements in blind source separation attack resistance and signal interference ratio.

The characteristics (or numerical patterns) of a feature vector in the transform domain of a perturbation model differ significantly from those of its corresponding feature vector in the input domain. These differences - caused by the perturbation techniques used for the transformation of feature patterns - degrade the performance of machine learning techniques in the transform domain. In this paper, we proposed a nonlinear parametric perturbation model that transforms the input feature patterns to a set of elliptical patterns, and studied the performance degradation issues associated with random forest classification technique using both the input and transform domain features. Compared with the linear transformation such as Principal Component Analysis (PCA), the proposed method requires less statistical assumptions and is highly suitable for the applications such as data privacy and security due to the difficulty of inverting the elliptical patterns from the transform domain to the input domain. In addition, we adopted a flexible block-wise dimensionality reduction step in the proposed method to accommodate the possible high-dimensional data in modern applications. We evaluated the empirical performance of the proposed method on a network intrusion data set and a biological data set, and compared the results with PCA in terms of classification performance and data privacy protection (measured by the blind source separation attack and signal interference ratio). Both results confirmed the superior performance of the proposed elliptical transformation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes