ML LG CO Aug 9, 2017

Using Deep Neural Networks to Automate Large Scale Statistical Analysis for Big Data Applications

arXiv:1708.03027v1
Originality: Incremental advance
AI Analysis

This work addresses the problem of automating complex statistical analysis for big data applications. The advance is incremental: it applies existing deep learning methods to a known bottleneck in data science.

The authors automate statistical analysis for big data with deep neural networks, specifically convolutional neural networks that perform model selection and parameter estimation. Their simulation studies show excellent performance.

Statistical analysis (SA) is a complex process for deducing population properties from the analysis of data. It usually takes a well-trained analyst to perform SA successfully, and applying SA to big data applications becomes extremely challenging. We propose to use deep neural networks to automate the SA process. In particular, we propose to construct convolutional neural networks (CNNs) to perform automatic model selection and parameter estimation, the two most important SA tasks. We refer to the resulting CNNs as the neural model selector and the neural model estimator, respectively; both can be properly trained using labeled data systematically generated from candidate models. A simulation study shows that both the selector and the estimator demonstrate excellent performance. The idea and proposed framework can be further extended to automate the entire SA process and have the potential to revolutionize how SA is performed in big data analytics.
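The abstract's phrase "labeled data systematically generated from candidate models" can be illustrated with a minimal sketch. It simulates many datasets, each tagged with the index of the model that produced it (the selector's target) and the parameter used (the estimator's target). The three candidate families, the parameter range, and the names `CANDIDATES` and `make_training_set` are assumptions made for illustration only and are not taken from the paper. The CNNs themselves are omitted.

```python
import numpy as np

# Hypothetical candidate models: each maps (rng, parameter, sample size) to one
# simulated dataset. These families are illustrative assumptions, not the paper's.
CANDIDATES = {
    "normal": lambda rng, theta, n: rng.normal(loc=theta, scale=1.0, size=n),
    "exponential": lambda rng, theta, n: rng.exponential(scale=theta, size=n),
    "uniform": lambda rng, theta, n: rng.uniform(low=0.0, high=theta, size=n),
}


def make_training_set(n_datasets, sample_size, seed=0):
    """Simulate labeled datasets for a model selector and a parameter estimator.

    Returns:
        X            -- array (n_datasets, sample_size), one simulated dataset per row
        model_labels -- integer index of the generating model (classification target)
        params       -- generating parameter value (regression target)
    """
    rng = np.random.default_rng(seed)
    names = list(CANDIDATES)
    X = np.empty((n_datasets, sample_size))
    model_labels = np.empty(n_datasets, dtype=int)
    params = np.empty(n_datasets)
    for i in range(n_datasets):
        k = int(rng.integers(len(names)))  # choose a candidate model uniformly
        theta = rng.uniform(0.5, 5.0)      # assumed parameter range
        X[i] = CANDIDATES[names[k]](rng, theta, sample_size)
        model_labels[i] = k
        params[i] = theta
    return X, model_labels, params


if __name__ == "__main__":
    X, model_labels, params = make_training_set(n_datasets=1000, sample_size=100)
    print(X.shape, model_labels.shape, params.shape)
```

In a setup like this, `X` with `model_labels` would train a classifier in the role of the neural model selector, and `X` with `params` would train a regressor in the role of the neural model estimator. Because the labels come from simulation, the training set can be made as large as compute allows.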

