LG, IT · Sep 27, 2016

Correct classification for big/smart/fast data machine learning

arXiv:1609.08550v1

Originality: Synthesis-oriented

AI Analysis

This paper addresses classification tasks in predictive analytics and data science, but it appears incremental because it applies an existing mathematical framework (Boolean function minimization) to a known problem.

The paper tackles table data classification for big/smart/fast data machine learning by proposing to view it as minimization of Boolean functions. It shows that the data can be transformed into Boolean function form and that existing minimization algorithms can then be applied. A binary output is used for simplicity, leaving multivalued outputs as a future extension.

Classification of table (relational database) data for big/smart/fast data machine learning is one of the most important tasks of predictive analytics and of extracting valuable information from data. It is a core applied technique for what is now understood as data science and/or artificial intelligence. The widely used Decision Tree (Random Forest) classifiers and the rarely used rule-based classifiers such as PRISM, VFST, etc. are empirical substitutes for the theoretically correct approach, Boolean function minimization. The development of Boolean function minimization algorithms started long ago, with Edward Veitch's work in 1952. Since then, the wider scientific and industrial community has made great efforts to find a feasible solution to Boolean function minimization. In this paper we propose to consider table data classification from a mathematical point of view, as minimization of Boolean functions. It is shown that the data representation may be transformed into Boolean function form, and how known algorithms can then be used. For simplicity, a binary output function is used for development, which opens the door to developments with multivalued outputs.
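To make the idea concrete, the following is a minimal sketch, not the paper's own algorithm. It assumes a toy table whose three attributes are already binary (the column names `sunny`, `humid`, `windy` and the output `play` are hypothetical), treats attribute combinations that never appear in the table as don't-cares, and uses textbook Quine-McCluskey merging followed by a greedy cover. Enumerating all 2^n combinations is only feasible for very small n.

```python
from itertools import combinations

def combine(a, b):
    """Merge two cubes that differ in exactly one fixed bit, else return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(cubes):
    """Quine-McCluskey merging: repeat pairwise merges until nothing combines."""
    cubes, primes = set(cubes), set()
    while cubes:
        merged, used = set(), set()
        for a, b in combinations(sorted(cubes), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= cubes - used  # cubes that merged with nothing are prime
        cubes = merged
    return primes

def covers(cube, row):
    """True if the cube matches the row ('-' matches either bit)."""
    return all(c in ("-", r) for c, r in zip(cube, row))

def minimize(on_rows, off_rows, n_bits):
    """Return a small set of cubes covering all on-rows and no off-rows."""
    all_rows = {format(i, f"0{n_bits}b") for i in range(2 ** n_bits)}
    # Assumption: combinations absent from the table are don't-cares.
    dont_care = all_rows - set(on_rows) - set(off_rows)
    primes = prime_implicants(set(on_rows) | dont_care)
    remaining, chosen = set(on_rows), []
    while remaining:  # greedy cover, not guaranteed to be minimal
        best = max(sorted(primes),
                   key=lambda p: sum(covers(p, r) for r in remaining))
        chosen.append(best)
        remaining -= {r for r in remaining if covers(best, r)}
    return chosen

# Hypothetical table: bits are (sunny, humid, windy), label is play (1/0).
table = [("100", 1), ("101", 1), ("110", 0),
         ("001", 1), ("011", 0), ("010", 0)]
on_rows = [row for row, label in table if label == 1]
off_rows = [row for row, label in table if label == 0]

rules = minimize(on_rows, off_rows, n_bits=3)
print(rules)  # ['-0-'], i.e. play = 1 whenever humid = 0
```

Each returned cube reads as a classification rule: a `0` or `1` fixes an attribute value and `-` means the attribute is irrelevant. A categorical attribute with more than two values would first need to be encoded as several Boolean variables, which this sketch leaves out.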
