LO | AI | Aug 27, 2019

Ordered Sets for Data Analysis

arXiv:1908.11341v1
Originality: Synthesis-oriented
AI Analysis

The book provides a foundational approach for researchers and practitioners in data analysis and machine learning, though it appears incremental in that it applies existing order theory concepts to various domains.

The book addresses mathematical and algorithmic issues in data analysis. It uses order theory to handle complex data structures and to compare classifiers by generality, and it analyzes computational complexity with a view to scalability in big data applications.

This book addresses mathematical and algorithmic issues of data analysis based on the generality order of descriptions and their respective precision. To treat these topics properly, we first need to become acquainted with the important notions of relation and order theory. On the one hand, data often have a complex structure with a natural order on it. On the other hand, many symbolic methods of data analysis and machine learning allow one to compare the obtained classifiers with respect to their generality, which is also an order relation. Efficient algorithms are very important in data analysis, especially when one deals with big data, so scalability is a real issue. That is why we analyze the computational complexity of algorithms and problems of data analysis. We start from the basic definitions and facts of algorithmic complexity theory and then analyze the complexity of the various data analysis tools we consider. The tools and methods of data analysis, such as computing taxonomies, groups of similar objects (concepts and n-clusters), dependencies in data, and classification, are illustrated with applications in particular subject domains, from chemoinformatics to text mining and natural language processing.
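To make the notions of "concepts" and "generality order" in the abstract concrete, here is a minimal Python sketch. It is not taken from the book. It assumes that descriptions are plain attribute sets and that a concept is a pair (objects, shared attributes), as in formal concept analysis. The toy data and the names `context`, `extent`, `intent`, `concepts`, and `more_general` are all hypothetical.

```python
from itertools import combinations

# Toy context (hypothetical data): object -> set of attributes it has.
context = {
    "duck":  {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
    "dog":   set(),
}
attributes = set().union(*context.values())


def extent(description):
    """Objects that have every attribute in the description."""
    return frozenset(o for o, attrs in context.items() if description <= attrs)


def intent(objects):
    """Attributes shared by all given objects (all attributes if none given)."""
    shared = set(attributes)
    for o in objects:
        shared &= context[o]
    return frozenset(shared)


def concepts():
    """Enumerate all (extent, intent) pairs by closing every attribute subset.

    This brute-force loop is exponential in the number of attributes,
    so it is only suitable for toy data.
    """
    found = set()
    attrs = sorted(attributes)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            ext = extent(set(subset))
            found.add((ext, intent(ext)))
    return found


def more_general(c1, c2):
    """c1 is at least as general as c2 if it covers every object c2 covers."""
    return c2[0] <= c1[0]


if __name__ == "__main__":
    for ext, itt in sorted(concepts(), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
```

On this toy context the enumeration yields five concepts, from the most general one (all four objects, empty description) down to the most specific one (only `duck`, with all three attributes). The brute-force enumeration is chosen for readability only; its exponential cost is the kind of scalability concern that motivates the complexity analysis mentioned in the abstract.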

