Distance Geometry and Data Science
This incremental survey addresses the problem of converting graph data into vector form for data science applications. It is aimed at researchers and practitioners in machine learning and data analysis.
The paper surveys the problem of mapping graphs to vectors and its relation to mathematical programming, and shows that, when used to feed neural networks, distance geometry techniques can perform competitively with more traditional graph-to-vector mappings.
Data are often represented as graphs. Many common tasks in data science are based on distances between entities. While some data science methodologies natively take graphs as their input, there are many more that take their input in vectorial form. In this survey we discuss the fundamental problem of mapping graphs to vectors, and its relation with mathematical programming. We discuss applications, solution methods, dimensional reduction techniques and some of their limits. We then present an application of some of these ideas to neural networks, showing that distance geometry techniques can give competitive performance with respect to more traditional graph-to-vector mappings.
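To make the graph-to-vector mapping concrete, here is a minimal sketch of one standard approach: compute shortest-path distances on a weighted graph, then apply classical multidimensional scaling (MDS) to obtain one vector per vertex. This is an illustration under stated assumptions, not necessarily the method used in the survey. The function names `shortest_path_matrix` and `classical_mds` are hypothetical, the graph is assumed connected and undirected, and the example graph (a path on four vertices with unit edge weights) is chosen because its distances can be realized exactly on a line.

```python
import numpy as np

def shortest_path_matrix(n, edges):
    """All-pairs shortest-path lengths (Floyd-Warshall).

    Assumes an undirected, connected graph given as (u, v, weight) triples.
    """
    D = np.full((n, n), np.inf)
    np.fill_diagonal(D, 0.0)
    for u, v, w in edges:
        D[u, v] = D[v, u] = min(D[u, v], w)
    for k in range(n):
        # Relax every pair (i, j) through intermediate vertex k.
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def classical_mds(D, dim):
    """Embed n points in R^dim from an n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of centered points
    evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:dim]      # indices of the dim largest
    lam = np.clip(evals[idx], 0.0, None)     # drop negative eigenvalues
    return evecs[:, idx] * np.sqrt(lam)

# Illustrative input: the path graph 0 - 1 - 2 - 3 with unit edge weights.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
D = shortest_path_matrix(4, edges)
X = classical_mds(D, dim=1)                  # one row (vector) per vertex

# Euclidean distances between the embedded vectors.
pairwise = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(np.allclose(pairwise, D))
```

For this path graph the embedding reproduces the graph distances exactly. For general graphs, shortest-path distances need not be Euclidean, so the embedding is only an approximation, and the dimension `dim` trades accuracy against vector length.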