A Guide to Similarity Measures
This work serves as an educational resource for data science practitioners, but its contribution is incremental: it compiles existing measures without introducing new methods.
The paper provides a comprehensive guide to prevalent similarity measures for data science applications. It aims to help non-experts understand and use the measures, and to help experts design better ones for specific tasks.
Similarity measures play a central role in many data science application domains and across a wide assortment of tasks. This guide describes a comprehensive set of prevalent similarity measures to serve both non-experts and professionals. Non-experts who wish to understand the motivation for a measure, as well as how to use it, may find a friendly and detailed exposition of the measures' formulas, whereas experts may find a glimpse of the principles of designing similarity measures and ideas for better ways to measure similarity for their desired task in a given application domain.