IR · Jul 27, 2020

Measuring similarity in co-occurrence data using ego-networks

arXiv:2007.13273v1 · 2 citations

Originality: Incremental advance

AI Analysis

This work provides an incremental improvement in network-based similarity measures for applications like social networks, ecosystems, and brain networks.

The authors tackle the problem of measuring similarity in co-occurrence data by proposing a new ego-network-based index that avoids unwanted indirect relationships. They find that it outperforms traditional network-based measures and sometimes surpasses embedding methods.

Co-occurrence associations are widely observed in empirical data. Mining the information in co-occurrence data is essential for advancing our understanding of systems such as social networks, ecosystems, and brain networks. Measuring the similarity of entities is one of the important tasks, and it is usually achieved with a network-based approach. Here we show that traditional methods based on the aggregated network can introduce unwanted indirect relationships. To cope with this issue, we propose a similarity measure based on the ego network of each entity, which effectively considers the change of an entity's centrality from one ego network to another. The proposed index is easy to calculate and has a clear physical meaning. Using two different data sets, we compare the new index with existing ones. We find that the new index outperforms the traditional network-based similarity measures and can sometimes surpass the embedding method. Meanwhile, the measure given by the new index is only weakly correlated with those given by other methods, hence providing a different dimension for quantifying similarities in co-occurrence data. Altogether, our work extends network-based similarity measures and can potentially be applied in several related tasks.
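The indirect-relationship problem mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' index. It assumes that an entity's ego network is the network aggregated from only those records that contain the entity, and it uses a plain common-neighbors count as a stand-in for a traditional network-based measure. The function names and the toy records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def aggregate(records):
    """Build an undirected weighted network from co-occurrence records.

    The weight of an edge is the number of records in which the two
    entities appear together.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for record in records:
        for u, v in combinations(sorted(set(record)), 2):
            graph[u][v] += 1
            graph[v][u] += 1
    return graph


def ego_network(records, ego):
    """Aggregate only the records containing `ego` (assumed construction)."""
    return aggregate([r for r in records if ego in r])


def common_neighbors(graph, u, v):
    """Entities linked to both u and v in the given network."""
    return set(graph.get(u, {})) & set(graph.get(v, {}))


# Toy data: B and C never appear in the same record.
records = [("A", "B"), ("A", "C"), ("B", "D")]

full = aggregate(records)
ego_b = ego_network(records, "B")

# In the aggregated network, B and C look related through A.
print(common_neighbors(full, "B", "C"))

# In B's ego network, C does not appear at all.
print("C" in ego_b)
```

In this toy data the aggregated network links B and C through their shared neighbor A even though they never co-occur, while B's ego network contains only the entities B actually appeared with.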
