Using tf-idf as an edge weighting scheme in user-object bipartite networks
This work addresses a specific challenge in web and e-commerce data analysis, offering an incremental improvement for network projection methods.
The paper tackled the problem of edge inflation in projected user networks from bipartite user-object networks by proposing a tf-idf-based edge weighting scheme, which improved community structure density and quality across five real-world datasets.
Bipartite user-object networks are becoming increasingly popular for representing user interaction data in web and e-commerce environments. They have certain characteristics and challenges that differentiate them from other bipartite networks. This paper analyzes the properties of five real-world user-object networks. In all cases we found a heavy-tailed object degree distribution, with popular objects connecting a large part of the users and causing significant edge inflation in the projected user network. We propose a novel edge weighting strategy based on tf-idf and show that the new scheme improves both the density and the quality of the community structure in the projections. The improvement is also observed in comparison with partially random networks.
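The abstract does not give the exact weighting formula, so the following is only a minimal sketch of how a tf-idf-style weight could be used in a user projection. It assumes that tf is the share of a user's interactions that go to an object, that idf is the log of the number of users divided by the object's degree, and that a projected edge weight is the sum over shared objects of the product of the two users' tf-idf values. The function name `tfidf_projection` and the sample data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def tfidf_projection(interactions):
    """Project (user, object) pairs onto a weighted user-user network.

    Assumed scheme (not taken from the paper):
      tf(u, o)  = count(u, o) / total interactions of u
      idf(o)    = log(number of users / number of users linked to o)
      w(u, v)   = sum over shared objects o of tfidf(u, o) * tfidf(v, o)
    """
    counts = defaultdict(lambda: defaultdict(int))  # user -> object -> count
    for user, obj in interactions:
        counts[user][obj] += 1

    obj_users = defaultdict(set)  # object -> set of users linked to it
    for user, objs in counts.items():
        for obj in objs:
            obj_users[obj].add(user)

    n_users = len(counts)
    tfidf = {}
    for user, objs in counts.items():
        total = sum(objs.values())
        for obj, c in objs.items():
            idf = math.log(n_users / len(obj_users[obj]))
            tfidf[(user, obj)] = (c / total) * idf

    projected = defaultdict(float)
    for obj, users in obj_users.items():
        for u, v in combinations(sorted(users), 2):
            projected[(u, v)] += tfidf[(u, obj)] * tfidf[(v, obj)]

    # Drop edges that only exist through objects with zero idf
    # (objects linked to every user carry no information here).
    return {pair: w for pair, w in projected.items() if w > 0}


# Hypothetical toy data: one object everyone uses, one niche object.
sample = [
    ("alice", "popular"), ("bob", "popular"), ("carol", "popular"),
    ("alice", "niche"), ("bob", "niche"),
]
edges = tfidf_projection(sample)
print(edges)
```

In this toy case an unweighted projection would connect all three users through the popular object, whereas the weighted one keeps only the alice-bob edge backed by the niche object. This is the kind of edge inflation the weighting is meant to suppress, although the actual scheme in the paper may differ.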