Analyzing Tag Distributions in Folksonomies for Resource Classification
This work provides insights for researchers and developers working on tag-based classification and social tagging systems. It is incremental, however, in that it builds on existing methods to analyze known effects.
The authors investigated how different settings in social tagging systems affect tag distributions in folksonomies and their impact on resource classification, finding that tag suggestions significantly alter distributions and influence the effectiveness of TF-IDF weighting schemes.
Recent research has shown that social tags are a useful data source for resource classification. However, little is known about how the settings of social tagging systems affect the folksonomies created on them. In this work, we consider these settings to better understand tag distributions in folksonomies. We analyze in depth the tag distributions of three large-scale social tagging datasets and examine their effect on a resource classification task. To this end, we study the appropriateness of applying weighting schemes based on the well-known TF-IDF to resource classification. We show that settings play a major role in altering tag distributions. Among those settings, tag suggestions produce very different folksonomies, which in turn condition the success of the employed weighting schemes. Our findings and analyses are relevant for researchers studying tag-based resource classification, user behavior in social networks, and the structure of folksonomies and tag distributions, as well as for developers of social tagging systems in search of an appropriate setting.
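To make the idea of a TF-IDF-based weighting scheme over tags concrete, the following is a minimal sketch, not the scheme evaluated in this work. It assumes that each resource is represented by the number of users who assigned each tag, and that the inverse frequency is computed over resources. The function name `tfidf_weights` and the toy data are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def tfidf_weights(resource_tags):
    """Weight each tag of each resource with a TF-IDF-style score.

    resource_tags maps a resource id to a Counter of tag -> number of
    users who assigned that tag (an assumed representation).
    """
    n_resources = len(resource_tags)

    # Resource frequency: the number of resources annotated with each tag.
    resource_freq = Counter()
    for tags in resource_tags.values():
        resource_freq.update(tags.keys())

    # TF is the raw tag count on the resource; IDF is log(N / resource frequency).
    return {
        resource: {
            tag: count * math.log(n_resources / resource_freq[tag])
            for tag, count in tags.items()
        }
        for resource, tags in resource_tags.items()
    }


# Hypothetical toy folksonomy with three resources.
toy = {
    "r1": Counter({"python": 5, "programming": 3}),
    "r2": Counter({"recipes": 4, "programming": 1}),
    "r3": Counter({"python": 2, "tutorial": 2}),
}
weights = tfidf_weights(toy)
```

In this sketch, a tag that appears on few resources (such as "recipes") receives a larger inverse-frequency factor than one that is spread across many. If tag suggestions push users toward the same popular tags, more tags become widespread and the inverse-frequency factor discriminates less, which is one way a system setting could condition the usefulness of such weights.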