SI LG · Oct 3, 2012

Learning from Collective Intelligence in Groups

Guo-Jun Qi, Charu Aggarwal, Pierre Moulin, Thomas Huang

arXiv:1210.0954v1 · 1 citation

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving data quality in collective intelligence systems, which is crucial for applications relying on aggregated crowd information, though it appears incremental as it builds on existing methods for source reliability.

The paper tackles the problem of unreliable information sources degrading collective intelligence by proposing a probabilistic model that assesses source reliability and identifies true data, revealing latent group structures among dependent sources to prevent overuse of redundant information. Experimental results on real-world datasets demonstrate the approach's effectiveness compared to existing algorithms.

Collective intelligence, which aggregates the shared information from large crowds, is often negatively impacted by unreliable information sources that supply low-quality data. This becomes a barrier to the effective use of collective intelligence in a variety of applications. To address this issue, we propose a probabilistic model to jointly assess the reliability of sources and find the true data. We observe that different sources are often not independent of each other. Instead, sources are prone to mutual influence, which makes them dependent when they share information with each other. High dependency between sources makes collective intelligence vulnerable to the overuse of redundant (and possibly incorrect) information from the dependent sources. Thus, we reveal the latent group structure among dependent sources and aggregate the information at the group level rather than from individual sources directly. This prevents the collective intelligence from being inappropriately dominated by dependent sources. We also explicitly reveal the reliability of groups and minimize the negative impact of unreliable groups. Experimental results on real-world data sets show the effectiveness of the proposed approach with respect to existing algorithms.
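The effect of group-level aggregation can be illustrated with a toy sketch. This is not the paper's probabilistic model: the groups here are given by hand rather than inferred as latent structure, and plain (optionally weighted) majority voting stands in for the joint reliability estimate. All names (`aggregate_by_source`, `aggregate_by_group`, `group_of`, `group_weight`) and the sample claims are made up for illustration.

```python
from collections import Counter, defaultdict


def majority(values):
    """Return the most common value in a list."""
    return Counter(values).most_common(1)[0][0]


def aggregate_by_source(claims):
    """Baseline: every source casts one vote per object."""
    by_object = defaultdict(list)
    for (_source, obj), value in claims.items():
        by_object[obj].append(value)
    return {obj: majority(vals) for obj, vals in by_object.items()}


def aggregate_by_group(claims, group_of, group_weight=None):
    """Each group casts one vote per object (its internal majority).

    group_weight is a hypothetical stand-in for group reliability;
    when omitted, every group counts equally.
    """
    by_group_object = defaultdict(list)
    for (source, obj), value in claims.items():
        by_group_object[(group_of[source], obj)].append(value)

    scores = defaultdict(Counter)
    for (group, obj), vals in by_group_object.items():
        weight = 1.0 if group_weight is None else group_weight[group]
        scores[obj][majority(vals)] += weight
    return {obj: tally.most_common(1)[0][0] for obj, tally in scores.items()}


# Toy data (invented): a1, a2, a3 copy each other; b and c are independent.
claims = {
    ("a1", "capital"): "Sydney",
    ("a2", "capital"): "Sydney",
    ("a3", "capital"): "Sydney",
    ("b", "capital"): "Canberra",
    ("c", "capital"): "Canberra",
}
group_of = {"a1": "A", "a2": "A", "a3": "A", "b": "B", "c": "C"}

print(aggregate_by_source(claims))           # dependent sources dominate
print(aggregate_by_group(claims, group_of))  # one vote per group
```

With source-level voting, the three mutually dependent sources outvote the two independent ones. Collapsing them into a single group vote removes that redundancy, which is the intuition behind aggregating at the group level.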
