SI · CY · IR · May 5, 2015

Scraping and Clustering Techniques for the Characterization of Linkedin Profiles

arXiv:1505.00989v1 · 20 citations
Originality: Synthesis-oriented
AI Analysis

This work provides data-driven insights into professional social networks for researchers and analysts, but it is incremental as it applies existing methods to a new dataset.

The authors scraped around 5 million public LinkedIn profiles and applied NLP techniques to classify educational backgrounds and cluster professional backgrounds, revealing insights into user demographics and relationships between education and careers.

The socialization of the web has taken on a new dimension since the emergence of the Online Social Network (OSN) concept. Because every Internet user becomes a potential content creator, a large amount of data must be managed. This paper explores the most popular professional OSN: LinkedIn. A scraping technique was implemented to collect around 5 million public profiles. Applying natural language processing (NLP) techniques to classify the educational background and to cluster the professional background of the collected profiles allowed us to provide some insights about this OSN's users and to evaluate the relationships between educational degrees and professional careers.
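As a rough illustration of what classifying free-text educational backgrounds can look like, the following is a minimal sketch using keyword rules. It is not the authors' method, which the abstract does not detail: the level names, the keyword sets, and the helpers `tokenize`, `classify_degree`, and `summarize` are all hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical keyword sets, checked from the highest level down.
# These are illustrative assumptions, not the categories used in the paper.
DEGREE_KEYWORDS = [
    ("doctorate", {"phd", "doctorate", "doctor", "dphil"}),
    ("master", {"msc", "ms", "ma", "mba", "master", "masters", "meng"}),
    ("bachelor", {"bsc", "bs", "ba", "bachelor", "bachelors", "beng"}),
]


def tokenize(text):
    """Lowercase the text, drop dots so 'Ph.D.' becomes 'phd', and split into letter runs."""
    return re.findall(r"[a-z]+", text.lower().replace(".", ""))


def classify_degree(text):
    """Return the first degree level whose keywords appear in the text, else 'other'."""
    tokens = set(tokenize(text))
    for level, keywords in DEGREE_KEYWORDS:
        if tokens & keywords:
            return level
    return "other"


def summarize(degree_strings):
    """Count how many degree strings fall into each level."""
    return Counter(classify_degree(s) for s in degree_strings)


# Made-up sample strings, standing in for the education field of scraped profiles.
SAMPLE_DEGREES = [
    "Ph.D. in Physics",
    "M.Sc. Computer Science",
    "Bachelor of Arts",
    "Master's degree, Marketing",
    "High School Diploma",
]

if __name__ == "__main__":
    for level, count in summarize(SAMPLE_DEGREES).items():
        print(level, count)
```

Keyword rules like these are easy to inspect but brittle. At the scale of millions of profiles, a learned text classifier would likely handle the variety of real degree strings better.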

