SI CL CY IR · Oct 27, 2016

Word Embeddings to Enhance Twitter Gang Member Profile Identification

Sanjaya Wijeratne, Lakshika Balasuriya, Derek Doran, Amit Sheth

arXiv:1610.08597v16.623 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge for law enforcement agencies in detecting gang-related activity on social media, but it is incremental as it builds on previous work.

The paper tackles the problem of identifying gang members on Twitter by using word embeddings to represent user content, and finds that pre-trained word embeddings improve the accuracy of supervised learning algorithms for this task.

Gang affiliates have joined the masses who use social media to share thoughts and actions publicly. Interestingly, they use this public medium to express recent illegal actions, to intimidate others, and to share outrageous images and statements. Agencies able to unearth these profiles may thus be able to anticipate, stop, or hasten the investigation of gang-related crimes. This paper investigates the use of word embeddings to help identify gang members on Twitter. Building on our previous work, we generate word embeddings that translate what Twitter users post in their profile descriptions, tweets, profile images, and linked YouTube content to a real vector format amenable to machine learning classification. Our experimental results show that pre-trained word embeddings can boost the accuracy of supervised learning algorithms trained over gang members' social media posts.
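
As a rough illustration of what turning posted text into "a real vector format" can look like, the sketch below averages per-word vectors into one fixed-length feature vector. It is a minimal example under stated assumptions, not the authors' pipeline: the three-dimensional `TOY_EMBEDDINGS` table, the whitespace tokenizer, and the function names `embed_tokens` and `profile_vector` are all hypothetical, and a real system would load embeddings trained on a large corpus.

```python
from typing import Dict, List, Sequence

# Toy 3-dimensional embeddings, invented for illustration only.
# A real pipeline would load vectors trained on a large corpus.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "music": [1.0, 0.0, 0.0],
    "video": [0.0, 1.0, 0.0],
    "street": [0.0, 0.0, 1.0],
}


def embed_tokens(
    tokens: Sequence[str], embeddings: Dict[str, List[float]]
) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    dim = len(next(iter(embeddings.values())))
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def profile_vector(
    text: str, embeddings: Dict[str, List[float]] = TOY_EMBEDDINGS
) -> List[float]:
    """Lowercase, split on whitespace, and average the token vectors."""
    return embed_tokens(text.lower().split(), embeddings)
```

A vector produced this way, for example `profile_vector("Music video")`, has the same length for every profile, which is what allows it to be passed to a standard supervised classifier as a feature row.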
