Patrick Meier

CY, Feb 26, 2016

Enabling Digital Health by Automatic Classification of Short Messages

Muhammad Imran, Patrick Meier, Carlos Castillo et al.

In response to growing HIV/AIDS and other health-related issues, UNICEF, through its U-Report platform, receives thousands of messages (SMS) every day to provide prevention strategies, health case advice, and counseling support to vulnerable populations. Due to a rapid increase in U-Report usage (up to 300% in the last 3 years), plus approximately 1,000 new registrations each day, the volume of messages has continued to increase, making it impossible for the team at UNICEF to process them in a timely manner. In this paper, we present a platform designed to perform automatic classification of short messages (SMS) in real time to help UNICEF categorize and prioritize health-related messages as they arrive. We employ a hybrid approach that combines human and machine intelligence and seeks to resolve the information overload issue by processing large-scale data at high speed while maintaining high classification accuracy. The system has recently been tested in conjunction with UNICEF in Zambia to classify short messages received via the U-Report platform on various health-related issues. The system is designed to enable UNICEF to make sense of a large volume of short messages in a timely manner. In terms of evaluation, we report design choices, challenges, and the performance of the system observed during the deployment to validate its effectiveness.

CR, May 21, 2014

TweetCred: Real-Time Credibility Assessment of Content on Twitter

Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo et al.

During sudden-onset crisis events, the presence of spam, rumors and fake content on Twitter reduces the value of the information contained in its messages (or "tweets"). A possible solution to this problem is to use machine learning to automatically evaluate the credibility of a tweet, i.e., whether a person would deem the tweet believable or trustworthy. This has often been framed and studied as a supervised classification problem in an off-line (post-hoc) setting. In this paper, we present a semi-supervised ranking model for scoring tweets according to their credibility. This model is used in TweetCred, a real-time system that assigns a credibility score to tweets in a user's timeline. TweetCred, available as a browser plug-in, was installed and used by 1,127 Twitter users within a span of three months. During this period, the credibility score for about 5.4 million tweets was computed, allowing us to evaluate TweetCred in terms of response time, effectiveness and usability. To the best of our knowledge, this is the first research work to develop a real-time system for credibility on Twitter, and to evaluate it on a user base of this size.

2 Papers