HC Feb 3, 2022
Privacy-Aware Crowd Labelling for Machine Learning Tasks
Giannis Haralabopoulos, Ioannis Anagnostopoulos
The extensive use of online social media has highlighted the importance of privacy in the digital space. As more scientists analyse the data created on these platforms, privacy concerns have extended to data usage within academia. Although text analysis is a well-documented topic in academic literature with a multitude of applications, ensuring the privacy of user-generated content has been overlooked. Most sentiment analysis methods require emotion labels, which can be obtained through crowdsourcing, where non-expert individuals contribute to scientific tasks. The text itself has to be exposed to third parties in order to be labelled. In an effort to reduce the exposure of online users' information, we propose a privacy-preserving text labelling method for varying applications, based on crowdsourcing. We transform text with different levels of privacy, and analyse the effectiveness of the transformation with regard to label correlation and consistency. Our results suggest that privacy can be implemented in labelling while retaining the annotational diversity and subjectivity of traditional labelling.
IR Jan 30, 2018
Modeling Influence with Semantics in Social Networks: a Survey
Gerasimos Razis, Ioannis Anagnostopoulos, Sherali Zeadally
The discovery of influential entities in all kinds of networks (e.g. social, digital, or computer) has always been an important field of study. In recent years, Online Social Networks (OSNs) have been established as a basic means of communication, and influencers and opinion makers often promote politics, events, brands or products through viral content. In this work, we present a systematic review of i) online social influence metrics, properties, and applications and ii) the role of semantics in modeling OSN information. We conclude that both areas can jointly provide useful insights for the qualitative assessment of viral user-generated content, as well as for modeling the dynamic properties of influential content and its flow.
IR Sep 18, 2014
Exploratory Analysis of a Terabyte Scale Web Corpus
Vasilis Kolias, Ioannis Anagnostopoulos, Eleftherios Kayafas
In this paper, we present a preliminary analysis of the largest publicly accessible web dataset: the Common Crawl Corpus. We measure nine web characteristics at two levels of granularity using MapReduce, and we comment on the initial observations over a fraction of the corpus. To the best of our knowledge, two of the characteristics, the language distribution and the HTML version of pages, have not been analyzed in previous work, while this specific dataset has previously been analyzed only at page level.
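As a rough illustration of the MapReduce-style measurement mentioned above, the following minimal sketch computes a language distribution over a handful of page records. It is not the authors' pipeline: the record layout, the `lang` field, and the function names `map_page`, `reduce_counts`, and `language_distribution` are all assumptions made for this example, and the sample data is invented.

```python
from collections import Counter
from functools import reduce


def map_page(page):
    """Map step: emit a (language, 1) pair for one page record.

    The 'lang' key is a hypothetical field; pages without it count as 'unknown'.
    """
    return (page.get("lang", "unknown"), 1)


def reduce_counts(acc, pair):
    """Reduce step: fold one (language, count) pair into the running totals."""
    key, n = pair
    acc[key] += n
    return acc


def language_distribution(pages):
    """Return the share of pages per language as a dict of fractions."""
    pairs = map(map_page, pages)
    counts = reduce(reduce_counts, pairs, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: n / total for lang, n in counts.items()}


# Invented sample records, standing in for parsed crawl pages.
sample_pages = [
    {"url": "http://example.com/a", "lang": "en"},
    {"url": "http://example.com/b", "lang": "en"},
    {"url": "http://example.com/c", "lang": "el"},
    {"url": "http://example.com/d"},
]

dist = language_distribution(sample_pages)
```

On a real corpus the map and reduce steps would run in a distributed framework rather than in a single process; the sketch only shows the shape of the computation.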
SI Sep 12, 2014
Semantifying Twitter: the influenceTracker ontology
Gerasimos Razis, Ioannis Anagnostopoulos
In this paper, we propose an ontology schema for the semantification of Twitter social analytics. The ontology is deployed over a publicly available service that measures how influential a Twitter account is by combining its social activity and interaction across the Twittersphere. Apart from measures of influence quantity and quality, the service provides a SPARQL endpoint where users can perform advanced semantic queries over the semantic graph of RDFized Twitter entities (mentions, replies, hashtags, photos, URLs).
SI Mar 6, 2014
Lifespan and propagation of information in On-line Social Networks: a Case Study
Giannis Haralabopoulos, Ioannis Anagnostopoulos
Since 1950, information flows have been at the centre of scientific research. Until internet penetration grew in the late 90s, these studies were based on traditional offline social networks. Several observations from offline information flow studies, such as the two-step flow of communication and the importance of weak ties, were verified in several online studies, showing that diffused information flows from one Online Social Network (OSN) to several others. Within that flow, information is shared to and reproduced by the users of each network. Furthermore, the original content is enhanced or weakened according to its topic and the dynamics and exposure of each OSN. Under this view, each OSN is considered a layer of information flow that interacts with the others. In this paper, we examine such flows in several social networks, as well as their diffusion and lifespan across multiple OSNs, in terms of user-generated content. Our results support the view that content and information are connected across various OSNs.