Ghazaleh Beigi

h-index17

9papers

315citations

Novelty45%

AI Score24

Ranked #170,996 of 194,257 authors (top 88%)#5,044 in CR (top 74%)

9 Papers

1.2SIJan 4, 2020

Social Science Guided Feature Engineering: A Novel Approach to Signed Link Analysis

Ghazaleh Beigi, Jiliang Tang, Huan Liu

Many real-world relations can be represented by signed networks with positive links (e.g., friendships and trust) and negative links (e.g., foes and distrust). Link prediction helps advance tasks in social network analysis such as recommendation systems. Most existing work on link analysis focuses on unsigned social networks. The existence of negative links piques research interests in investigating whether properties and principles of signed networks differ from those of unsigned networks, and mandates dedicated efforts on link analysis for signed social networks. Recent findings suggest that properties of signed networks substantially differ from those of unsigned networks and negative links can be of significant help in signed link analysis in complementary ways. In this article, we center our discussion on a challenging problem of signed link analysis. Signed link analysis faces the problem of data sparsity, i.e. only a small percentage of signed links are given. This problem can even get worse when negative links are much sparser than positive ones as users are inclined more towards positive disposition rather than negative. We investigate how we can take advantage of other sources of information for signed link analysis. This research is mainly guided by three social science theories, Emotional Information, Diffusion of Innovations, and Individual Personality. Guided by these, we extract three categories of related features and leverage them for signed link analysis. Experiments show the significance of the features gleaned from social theories for signed link prediction and addressing the data sparsity challenge.

17.6SINov 22, 2019

Privacy-Aware Recommendation with Private-Attribute Protection using Adversarial Learning

Ghazaleh Beigi, Ahmadreza Mosallanezhad, Ruocheng Guo et al.

Recommendation is one of the critical applications that helps users find information relevant to their interests. However, a malicious attacker can infer users' private information via recommendations. Prior work obfuscates user-item data before sharing it with recommendation system. This approach does not explicitly address the quality of recommendation while performing data obfuscation. Moreover, it cannot protect users against private-attribute inference attacks based on recommendations. This work is the first attempt to build a Recommendation with Attribute Protection (RAP) model which simultaneously recommends relevant items and counters private-attribute inference attacks. The key idea of our approach is to formulate this problem as an adversarial learning problem with two main components: the private attribute inference attacker, and the Bayesian personalized recommender. The attacker seeks to infer users' private-attribute information according to their items list and recommendations. The recommender aims to extract users' interests while employing the attacker to regularize the recommendation process. Experiments show that the proposed model both preserves the quality of recommendation service and protects users against private-attribute inference attacks.

20.5CRJul 6, 2019

I Am Not What I Write: Privacy Preserving Text Representation Learning

Ghazaleh Beigi, Kai Shu, Ruocheng Guo et al.

Online users generate tremendous amounts of textual information by participating in different activities, such as writing reviews and sharing tweets. This textual data provides opportunities for researchers and business partners to study and understand individuals. However, this user-generated textual data not only can reveal the identity of the user but also may contain individual's private information (e.g., age, location, gender). Hence, "you are what you write" as the saying goes. Publishing the textual data thus compromises the privacy of individuals who provided it. The need arises for data publishers to protect people's privacy by anonymizing the data before publishing it. It is challenging to design effective anonymization techniques for textual information which minimizes the chances of re-identification and does not contain users' sensitive information (high privacy) while retaining the semantic meaning of the data for given tasks (high utility). In this paper, we study this problem and propose a novel double privacy preserving text representation learning framework, DPText, which learns a textual representation that (1) is differentially private, (2) does not contain private information and (3) retains high utility for the given task. Evaluating on two natural language processing tasks, i.e., sentiment analysis and part of speech tagging, we show the effectiveness of this approach in terms of preserving both privacy and utility.

5.1SIMar 6, 2019

Signed Link Prediction with Sparse Data: The Role of Personality Information

Ghazaleh Beigi, Suhas Ranganath, Huan Liu

Predicting signed links in social networks often faces the problem of signed link data sparsity, i.e., only a small percentage of signed links are given. The problem is exacerbated when the number of negative links is much smaller than that of positive links. Boosting signed link prediction necessitates additional information to compensate for data sparsity. According to psychology theories, one rich source of such information is user's personality such as optimism and pessimism that can help determine her propensity in establishing positive and negative links. In this study, we investigate how personality information can be obtained, and if personality information can help alleviate the data sparsity problem for signed link prediction. We propose a novel signed link prediction model that enables empirical exploration of user personality via social media data. We evaluate our proposed model on two datasets of real-world signed link networks. The results demonstrate the complementary role of personality information in the signed link prediction problem. Experimental results also indicate the effectiveness of different levels of personality information for signed link data sparsity problem.

11.6CRNov 23, 2018

Protecting User Privacy: An Approach for Untraceable Web Browsing History and Unambiguous User Profiles

Ghazaleh Beigi, Ruocheng Guo, Alexander Nou et al.

The overturning of the Internet Privacy Rules by the Federal Communications Commissions (FCC) in late March 2017 allows Internet Service Providers (ISPs) to collect, share and sell their customers' Web browsing data without their consent. With third-party trackers embedded on Web pages, this new rule has put user privacy under more risk. The need arises for users on their own to protect their Web browsing history from any potential adversaries. Although some available solutions such as Tor, VPN, and HTTPS can help users conceal their online activities, their use can also significantly hamper personalized online services, i.e., degraded utility. In this paper, we design an effective Web browsing history anonymization scheme, PBooster, aiming to protect users' privacy while retaining the utility of their Web browsing history. The proposed model pollutes users' Web browsing history by automatically inferring how many and what links should be added to the history while addressing the utility-privacy trade-off challenge. We conduct experiments to validate the quality of the manipulated Web browsing history and examine the robustness of the proposed approach for user privacy protection.

23.6CRAug 7, 2018

Privacy in Social Media: Identification, Mitigation and Applications

Ghazaleh Beigi, Huan Liu

The increasing popularity of social media has attracted a huge number of people to participate in numerous activities on a daily basis. This results in tremendous amounts of rich user-generated data. This data provides opportunities for researchers and service providers to study and better understand users' behaviors and further improve the quality of the personalized services. Publishing user-generated data risks exposing individuals' privacy. Users privacy in social media is an emerging task and has attracted increasing attention in recent years. These works study privacy issues in social media from the two different points of views: identification of vulnerabilities, and mitigation of privacy risks. Recent research has shown the vulnerability of user-generated data against the two general types of attacks, identity disclosure and attribute disclosure. These privacy issues mandate social media data publishers to protect users' privacy by sanitizing user-generated data before publishing it. Consequently, various protection techniques have been proposed to anonymize user-generated social media data. There is a vast literature on privacy of users in social media from many perspectives. In this survey, we review the key achievements of user privacy in social media. In particular, we review and compare the state-of-the-art algorithms in terms of the privacy leakage attacks and anonymization algorithms. We overview the privacy risks from different aspects of social media and categorize the relevant works into five groups 1) graph data anonymization and de-anonymization, 2) author identification, 3) profile attribute disclosure, 4) user location and privacy, and 5) recommender systems and privacy issues. We also discuss open problems and future research directions for user privacy issues in social media.

5.8CRJun 26, 2018

Social Media and User Privacy

Ghazaleh Beigi

Online users generate tremendous amounts of data. To better serve users, it is required to share the user-related data among researchers, advertisers and application developers. Publishing such data would raise more concerns on user privacy. To encourage data sharing and mitigate user privacy concerns, a number of anonymization and de-anonymization algorithms have been developed to help protect privacy of users. This paper reviews my doctoral research on online users privacy specifically in social media. In particular, I propose a new adversarial attack specialized for social media data. I further provide a principled way to assess effectiveness of anonymizing different aspects of social media data. My work sheds light on new privacy risks in social media data due to innate heterogeneity of user-generated data.

11.6CRMay 1, 2018

Securing Social Media User Data - An Adversarial Approach

Ghazaleh Beigi, Kai Shu, Yanchao Zhang et al.

Social media users generate tremendous amounts of data. To better serve users, it is required to share the user-related data among researchers, advertisers and application developers. Publishing such data would raise more concerns on user privacy. To encourage data sharing and mitigate user privacy concerns, a number of anonymization and de-anonymization algorithms have been developed to help protect privacy of social media users. In this work, we propose a new adversarial attack specialized for social media data. We further provide a principled way to assess effectiveness of anonymizing different aspects of social media data. Our work sheds light on new privacy risks in social media data due to innate heterogeneity of user-generated data which require striking balance between sharing user data and protecting user privacy.

9.7SIMar 12, 2018

Ghazaleh Beigi, Huan Liu

The pervasive use of social media provides massive data about individuals' online social activities and their social relations. The building block of most existing recommendation systems is the similarity between users with social relations, i.e., friends. While friendship ensures some homophily, the similarity of a user with her friends can vary as the number of friends increases. Research from sociology suggests that friends are more similar than strangers, but friends can have different interests. Exogenous information such as comments and ratings may help discern different degrees of agreement (i.e., congruity) among similar users. In this paper, we investigate if users' congruity can be incorporated into recommendation systems to improve it's performance. Experimental results demonstrate the effectiveness of embedding congruity related information into recommendation systems.