Zoleikha Jahanbakhsh-Nagadeh

CL
4 papers
60 citations
Novelty: 36%
AI Score: 20

4 Papers

CL · Jun 9, 2021
Phraseformer: Multimodal Key-phrase Extraction using Transformer and Graph Embedding

Narjes Nikzad-Khasmakhi, Mohammad-Reza Feizi-Derakhshi, Meysam Asgari-Chenaghlu et al.

Background: Keyword extraction is a popular research topic in natural language processing. Keywords are terms that describe the most relevant information in a document. The main problem researchers face is how to extract the core keywords from a document efficiently and accurately. Although previous keyword extraction approaches have used text and graph features, there is a lack of models that can properly learn and combine these features. Methods: In this paper, we develop a multimodal key-phrase extraction approach, namely Phraseformer, using transformer and graph embedding techniques. In Phraseformer, each keyword candidate is represented by a vector that is the concatenation of its text and structure learning representations. Phraseformer takes advantage of recent research such as BERT and ExEm to preserve both representations. Phraseformer also treats key-phrase extraction as a sequence labeling problem solved as a classification task. Results: We analyze the performance of Phraseformer on three datasets, Inspec, SemEval 2010, and SemEval 2017, using F1-score. We also investigate the performance of different classifiers with Phraseformer on the Inspec dataset. Experimental results demonstrate the effectiveness of Phraseformer on all three datasets. Additionally, the Random Forest classifier achieves the highest F1-score among all classifiers. Conclusions: Because the combination of BERT and ExEm is more meaningful and better represents the semantics of words, Phraseformer significantly outperforms single-modality methods.
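The two mechanical steps in this abstract, concatenating the text and structure vectors and reading key-phrases off per-token labels, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy vectors, and the B/I/O tagging scheme are all assumptions, and the BERT and ExEm encoders and the classifier are left out entirely.

```python
from typing import List, Sequence


def fuse(text_vec: Sequence[float], graph_vec: Sequence[float]) -> List[float]:
    """Concatenate the text and structure representations of one candidate."""
    return list(text_vec) + list(graph_vec)


def decode_phrases(tokens: Sequence[str], tags: Sequence[str]) -> List[str]:
    """Turn per-token B/I/O tags (an assumed scheme) into key-phrases."""
    phrases: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new phrase starts; close any phrase that was still open.
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the phrase that is currently open.
            current.append(token)
        else:
            # "O", or an "I" with no open phrase: close and reset.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hypothetical toy values, for illustration only.
candidate_vector = fuse([0.1, 0.2], [0.3])
phrases = decode_phrases(
    ["graph", "embedding", "is", "useful"], ["B", "I", "O", "O"]
)
```

In a full pipeline the fused vector of each token would be passed to a classifier (the abstract reports Random Forest as the strongest one) to produce the tags that `decode_phrases` consumes.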

CL · Jul 9, 2020
Automatic Personality Prediction; an Enhanced Method Using Ensemble Modeling

Majid Ramezani, Mohammad-Reza Feizi-Derakhshi, Mohammad-Ali Balafar et al.

A person's personality is significantly reflected in the words they use in speech or writing. With the spread of information infrastructures (specifically the Internet and social media), human communication has shifted notably away from face-to-face communication. Automatic Personality Prediction (or Perception) (APP) is the automated forecasting of personality from different types of human-generated or exchanged content (such as text, speech, image, and video). The major objective of this study is to enhance the accuracy of APP from text. To this end, we suggest five new APP methods: term frequency vector-based, ontology-based, enriched ontology-based, latent semantic analysis (LSA)-based, and deep learning-based (BiLSTM) methods. These base methods complement each other to enhance APP accuracy through ensemble modeling (stacking), with a hierarchical attention network (HAN) as the meta-model. The results show that ensemble modeling enhances the accuracy of APP.
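The stacking idea, in which base-model outputs become the input features of a meta-model, can be sketched as below. This is a hedged illustration only: the two base scorers are hypothetical stand-ins for the five methods named in the abstract, and a hand-weighted linear combiner replaces the HAN meta-model, whose weights would in practice be learned from held-out base predictions.

```python
from typing import Callable, List, Sequence

# A base model maps a text to a score for one personality trait.
BaseModel = Callable[[str], float]


def length_score(text: str) -> float:
    """Hypothetical base model: scaled word count, capped at 1.0."""
    return min(len(text.split()) / 10, 1.0)


def exclaim_score(text: str) -> float:
    """Hypothetical base model: 1.0 if the text contains '!', else 0.0."""
    return 1.0 if "!" in text else 0.0


def meta_features(text: str, base_models: Sequence[BaseModel]) -> List[float]:
    """Level-0 step of stacking: collect every base model's prediction."""
    return [model(text) for model in base_models]


def linear_meta_model(
    features: Sequence[float], weights: Sequence[float], bias: float = 0.0
) -> float:
    """Level-1 step: combine base predictions (linear stand-in, not a HAN)."""
    return bias + sum(w * f for w, f in zip(weights, features))


features = meta_features("hello world !", [length_score, exclaim_score])
prediction = linear_meta_model(features, weights=[0.5, 0.5])
```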

SI · Feb 18, 2020
A Model to Measure the Spread Power of Rumors

Zoleikha Jahanbakhsh-Nagadeh, Mohammad-Reza Feizi-Derakhshi, Majid Ramezani et al.

With technologies that have democratized the production and reproduction of information, a significant portion of the posts exchanged daily on social media has been infected by rumors. Despite extensive research on rumor detection and verification, the problem of calculating the spread power of rumors has not yet been considered. To address this research gap, the present study seeks a model to calculate the Spread Power of Rumor (SPR) as a function of content-based features in two categories: False Rumor (FR) and True Rumor (TR). For this purpose, the theory of Allport and Postman is adopted, which claims that importance and ambiguity are the key variables in rumor-mongering and the power of rumor. In total, 42 content features in two categories, "importance" (28 features) and "ambiguity" (14 features), are introduced to compute SPR. The proposed model is evaluated on two datasets, Twitter and Telegram. The results showed that (i) the spread power of False Rumor documents is rarely more than that of True Rumors; (ii) there is a significant difference between the SPR means of the two groups, False Rumor and True Rumor; and (iii) SPR as a criterion can have a positive impact on distinguishing False Rumors from True Rumors.
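Allport and Postman's basic law of rumor is usually stated as rumor strength being proportional to importance times ambiguity. A minimal sketch of an SPR-style score built on that product is shown below. The abstract does not give the paper's actual formula, so averaging each feature group and multiplying the two averages is an assumption made purely for illustration, as are the function name and the toy feature values.

```python
from statistics import mean
from typing import Sequence


def spread_power(importance: Sequence[float], ambiguity: Sequence[float]) -> float:
    """Illustrative score: mean importance times mean ambiguity.

    Assumes every feature is already scaled to [0, 1]. The aggregation
    by mean is an assumption, not the paper's stated model.
    """
    if not importance or not ambiguity:
        raise ValueError("both feature groups must be non-empty")
    return mean(importance) * mean(ambiguity)


# Hypothetical feature values for one post.
score = spread_power(importance=[1.0, 0.0], ambiguity=[0.5, 0.5])
```

Because the score is a product, a post that scores zero on either group gets zero spread power, which matches the multiplicative reading of the theory.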

CL · Jan 12, 2019
A Speech Act Classifier for Persian Texts and its Application in Identifying Rumors

Zoleikha Jahanbakhsh-Nagadeh, Mohammad-Reza Feizi-Derakhshi, Arash Sharifi

Speech Acts (SAs) are one of the important areas of pragmatics; they give us a better understanding of people's state of mind and convey an intended language function. Knowledge of the SA of a text can be helpful in analyzing that text in natural language processing applications. This study presents a dictionary-based statistical technique for Persian SA recognition. The proposed technique classifies a text into seven SA classes based on four criteria: lexical, syntactic, semantic, and surface features. WordNet is used to extract synonyms and enrich the feature dictionary. To evaluate the proposed technique, we used four classification methods: Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbors (KNN). The experimental results demonstrate that the proposed method, with RF and SVM as the best classifiers, achieved state-of-the-art performance with an accuracy of 0.95 for classification of Persian SAs. Our original vision for this work was to introduce an application of SA recognition to social media content, especially the common SAs in rumors. Therefore, the proposed system was used to determine the common SAs in rumors. The results showed that Persian rumors are often expressed in three SA classes, narrative, question, and threat, and in some cases with the request SA.
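The dictionary-based feature step can be sketched as counting, for each SA class, how many tokens of a text appear in that class's cue dictionary. The sketch below is illustrative only: the cue words are hypothetical English examples rather than the paper's Persian dictionaries, only three of the seven classes are shown, and the resulting counts would still need to be passed to a classifier such as RF or SVM.

```python
import re
from typing import Dict, Set

# Hypothetical cue dictionaries; the paper's Persian dictionaries are not shown.
CUE_WORDS: Dict[str, Set[str]] = {
    "question": {"who", "what", "why", "how", "?"},
    "request": {"please", "kindly"},
    "threat": {"warn", "beware"},
}


def sa_features(text: str) -> Dict[str, int]:
    """Count dictionary hits per speech-act class for one text."""
    # Split into lowercase word tokens, keeping '?' as its own token.
    tokens = re.findall(r"\w+|\?", text.lower())
    return {
        label: sum(token in cues for token in tokens)
        for label, cues in CUE_WORDS.items()
    }


counts = sa_features("Why is this happening?")
```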