AI · Jun 7, 2020
An Algorithm for Fuzzification of WordNets, Supported by a Mathematical Proof
Sayyed-Ali Hossayni, Mohammad-R Akbarzadeh-T, Diego Reforgiato Recupero et al.
WordNet-like Lexical Databases (WLDs) group English words into sets of synonyms called "synsets." Although standard WLDs are used in many successful text-mining applications, they assume that every word-sense represents the meaning of its synset to the same degree, which is not generally true. To overcome this limitation, several fuzzy versions of synsets have been proposed. To the best of our knowledge, however, these studies do not aim to produce fuzzified versions of the existing WLDs but build new WLDs from scratch, which has limited the attention they receive from the text-mining community, many of whose resources and applications are based on the existing WLDs. In this study, we present an algorithm for constructing fuzzy versions of WLDs of any language, given a corpus of documents and a word-sense disambiguation (WSD) system for that language. Then, using the Open American National Corpus and the UKB WSD system as algorithm inputs, we construct and publish online a fuzzified version of the English WordNet (FWN). We also provide a mathematical proof of the validity of its results.
OT · Dec 29, 2019
A generalization of the symmetrical and optimal probability-to-possibility transformations
Esteve del Acebo, Yousef Alizadeh-Q, Sayyed Ali Hossayni
Possibility and probability theories are alternative and complementary ways to deal with uncertainty, which in recent years has motivated interest in ways to transform probability distributions into possibility distributions and vice versa. This paper studies the advantages and shortcomings of two well-known discrete probability-to-possibility transformations, the optimal transformation and the symmetrical transformation, and presents a novel parametric family of probability-to-possibility transformations that generalizes them and alleviates their shortcomings, showing considerable potential for practical application. The paper also introduces a novel fuzzy measure of specificity for probability distributions based on the concept of fuzzy subsethood, and presents an empirical validation of the usefulness of the generalized transformation by applying it to the text authorship attribution problem.
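For readers unfamiliar with the two baseline transformations named in the abstract, the following is a minimal sketch of their textbook definitions for a discrete distribution. The function names are illustrative assumptions, and the paper's own parametric family is not reproduced here, since the abstract does not specify it. The optimal transformation assigns each outcome the total probability of all outcomes no more likely than it, while the symmetrical transformation sums the pairwise minima.

```python
def optimal_transform(p):
    """Optimal transformation: pi_i = sum of all p_j with p_j <= p_i.

    Outcomes with tied probabilities receive the same possibility degree.
    """
    return [sum(q for q in p if q <= p_i) for p_i in p]


def symmetrical_transform(p):
    """Symmetrical transformation: pi_i = sum over j of min(p_i, p_j)."""
    return [sum(min(p_i, q) for q in p) for p_i in p]


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]  # hypothetical example distribution
    print(optimal_transform(p))
    print(symmetrical_transform(p))
```

In both cases the most probable outcome receives possibility 1. The symmetrical result is never below the optimal one, so it is the less specific of the two.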
CL · Jan 26, 2019
A Linear-complexity Multi-biometric Forensic Document Analysis System, by Fusing the Stylome and Signature Modalities
Sayyed-Ali Hossayni, Yousef Alizadeh-Q, Vahid Tavana et al.
Forensic Document Analysis (FDA) addresses the problem of finding the authorship of a given document. Identifying the writer of a document from its modalities (e.g., handwriting, signature, and linguistic writing style, i.e., stylome) has been studied in the FDA literature. However, no research has been conducted on the fusion of the stylome and signature modalities. In this paper, we propose such a bimodal FDA system (which has vast applications in the analysis of judicial, police-related, and historical documents) with a focus on time complexity. The proposed bimodal system can be trained and tested with linear time complexity. For this purpose, we first revisit Multinomial Naïve Bayes (MNB) as the best state-of-the-art linear-complexity authorship attribution system, and show that its accuracy is superior to that of other well-known linear-complexity classifiers. Then, we propose a fuzzy version of MNB to be fused with a well-known state-of-the-art linear-complexity fuzzy signature recognition system. For evaluation purposes, we construct a chimeric dataset composed of signatures and textual contents of different letters. Despite its linear complexity, the proposed multi-biometric system is shown to meaningfully improve on its state-of-the-art unimodal counterparts with regard to the accuracy, F-Score, Detection Error Trade-off (DET), Cumulative Match Characteristics (CMC), and Match Score Histograms (MSH) evaluation metrics.
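To illustrate why Multinomial Naïve Bayes is considered a linear-complexity classifier, here is a minimal sketch of standard (crisp) MNB for authorship attribution with Laplace smoothing. It is not the fuzzy variant or the fusion scheme proposed in the paper, which the abstract does not detail. The function names, the `alpha` parameter, and the toy data are assumptions for illustration. Training makes a single pass over the tokens, and prediction makes one pass over the query tokens per candidate author.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, alpha=1.0):
    """Train from (author, tokens) pairs in one pass over all tokens."""
    doc_counts = Counter()               # number of documents per author
    token_counts = defaultdict(Counter)  # token frequencies per author
    vocab = set()
    for author, tokens in docs:
        doc_counts[author] += 1
        token_counts[author].update(tokens)
        vocab.update(tokens)
    # Cache per-author token totals so prediction does not recompute them.
    totals = {a: sum(c.values()) for a, c in token_counts.items()}
    return {"doc_counts": doc_counts, "token_counts": token_counts,
            "totals": totals, "vocab": vocab, "alpha": alpha,
            "n_docs": len(docs)}


def predict_mnb(model, tokens):
    """Return the author with the highest smoothed log-posterior."""
    alpha, vocab_size = model["alpha"], len(model["vocab"])
    best_author, best_score = None, -math.inf
    for author, n_docs in model["doc_counts"].items():
        counts = model["token_counts"][author]
        denom = model["totals"][author] + alpha * vocab_size
        score = math.log(n_docs / model["n_docs"])  # log prior
        for tok in tokens:
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                score += math.log((counts[tok] + alpha) / denom)
        if score > best_score:
            best_author, best_score = author, score
    return best_author


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    docs = [("alice", "the sea the ship the sea".split()),
            ("bob", "a court a judge a law".split())]
    model = train_mnb(docs)
    print(predict_mnb(model, ["sea", "ship"]))
```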