Muhammad Abulaish

2 Papers

CL, Mar 13, 2023
A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches

Obaidullah Zaland, Muhammad Abulaish, Mohd. Fazil

Vector-based word representations help countless Natural Language Processing (NLP) tasks capture the semantic and syntactic regularities of language. In this paper, we present the characteristics of existing word embedding approaches and analyze them with regard to many classification tasks. We categorize the methods into two main groups: traditional approaches and neural-network-based approaches. Traditional approaches mostly use matrix factorization to produce word representations, and they do not capture the semantic and syntactic regularities of the language very well. Neural-network-based approaches, on the other hand, can capture sophisticated regularities of the language and preserve word relationships in the generated word representations. We report experimental results on multiple classification tasks and highlight the scenarios where one approach performs better than the rest.

SI, Jan 22, 2018
A Novel Weighted Distance Measure for Multi-Attributed Graph

Muhammad Abulaish, Jahiruddin

Due to the exponential growth of complex data, graph structures have become increasingly important for modelling entities and their interactions, with many interesting applications including bioinformatics, social network analysis, etc. Depending on the complexity of the data, the underlying graph model can range from a simple directed/undirected and/or weighted/unweighted graph to a complex graph (aka multi-attributed graph) in which vertices and edges are labelled with multi-dimensional vectors. In this paper, we present a novel weighted distance measure based on the weighted Euclidean norm, which is defined as a function of both vertex and edge attributes and can be used for various graph analysis tasks, including classification and cluster analysis. The proposed distance measure has the flexibility to increase or decrease the weightage of edge labels while calculating the distance between vertex-pairs. We also propose the MAGDist algorithm, which reads a multi-attributed graph stored in CSV files containing the lists of vertex vectors and edge vectors, and calculates the distance between each vertex-pair using the proposed weighted distance measure. Finally, we propose a multi-attributed similarity graph generation algorithm, MAGSim, which reads the output of the MAGDist algorithm and generates a similarity graph that can be analysed using classification and clustering algorithms. The significance and accuracy of the proposed distance measure and algorithms are evaluated on the Iris and Twitter data sets, and the similarity graph generated by our proposed method is found to yield better clustering results than existing similarity graph generation methods.
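
The abstract above does not state the distance formula, so the following is only an illustrative sketch of the general idea: a Euclidean distance over vertex attributes whose magnitude is scaled by the edge attributes, with a tunable weight on the edge contribution. The function name `weighted_distance`, the parameter `edge_weight`, and the scaling rule `1 / (1 + edge_weight * sum(edge_attrs))` are all assumptions made for illustration and should not be read as the MAGDist definition.

```python
import math


def weighted_distance(u_attrs, v_attrs, edge_attrs=None, edge_weight=1.0):
    """Hypothetical weighted Euclidean distance between two vertices.

    u_attrs, v_attrs: equal-length sequences of numeric vertex attributes.
    edge_attrs: non-negative attribute values of the edge (u, v),
                or None when the two vertices are not connected.
    edge_weight: assumed knob controlling how strongly edge labels
                 shrink the distance; 0 ignores edges entirely.
    """
    if len(u_attrs) != len(v_attrs):
        raise ValueError("vertex attribute vectors must have equal length")

    # Plain squared Euclidean difference over the vertex attributes.
    squared = sum((a - b) ** 2 for a, b in zip(u_attrs, v_attrs))

    # Assumed rule: a stronger edge (larger attribute sum) pulls the
    # two vertices closer by scaling the squared difference down.
    edge_strength = sum(edge_attrs) if edge_attrs else 0.0
    scale = 1.0 / (1.0 + edge_weight * edge_strength)

    return math.sqrt(scale * squared)
```

Under these assumptions, two unconnected vertices get the ordinary Euclidean distance, and raising `edge_weight` makes connected vertices appear closer, which mirrors the flexibility to increase or decrease the weightage of edge labels that the abstract describes.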