Olumide Ebenezer Ojo

h-index5

6papers

478citations

Novelty18%

AI Score20

Ranked #185,186 of 194,257 authors (top 95%)#29,958 in CL (top 97%)

6 Papers

1.7CLMar 13, 2023

Transformer-based approaches to Sentiment Detection

Olumide Ebenezer Ojo, Hoang Thang Ta, Alexander Gelbukh et al.

The use of transfer learning methods is largely responsible for the present breakthrough in Natural Learning Processing (NLP) tasks across multiple domains. In order to solve the problem of sentiment detection, we examined the performance of four different types of well-known state-of-the-art transformer models for text classification. Models such as Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pre-training Approach (RoBERTa), a distilled version of BERT (DistilBERT), and a large bidirectional neural network architecture (XLNet) were proposed. The performance of the four models that were used to detect disaster in the text was compared. All the models performed well enough, indicating that transformer-based models are suitable for the detection of disaster in text. The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions. Furthermore, we discovered that the learning algorithms' performance was influenced by the pre-processing techniques, the nature of words in the vocabulary, unbalanced labeling, and the model parameters.

21.3CLOct 14, 2023

Legend at ArAIEval Shared Task: Persuasion Technique Detection using a Language-Agnostic Text Representation Model

Olumide E. Ojo, Olaronke O. Adebanji, Hiram Calvo et al.

In this paper, we share our best performing submission to the Arabic AI Tasks Evaluation Challenge (ArAIEval) at ArabicNLP 2023. Our focus was on Task 1, which involves identifying persuasion techniques in excerpts from tweets and news articles. The persuasion technique in Arabic texts was detected using a training loop with XLM-RoBERTa, a language-agnostic text representation model. This approach proved to be potent, leveraging fine-tuning of a multilingual language model. In our evaluation of the test set, we achieved a micro F1 score of 0.64 for subtask A of the competition.

1.3CLOct 19, 2023

MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations

Olumide E. Ojo, Olaronke O. Adebanji, Alexander Gelbukh et al.

Zero-shot classification enables text to be classified into classes not seen during training. In this study, we examine the efficacy of zero-shot learning models in classifying healthcare consultation responses from Doctors and AI systems. The models evaluated include BART, BERT, XLM, XLM-R and DistilBERT. The models were tested on three different datasets based on a binary and multi-label analysis to identify the origins of text in health consultations without any prior corpus training. According to our findings, the zero-shot language models show a good understanding of language generally, but has limitations when trying to classify doctor and AI responses to healthcare consultations. This research provides a foundation for future research in the field of medical text classification by informing the development of more accurate methods of classifying text written by Doctors and AI systems in health consultations.

1.0CLFeb 6, 2024

Evaluating Embeddings for One-Shot Classification of Doctor-AI Consultations

Olumide Ebenezer Ojo, Olaronke Oluwayemisi Adebanji, Alexander Gelbukh et al.

Effective communication between healthcare providers and patients is crucial to providing high-quality patient care. In this work, we investigate how Doctor-written and AI-generated texts in healthcare consultations can be classified using state-of-the-art embeddings and one-shot classification systems. By analyzing embeddings such as bag-of-words, character n-grams, Word2Vec, GloVe, fastText, and GPT2 embeddings, we examine how well our one-shot classification systems capture semantic information within medical consultations. Results show that the embeddings are capable of capturing semantic features from text in a reliable and adaptable manner. Overall, Word2Vec, GloVe and Character n-grams embeddings performed well, indicating their suitability for modeling targeted to this task. GPT2 embedding also shows notable performance, indicating its suitability for models tailored to this task as well. Our machine learning architectures significantly improved the quality of health conversations when training data are scarce, improving communication between patients and healthcare providers.

26.4CLJan 25, 2024

MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms

Patrick Lee, Alain Chirino Trujillo, Diana Cuevas Plancarte et al.

This study investigates the computational processing of euphemisms, a universal linguistic phenomenon, across multiple languages. We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models perform better on the task compared to monolingual models by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic "categories" such as death and bodily functions among others. We test to see whether cross-lingual data of the same domain is more important than within-language data of other domains to further understand the nature of the cross-lingual transfer.

26.3CLMay 31, 2023

FEED PETs: Further Experimentation and Expansion on the Disambiguation of Potentially Euphemistic Terms

Patrick Lee, Iyanuoluwa Shode, Alain Chirino Trujillo et al.

Transformers have been shown to work well for the task of English euphemism disambiguation, in which a potentially euphemistic term (PET) is classified as euphemistic or non-euphemistic in a particular context. In this study, we expand on the task in two ways. First, we annotate PETs for vagueness, a linguistic property associated with euphemisms, and find that transformers are generally better at classifying vague PETs, suggesting linguistic differences in the data that impact performance. Second, we present novel euphemism corpora in three different languages: Yoruba, Spanish, and Mandarin Chinese. We perform euphemism disambiguation experiments in each language using multilingual transformer models mBERT and XLM-RoBERTa, establishing preliminary results from which to launch future work.