Jim Samuel

HC
7 papers
487 citations
Novelty 19%
AI Score 19

7 Papers

HC · Apr 25, 2022
Adaptive cognitive fit: Artificial intelligence augmented management of information facets and representations

Jim Samuel, Rajiv Kashyap, Yana Samuel et al.

Explosive growth in big data technologies and artificial intelligence [AI] applications has led to increasing pervasiveness of information facets and a rapidly growing array of information representations. Information facets, such as equivocality and veracity, can dominate and significantly influence human perceptions of information and consequently affect human performance. Extant research in cognitive fit, which preceded the big data and AI era, focused on the effects of aligning information representation and task on performance, without sufficient consideration of information facets and attendant cognitive challenges. Therefore, there is a compelling need to understand the interplay of these dominant information facets with information representations and tasks, and their influence on human performance. We suggest that artificially intelligent technologies that can adapt information representations to overcome cognitive limitations are necessary for these complex information environments. To this end, we propose and test a novel *Adaptive Cognitive Fit* [ACF] framework that explains the influence of information facets and AI-augmented information representations on human performance. We draw on information processing theory and cognitive dissonance theory to advance the ACF framework and a set of propositions. We empirically validate the ACF propositions with an economic experiment that demonstrates the influence of information facets, and a machine learning simulation that establishes the viability of using AI to improve human performance.

CL · Jun 15, 2021
Textual Data Distributions: Kullback Leibler Textual Distributions Contrasts on GPT-2 Generated Texts, with Supervised, Unsupervised Learning on Vaccine & Market Topics & Sentiment

Jim Samuel, Ratnakar Palle, Eduardo Correa Soares

Efficient textual data distributions (TDD) alignment and generation are open research problems in textual analytics and NLP. It is presently difficult to parsimoniously and methodologically confirm that two or more natural language datasets belong to similar distributions, and to identify the extent to which textual data possess alignment. This study addresses a segment of this broader problem by applying multiple supervised and unsupervised machine learning (ML) methods to explore the behavior of TDD by (i) topical alignment and (ii) sentiment alignment. Furthermore, we use multiple text generation methods, including fine-tuned GPT-2, to generate text by topic and by sentiment. Finally, we develop a unique process-driven variation of Kullback-Leibler divergence (KLD) application to TDD, named KL Textual Distributions Contrasts (KL-TDC), to identify the alignment of machine-generated textual corpora with naturally occurring textual corpora. This study thus identifies a unique approach for generating and validating TDD by topic and sentiment, which can be used to help address sparse data problems and other research, practice and classroom situations in need of artificially generated topic-aligned or sentiment-aligned textual data.
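As a rough illustration of the underlying idea, the sketch below computes a plain Kullback-Leibler divergence between the word distributions of two small corpora. It is not the KL-TDC process described in the abstract, whose steps are not given here. The function names, the whitespace tokenization, and the additive smoothing parameter `alpha` are all assumptions made for this example.

```python
import math
from collections import Counter


def word_distribution(texts, vocab, alpha=1.0):
    """Smoothed unigram distribution over a shared vocabulary (hypothetical helper)."""
    counts = Counter(w for t in texts for w in t.lower().split())
    # Additive smoothing keeps every probability strictly positive,
    # so the log ratio in the divergence is always defined.
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_w P(w) * log(P(w) / Q(w)), in nats."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def corpus_kl(natural, generated, alpha=1.0):
    """KL divergence from a natural corpus to a generated corpus (lists of strings)."""
    vocab = sorted({w for t in natural + generated for w in t.lower().split()})
    p = word_distribution(natural, vocab, alpha)
    q = word_distribution(generated, vocab, alpha)
    return kl_divergence(p, q)


# Made-up toy sentences, used only to exercise the functions.
natural = ["vaccine rollout begins today", "markets rally on vaccine news"]
generated = ["markets fall today", "stock markets rally"]
divergence = corpus_kl(natural, generated)
```

A value near zero suggests the two corpora use words in similar proportions; larger values suggest weaker alignment. KL divergence is asymmetric, so `corpus_kl(a, b)` and `corpus_kl(b, a)` generally differ.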

CY · Apr 19, 2021
Strategies for Democratization of Supercomputing: Availability, Accessibility and Usability of High Performance Computing for Education and Practice of Big Data Analytics

Jim Samuel, Margaret Brennan-Tonetta, Yana Samuel et al.

There has been an increasing interest in and growing need for high performance computing (HPC), popularly known as supercomputing, in domains such as textual analytics, business domains analytics, forecasting and natural language processing (NLP), in addition to the relatively mature supercomputing domains of quantum physics and biology. HPC has been widely used in computer science (CS) and other traditionally computation-intensive disciplines, but has remained largely siloed away from the vast array of social, behavioral, business and economics disciplines. However, with ubiquitous big data, there is a compelling need to make HPC technologically and economically accessible, easy to use, and operationally democratized. Therefore, this research focuses on making two key contributions. The first is the articulation of strategies based on availability, accessibility and usability for the demystification and democratization of HPC, based on an analytical review of Caliburn, a notable supercomputer at its inception. The second contribution is a set of principles for HPC adoption based on an experiential narrative of HPC usage for textual analytics and NLP of social media data from a first-time user perspective. Both the HPC usage process and the output of the early-stage analytics are summarized. This research study synthesizes expert input on HPC democratization strategies, and chronicles, from a multidisciplinary perspective, the challenges and opportunities of a case of rapid adoption of supercomputing for textual analytics and NLP. Deductive logic is used to identify strategies which can lead to efficacious engagement, adoption, production and sustained usage for research, teaching, application and innovation by researchers, faculty, professionals and students across a broad range of disciplines.

IR · May 22, 2020
Feeling Like It is Time to Reopen Now? COVID-19 New Normal Scenarios based on Reopening Sentiment Analytics

Jim Samuel, Md. Mokhlesur Rahman, G. G. Md. Nawaz Ali et al.

The Coronavirus pandemic has created complex challenges and adverse circumstances. This research discovers public sentiment amidst problematic socioeconomic consequences of the lockdown, and explores four ensuing potential sentiment-associated scenarios. The severity and brutality of COVID-19 have led to the development of extreme feelings, and emotional and mental healthcare challenges. This research identifies emotional consequences: the presence of extreme fear, confusion and volatile sentiments, mixed with trust and anticipation. It is necessary to gauge dominant public sentiment trends for effective decisions and policies. This study analyzes public sentiment using Twitter data, time-aligned to COVID-19, to identify dominant sentiment trends associated with the push to 'reopen' the economy. The present research uses textual analytics methodologies to analyze public sentiment support for two potential divergent scenarios, an early opening and a delayed opening, and the consequences of each. It concludes, on the basis of exploratory textual analytics and textual data visualization, that Tweets data from American Twitter users shows more trust sentiment support than fear for reopening the US economy. With additional validation, this could present a valuable time-sensitive opportunity for state governments, the federal government, corporations and societal leaders to guide the nation into a successful new normal future.

IR · May 21, 2020
COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification

Jim Samuel, G. G. Md. Nawaz Ali, Md. Mokhlesur Rahman et al.

Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fueled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19's informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus-specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets with the Naive Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and that both methods show relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
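To make the Naive Bayes method mentioned above more concrete, here is a minimal multinomial Naive Bayes text classifier in plain Python. It is only a sketch of the general technique, not the authors' R-based pipeline. The function names, the whitespace tokenization, the Laplace smoothing parameter `alpha`, and the toy training sentences are all assumptions invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}

    model = {"priors": {}, "likelihoods": {}}
    for label, n_docs in class_counts.items():
        model["priors"][label] = math.log(n_docs / len(labels))
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["likelihoods"][label] = {
            w: math.log((word_counts[label][w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest posterior log score."""
    scores = {}
    for label, log_prior in model["priors"].items():
        log_lik = model["likelihoods"][label]
        # Words never seen in training are skipped rather than penalized.
        scores[label] = log_prior + sum(
            log_lik[w] for w in doc.lower().split() if w in log_lik
        )
    return max(scores, key=scores.get)


# Made-up toy data, not drawn from the study's Tweets.
docs = [
    "scared of the virus",
    "panic and fear everywhere",
    "grateful for the nurses",
    "hopeful and calm today",
]
labels = ["fear", "fear", "other", "other"]
model = train_naive_bayes(docs, labels)
prediction = predict(model, "fear and panic")
```

Because each word contributes independently to the score, short texts with a few strongly class-specific words tend to be easy for this kind of model. That intuition is consistent with the stronger accuracy the abstract reports on short Tweets, though this sketch does not reproduce those results.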

HC · Feb 24, 2020
The Effects Of Technology Driven Information Categories On Performance In Electronic Trading Markets

Jim Samuel, Richard Holowczak, Alexander Pelaez

Electronic trading markets have evolved rapidly with continued adoption of new technologies and growing information acquisition and processing capabilities. Traditional perspectives on trading performance adopted a monolithic view of information. Past research and practitioner heuristics posit that adopting new technologies and incorporating more information should increase price efficiency and trading performance uniformity. However, along with technological change, information dynamics have evolved significantly, resulting in immense growth in data volumes and increased complexity of information categories. The present research explores behavioral trading performance under varying information category conditions and argues that unfettered technological developments and information consumption will not necessarily lead to consistent improvement in uniformity of trading performance. In this study, we employ an artificial stock market based economic experiment to examine the role of technology-driven information categories in influencing trading decisions in electronic markets. Financial electronic markets are used as an information-rich representation of mature markets to analyze information category driven trading performance. The results show that a variation of information categories can influence trading performance. The findings provide a basis to better understand behavioral phenomena in electronic markets and can be used to explain anomalies as well as to manage trading performance in electronic markets.

SI · Feb 24, 2020
Automating Discovery of Dominance in Synchronous Computer-Mediated Communication

Jim Samuel, Richard Holowczak, Raquel Benbunan-Fich et al.

With the advent of electronic interaction, dominance (or the assertion of control over others) has acquired new dimensions. This study investigates the dynamics and characteristics of dominance in virtual interaction by analyzing electronic chat transcripts of groups solving a hidden profile task. We investigate computer-mediated communication behavior patterns that demonstrate dominance and identify a number of relevant variables. These indicators are calculated with automatic and manual coding of text transcripts. A comparison of both sets of variables indicates that automatic text analysis methods yield conclusions similar to those of manual coding. These findings encourage further research in text analysis methods in general, and in the study of virtual team dominance in particular.