Jason M. Pittman

CR
h-index18
11papers
26citations
Novelty27%
AI Score29

11 Papers

CRFeb 18, 2023
Reproducing Random Forest Efficacy in Detecting Port Scanning

Jason M. Pittman

Port scanning is the process of attempting to connect to various network ports on a computing endpoint to determine which ports are open and which services are running on them. It is a common method used by hackers to identify vulnerabilities in a network or system. By determining which ports are open, an attacker can identify which services and applications are running on a device and potentially exploit any known vulnerabilities in those services. Consequently, it is important to detect port scanning because it is often the first step in a cyber attack. By identifying port scanning attempts, cybersecurity professionals can take proactive measures to protect the systems and networks before an attacker has a chance to exploit any vulnerabilities. Against this background, researchers have worked for over a decade to develop robust methods to detect port scanning. One such method revealed by a recent systematic review is the random forest supervised machine learning algorithm. The review revealed six existing studies using random forest since 2021. Unfortunately, those studies each exhibit different results, do not all use the same training and testing dataset, and only two include source code. Accordingly, the goal of this work was to reproduce the six random forest studies while addressing the apparent shortcomings. The outcomes are significant for researchers looking to explore random forest to detect port scanning and for practitioners interested in reliable technology to detect the early stages of cyber attack.

AIJul 20, 2024
A Measure for Level of Autonomy Based on Observable System Behavior

Jason M. Pittman

Contemporary artificial intelligence systems are pivotal in enhancing human efficiency and safety across various domains. One such domain is autonomous systems, especially in automotive and defense use cases. Artificial intelligence brings learning and enhanced decision-making to autonomy system goal-oriented behaviors and human independence. However, the lack of clear understanding of autonomy system capabilities hampers human-machine or machine-machine interaction and interdiction. This necessitates varying degrees of human involvement for safety, accountability, and explainability purposes. Yet, measuring the level autonomous capability in an autonomous system presents a challenge. Two scales of measurement exist, yet measuring autonomy presupposes a variety of elements not available in the wild. This is why existing measures for level of autonomy are operationalized only during design or test and evaluation phases. No measure for level of autonomy based on observed system behavior exists at this time. To address this, we outline a potential measure for predicting level of autonomy using observable actions. We also present an algorithm incorporating the proposed measure. The measure and algorithm have significance to researchers and practitioners interested in a method to blind compare autonomous systems at runtime. Defense-based implementations are likewise possible because counter-autonomy depends on robust identification of autonomous systems.

CLOct 28, 2025
Towards a Method for Synthetic Generation of Persons with Aphasia Transcripts

Jason M. Pittman, Anton Phillips, Yesenia Medina-Santos et al.

In aphasia research, Speech-Language Pathologists (SLPs) devote extensive time to manually coding speech samples using Correct Information Units (CIUs), a measure of how informative an individual sample of speech is. Developing automated systems to recognize aphasic language is limited by data scarcity. For example, only about 600 transcripts are available in AphasiaBank yet billions of tokens are used to train large language models (LLMs). In the broader field of machine learning (ML), researchers increasingly turn to synthetic data when such are sparse. Therefore, this study constructs and validates two methods to generate synthetic transcripts of the AphasiaBank Cat Rescue picture description task. One method leverages a procedural programming approach while the second uses Mistral 7b Instruct and Llama 3.1 8b Instruct LLMs. The methods generate transcripts across four severity levels (Mild, Moderate, Severe, Very Severe) through word dropping, filler insertion, and paraphasia substitution. Overall, we found, compared to human-elicited transcripts, Mistral 7b Instruct best captures key aspects of linguistic degradation observed in aphasia, showing realistic directional changes in NDW, word count, and word length amongst the synthetic generation methods. Based on the results, future work should plan to create a larger dataset, fine-tune models for better aphasic representation, and have SLPs assess the realism and usefulness of the synthetic transcripts.

CRFeb 26, 2025
Truth in Text: A Meta-Analysis of ML-Based Cyber Information Influence Detection Approaches

Jason M. Pittman

Cyber information influence, or disinformation in general terms, is widely regarded as one of the biggest threats to social progress and government stability. From US presidential elections to European Union referendums and down to regional news reporting of wildfires, lies and post-truths have normalized radical decision-making. Accordingly, there has been an explosion in research seeking to detect disinformation in online media. The frontier of disinformation detection research is leveraging a variety of ML techniques such as traditional ML algorithms like Support Vector Machines, Random Forest, and Naïve Bayes. Other research has applied deep learning models including Convolutional Neural Networks, Long Short-Term Memory networks, and transformer-based architectures. Despite the overall success of such techniques, the literature demonstrates inconsistencies when viewed holistically which limits our understanding of the true effectiveness. Accordingly, this work employed a two-stage meta-analysis to (a) demonstrate an overall meta statistic for ML model effectiveness in detecting disinformation and (b) investigate the same by subgroups of ML model types. The study found the majority of the 81 ML detection techniques sampled have greater than an 80\% accuracy with a Mean sample effectiveness of 79.18\% accuracy. Meanwhile, subgroups demonstrated no statistically significant difference between-approaches but revealed high within-group variance. Based on the results, this work recommends future work in replication and development of detection methods operating at the ML model level.

LGDec 26, 2024
Latenrgy: Model Agnostic Latency and Energy Consumption Prediction for Binary Classifiers

Jason M. Pittman

Machine learning systems increasingly drive innovation across scientific fields and industry, yet challenges in compute overhead, specifically during inference, limit their scalability and sustainability. Responsible AI guardrails, essential for ensuring fairness, transparency, and privacy, further exacerbate these computational demands. This study addresses critical gaps in the literature, chiefly the lack of generalized predictive techniques for latency and energy consumption, limited cross-comparisons of classifiers, and unquantified impacts of RAI guardrails on inference performance. Using Theory Construction Methodology, this work constructed a model-agnostic theoretical framework for predicting latency and energy consumption in binary classification models during inference. The framework synthesizes classifier characteristics, dataset properties, and RAI guardrails into a unified analytical instrument. Two predictive equations are derived that capture the interplay between these factors while offering generalizability across diverse classifiers. The proposed framework provides foundational insights for designing efficient, responsible ML systems. It enables researchers to benchmark and optimize inference performance and assists practitioners in deploying scalable solutions. Finally, this work establishes a theoretical foundation for balancing computational efficiency with ethical AI principles, paving the way for future empirical validation and broader applications.

CYNov 6, 2020
Detecting Synthetic Phenomenology in a Contained Artificial General Intelligence

Jason M. Pittman, Ashlyn Hanks

Human-like intelligence in a machine is a contentious subject. Whether mankind should or should not pursue the creation of artificial general intelligence is hotly debated. As well, researchers have aligned in opposing factions according to whether mankind can create it. For our purposes, we assume mankind can and will do so. Thus, it becomes necessary to contemplate how to do so in a safe and trusted manner -- enter the idea of boxing or containment. As part of such thinking, we wonder how a phenomenology might be detected given the operational constraints imposed by any potential containment system. Accordingly, this work provides an analysis of existing measures of phenomenology through qualia and extends those ideas into the context of a contained artificial general intelligence.

CRNov 1, 2020
Primer -- A Tool for Testing Honeypot Measures of Effectiveness

Jason M. Pittman, Kyle Hoffpauir, Nathan Markle

Honeypots are a deceptive technology used to capture malicious activity. The technology is useful for studying attacker behavior, tools, and techniques but can be difficult to implement and maintain. Historically, a lack of measures of effectiveness prevented researchers from assessing honeypot implementations. The consequence being ineffective implementations leading to poor performance, flawed imitation of legitimate services, and premature discovery by attackers. Previously, we developed a taxonomy for measures of effectiveness in dynamic honeypot implementations. The measures quantify a dynamic honeypot's effectiveness in fingerprinting its environment, capturing valid data from adversaries, deceiving adversaries, and intelligently monitoring itself and its surroundings. As a step towards developing automated effectiveness testing, this work introduces a tool for priming a target honeypot for evaluation. We outline the design of the tool and provide results in the form of quantitative calibration data.

CRMay 26, 2020
A Taxonomy for Dynamic Honeypot Measures of Effectiveness

Jason M. Pittman, Kyle Hoffpauir, Nathan Markle et al.

Honeypots are computing systems used to capture unauthorized, often malicious, activity. While honeypots can take on a variety of forms, researchers agree the technology is useful for studying adversary behavior, tools, and techniques. Unfortunately, researchers also agree honeypots are difficult to implement and maintain. A lack of measures of effectiveness compounds the implementation issues specifically. In other words, existing research does not provide a set of measures to determine if a honeypot is effective in its implementation. This is problematic because an ineffective implementation may lead to poor performance, inadequate emulation of legitimate services, or even premature discovery by an adversary. Accordingly, we have developed a taxonomy for measures of effectiveness in dynamic honeypot implementations. Our aim is for these measures to be used to quantify a dynamic honeypot's effectiveness in fingerprinting its environment, capturing valid data from adversaries, deceiving adversaries, and intelligently monitoring itself and its surroundings.

CRJan 14, 2020
Shades of Perception- User Factors in Identifying Password Strength

Jason M. Pittman, Nikki Robinson

The purpose of this study was to measure whether participant education, profession, and technical skill level exhibited a relationship with identification of password strength. Participants reviewed 50 passwords and labeled each as weak or strong. A Chi-square test of independence was used to measure relationships between education, profession, technical skill level relative to the frequency of weak and strong password identification. The results demonstrate significant relationships across all variable combinations except for technical skill and strong passwords which demonstrated no relationship. This research has three limitations. Data collection was dependent upon participant self-reporting and has limited externalized power. Further, the instrument was constructed under the assumption that all participants could read English and understood the concept of password strength. Finally, we did not control for external tool use (i.e., password strength meter). The results build upon existing literature insofar as the outcomes add to the collective understanding of user perception of passwords in specific and authentication in general. Whereas prior research has explored similar areas, such work has done so by having participants create passwords. This work measures perception of pre-generated passwords. The results demonstrate a need for further investigation into why users continue to rely on weak passwords. The originality of this work rests in soliciting a broad spectrum of participants and measuring potential correlations between participant education, profession, and technical skill level.

AINov 8, 2018
Stovepiping and Malicious Software: A Critical Review of AGI Containment

Jason M. Pittman, Jesus P. Espinoza, Courtney Crosby

Awareness of the possible impacts associated with artificial intelligence has risen in proportion to progress in the field. While there are tremendous benefits to society, many argue that there are just as many, if not more, concerns related to advanced forms of artificial intelligence. Accordingly, research into methods to develop artificial intelligence safely is increasingly important. In this paper, we provide an overview of one such safety paradigm: containment with a critical lens aimed toward generative adversarial networks and potentially malicious artificial intelligence. Additionally, we illuminate the potential for a developmental blindspot in the stovepiping of containment mechanisms.

AIJan 28, 2018
A Cyber Science Based Ontology for Artificial General Intelligence Containment

Jason M. Pittman, Courtney Crosby

The development of artificial general intelligence is considered by many to be inevitable. What such intelligence does after becoming aware is not so certain. To that end, research suggests that the likelihood of artificial general intelligence becoming hostile to humans is significant enough to warrant inquiry into methods to limit such potential. Thus, containment of artificial general intelligence is a timely and meaningful research topic. While there is limited research exploring possible containment strategies, such work is bounded by the underlying field the strategies draw upon. Accordingly, we set out to construct an ontology to describe necessary elements in any future containment technology. Using existing academic literature, we developed a single domain ontology containing five levels, 32 codes, and 32 associated descriptors. Further, we constructed ontology diagrams to demonstrate intended relationships. We then identified humans, AGI, and the cyber world as novel agent objects necessary for future containment activities. Collectively, the work addresses three critical gaps: (a) identifying and arranging fundamental constructs; (b) situating AGI containment within cyber science; and (c) developing scientific rigor within the field.