Damon Woodard

CR
5papers
25citations
Novelty52%
AI Score43

5 Papers

CLJun 21, 2022
TraSE: Towards Tackling Authorial Style from a Cognitive Science Perspective

Ronald Wilson, Avanti Bhandarkar, Damon Woodard

Stylistic analysis of text is a key task in research areas ranging from authorship attribution to forensic analysis and personality profiling. The existing approaches for stylistic analysis are plagued by issues like topic influence, lack of discriminability for large number of authors and the requirement for large amounts of diverse data. In this paper, the source of these issues are identified along with the necessity for a cognitive perspective on authorial style in addressing them. A novel feature representation, called Trajectory-based Style Estimation (TraSE), is introduced to support this purpose. Authorship attribution experiments with over 27,000 authors and 1.4 million samples in a cross-domain scenario resulted in 90% attribution accuracy suggesting that the feature representation is immune to such negative influences and an excellent candidate for stylistic analysis. Finally, a qualitative analysis is performed on TraSE using physical human characteristics, like age, to validate its claim on capturing cognitive traits.

CRJul 25, 2024
Is the Digital Forensics and Incident Response Pipeline Ready for Text-Based Threats in LLM Era?

Avanti Bhandarkar, Ronald Wilson, Anushka Swarup et al.

In the era of generative AI, the widespread adoption of Neural Text Generators (NTGs) presents new cybersecurity challenges, particularly within the realms of Digital Forensics and Incident Response (DFIR). These challenges primarily involve the detection and attribution of sources behind advanced attacks like spearphishing and disinformation campaigns. As NTGs evolve, the task of distinguishing between human and NTG-authored texts becomes critically complex. This paper rigorously evaluates the DFIR pipeline tailored for text-based security systems, specifically focusing on the challenges of detecting and attributing authorship of NTG-authored texts. By introducing a novel human-NTG co-authorship text attack, termed CS-ACT, our study uncovers significant vulnerabilities in traditional DFIR methodologies, highlighting discrepancies between ideal scenarios and real-world conditions. Utilizing 14 diverse datasets and 43 unique NTGs, up to the latest GPT-4, our research identifies substantial vulnerabilities in the forensic profiling phase, particularly in attributing authorship to NTGs. Our comprehensive evaluation points to factors such as model sophistication and the lack of distinctive style within NTGs as significant contributors for these vulnerabilities. Our findings underscore the necessity for more sophisticated and adaptable strategies, such as incorporating adversarial learning, stylizing NTGs, and implementing hierarchical attribution through the mapping of NTG lineages to enhance source attribution. This sets the stage for future research and the development of more resilient text-based security systems.

26.8CRApr 21
Potentials and Pitfalls of Applying Federated Learning in Hardware Assurance

Gijung Lee, Wavid Bowman, Olivia Dizon-Paradis et al.

As microelectronics flourish and outsourcing of the design and manufacturing stages of integrated circuits (ICs) and printed circuit boards (PCBs) becomes the norm, microelectronics stakeholders must also confront a new wave of security challenges, including the threats posed by hardware Trojans, counterfeit electronics, and reverse engineering attacks. Traditional detection and prevention methods like testing and side-channel analysis have limitations in reliability and scalability. Automated reverse engineering by deep learning (DL) models is a foolproof approach to hardware assurance, but faces challenges due to limited data. By pooling data from different stakeholders (competitors in industry, governments, etc.), DL models can be more effectively trained but privacy of intellectual property (IP) is a significant concern. Federated Learning (FL) has been proposed as a potential alternative allowing for the collaborative training of a DL model without sharing raw data. While FL has been widely used in healthcare, IoT, and finance, its application in hardware assurance remains underexplored. This study investigates, for the first time, FL-based DL for hardware assurance, demonstrating that FL outperforms single-client centralized learning in segmentation tasks for reverse engineering. Our results show that increasing the number of clients improves FL performance by collaboratively training the model with more data. However, and more importantly, a major pitfall of FL is also exposed -- it remains vulnerable to gradient inversion attacks. We show that SEM images used in FL can be recovered by attackers, which would therefore expose the sensitive and proprietary IPs that FL was supposed to protect. We highlight these privacy risks and also suggest future research directions to improve security and effectiveness in hardware assurance.

0.4ARMay 4
Resource Utilization of Differentiable Logic Gate Networks Deployed on FPGAs

Stephen Wormald, Gilon Kravatsky, Damon Woodard et al.

On-edge machine learning (ML) often strives to maximize the intelligence of small models while miniaturizing the circuit size and power needed to perform inference. Meeting these needs, differentiable Logic Gate Networks (LGN) have demonstrated nanosecond-scale prediction speeds while reducing the required resources as compares to traditional binary neural networks. Despite these benefits, the trade-offs between LGN parameters and resulting hardware synthesis characteristics are not well characterized. This paper therefore studies the tradeoffs between power, resource utilization, inference speed, and model accuracy when varying the depth and width of LGNs synthesized for Field Programmable Gate Arrays (FPGA). Results reveal that the final layer of an LGN is critical to minimize timing and resource usage (i.e. 28\% decrease), as this layer dictates the logic size of summing operations. Subject to timing and routing constraints, deeper and wider LGNs can be synthesized for FPGA when the final layer is narrow. Further tradeoffs are presented to help ML engineers select baseline LGN architectures for FPGAs with a set number of Look Up Tables (LUT).

CRMar 26, 2018
Secure and Reliable Biometric Access Control for Resource-Constrained Systems and IoT

Nima Karimian, Zimu Guo, Fatemeh Tehranipoor et al.

With the emergence of the Internet-of-Things (IoT), there is a growing need for access control and data protection on low-power, pervasive devices. Biometric-based authentication is promising for IoT due to its convenient nature and lower susceptibility to attacks. However, the costs associated with biometric processing and template protection are nontrivial for smart cards, key fobs, and so forth. In this paper, we discuss the security, cost, and utility of biometric systems and develop two major frameworks for improving them. First, we introduce a new framework for implementing biometric systems based on physical unclonable functions (PUFs) and hardware obfuscation that, unlike traditional software approaches, does not require nonvolatile storage of a biometric template/key. Aside from reducing the risk of compromising the biometric, the nature of obfuscation also provides protection against access control circumvention via malware and fault injection. The PUF provides non-invertibility and non-linkability. Second, a major requirement of the proposed PUF/obfuscation approach is that a reliable (robust) key be generated from the users input biometric. We propose a noiseaware biometric quantization framework capable of generating unique, reliable keys with reduced enrollment time and denoising costs. Finally, we conduct several case studies. In the first, the proposed noise-aware approach is compared to our previous approach for multiple biometric modalities, including popular ones (fingerprint and iris) and emerging cardiovascular ones (ECG and PPG). The results show that ECG provides the best tradeoff between reliability, key length, entropy, and cost. In the second and third case studies, we demonstrate how reliability, denoising costs, and enrollment times can be simultaneously improved by modeling subject intra-variations for ECG.