CYJan 22, 2025
PADTHAI-MM: Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST MethodologyMyke C. Cohen, Nayoung Kim, Yang Ba et al.
Despite an extensive body of literature on trust in technology, designing trustworthy AI systems for high-stakes decision domains remains a significant challenge, further compounded by the lack of actionable design and evaluation tools. The Multisource AI Scorecard Table (MAST) was designed to bridge this gap by offering a systematic, tradecraft-centered approach to evaluating AI-enabled decision support systems. Expanding on MAST, we introduce an iterative design framework called \textit{Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology} (PADTHAI-MM). We demonstrate this framework in our development of the Reporting Assistant for Defense and Intelligence Tasks (READIT), a research platform that leverages data visualizations and natural language processing-based text analysis, emulating an AI-enabled system supporting intelligence reporting work. To empirically assess the efficacy of MAST on trust in AI, we developed two distinct iterations of READIT for comparison: a High-MAST version, which incorporates AI contextual information and explanations, and a Low-MAST version, akin to a ``black box'' system. This iterative design process, guided by stakeholder feedback and contemporary AI architectures, culminated in a prototype that was evaluated through its use in an intelligence reporting task. We further discuss the potential benefits of employing the MAST-inspired design framework to address context-specific needs. We also explore the relationship between stakeholder evaluators' MAST ratings and three categories of information known to impact trust: \textit{process}, \textit{purpose}, and \textit{performance}. Overall, our study supports the practical benefits and theoretical validity for PADTHAI-MM as a viable method for designing trustable, context-specific AI systems.
AIFeb 8, 2021
Multisource AI Scorecard Table for System EvaluationErik Blasch, James Sung, Tao Nguyen
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist focused on the principles of good analysis adopted by the intelligence community (IC) to help promote the development of more understandable systems and engender trust in AI outputs. Such a scorecard enables a transparent, consistent, and meaningful understanding of AI tools applied for commercial and government use. A standard is built on compliance and agreement through policy, which requires buy-in from the stakeholders. While consistency for testing might only exist across a standard data set, the community requires discussion on verification and validation approaches which can lead to interpretability, explainability, and proper use. The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system supporting various operational needs. These include sourcing, uncertainty, consistency, accuracy, and visualization. Three use cases are presented as notional examples that support security for comparative analysis.
CYNov 3, 2019
Artificial Intelligence Strategies for National Security and Safety StandardsErik Blasch, James Sung, Tao Nguyen et al.
Recent advances in artificial intelligence (AI) have lead to an explosion of multimedia applications (e.g., computer vision (CV) and natural language processing (NLP)) for different domains such as commercial, industrial, and intelligence. In particular, the use of AI applications in a national security environment is often problematic because the opaque nature of the systems leads to an inability for a human to understand how the results came about. A reliance on 'black boxes' to generate predictions and inform decisions is potentially disastrous. This paper explores how the application of standards during each stage of the development of an AI system deployed and used in a national security environment would help enable trust. Specifically, we focus on the standards outlined in Intelligence Community Directive 203 (Analytic Standards) to subject machine outputs to the same rigorous standards as analysis performed by humans.