HC, May 25
The Timing Dependencies of Trust: Speed, Accuracy, and cBCI Neuro-Decoupling in Human-AI Teams
Christopher Baker, Stephen Hinton, Akashdeep Nijjar et al.
The speed and accuracy of an artificial teammate fundamentally alter the failure states of Human-AI integration. While high-speed AI interventions risk inducing reflexive blind compliance, delayed interventions can induce ambiguous cognitive conflict. This study investigates how the fundamental characteristics of an in-task AI assistant, Fast/Less-Accurate (FLA-AI) versus Slow/Accurate (SA-AI), impact the synergy of Collaborative Brain-Computer Interface (cBCI) teams in a Virtual Reality drone task. Seventeen operators completed continuous search tasks under high cognitive workload while their spatial covariance was mapped using a 2D Adaptive Riemannian Oracle. The results mathematically demonstrate that AI timing dictates the mechanism of team failure. Fast AI induced instant, blind compliance; human accuracy under deception collapsed to 50.2%, and pure behavioural teams (N=8) failed to scale beyond 74.1%. In contrast, Slow AI induced delayed cognitive conflict; humans hesitated (61.1% accuracy), but N=8 behavioural teams eventually recovered to 100.0%. Crucially, the Riemannian Oracle mathematically adapted to these states: it heavily restricted temporal windows (< 0.8 s) to intercept fast reflexive compliance, while widening windows (> 1.2 s) to capture delayed cognitive conflict. Integrating these isolated veridical signals via Hybrid Fusion successfully rescued the Fast AI team (+7.6% at N=8) and significantly accelerated the recovery of smaller Slow AI teams (+6.9% at N=4). These findings prove that cBCI synergy is heavily contingent on the temporal dynamics of trust, providing a critical framework for designing dynamically gated Human-AI systems.
CY, Aug 28, 2024
Verification methods for international AI agreements
Akash R. Wasil, Tom Reed, Jack William Miller et al.
What techniques can be used to verify compliance with international agreements about advanced AI development? In this paper, we examine 10 verification methods that could detect two types of potential violations: unauthorized AI training (e.g., training runs above a certain FLOP threshold) and unauthorized data centers. We divide the verification methods into three categories: (a) national technical means (methods requiring minimal or no access from suspected non-compliant nations), (b) access-dependent methods (methods that require approval from the nation suspected of unauthorized activities), and (c) hardware-dependent methods (methods that require rules around advanced hardware). For each verification method, we provide a description, historical precedents, and possible evasion techniques. We conclude by offering recommendations for future work related to the verification and enforcement of international AI governance agreements.
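To make the idea of a FLOP threshold concrete, here is a minimal sketch of a threshold check. It is not taken from the paper. It assumes the common rule of thumb that training a dense transformer costs roughly 6 × parameters × tokens floating-point operations, and the names `estimate_training_flop`, `exceeds_threshold`, and `EXAMPLE_THRESHOLD_FLOP`, as well as the threshold value itself, are hypothetical and chosen only for illustration.

```python
def estimate_training_flop(n_parameters: float, n_tokens: float) -> float:
    """Rough training compute using the ~6 * N * D heuristic.

    This is an approximation for dense transformer training, not a
    measurement; real runs can deviate from it.
    """
    return 6.0 * n_parameters * n_tokens


def exceeds_threshold(n_parameters: float, n_tokens: float,
                      threshold_flop: float) -> bool:
    """Return True if the estimated training compute is above the threshold."""
    return estimate_training_flop(n_parameters, n_tokens) > threshold_flop


# Hypothetical threshold for illustration only; the abstract does not
# state a specific value.
EXAMPLE_THRESHOLD_FLOP = 1e26

if __name__ == "__main__":
    # Hypothetical run: 70 billion parameters trained on 1.4 trillion tokens.
    flop = estimate_training_flop(70e9, 1.4e12)
    flagged = exceeds_threshold(70e9, 1.4e12, EXAMPLE_THRESHOLD_FLOP)
    print(f"estimated training compute: {flop:.2e} FLOP, above threshold: {flagged}")
```

A check like this relies on the parameter and token counts being known, so it illustrates what the threshold means rather than how a violation would be detected in practice.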
CY, Sep 4, 2024
Governing dual-use technologies: Case studies of international security agreements and lessons for AI governance
Akash R. Wasil, Peter Barnett, Michael Gerovitch et al.
International AI governance agreements and institutions may play an important role in reducing global security risks from advanced AI. To inform the design of such agreements and institutions, we conducted case studies of historical and contemporary international security agreements. We focused specifically on those arrangements around dual-use technologies, examining agreements in nuclear security, chemical weapons, biosecurity, and export controls. For each agreement, we examined four key areas: (a) purpose, (b) core powers, (c) governance structure, and (d) instances of non-compliance. From these case studies, we extracted lessons for the design of international AI agreements and governance institutions. We discuss the importance of robust verification methods, strategies for balancing power between nations, mechanisms for adapting to rapid technological change, approaches to managing trade-offs between transparency and security, incentives for participation, and effective enforcement mechanisms.
CY, Aug 13, 2025
STREAM (ChemBio): A Standard for Transparently Reporting Evaluations in AI Model Reports
Tegan McCaslin, Jide Alaga, Samira Nedungadi et al.
Evaluations of dangerous AI capabilities are important for managing catastrophic risks. Public transparency into these evaluations - including what they test, how they are conducted, and how their results inform decisions - is crucial for building trust in AI development. We propose STREAM (A Standard for Transparently Reporting Evaluations in AI Model Reports), a standard to improve how model reports disclose evaluation results, initially focusing on chemical and biological (ChemBio) benchmarks. Developed in consultation with 23 experts across government, civil society, academia, and frontier AI companies, this standard is designed to (1) be a practical resource to help AI developers present evaluation results more clearly, and (2) help third parties identify whether model reports provide sufficient detail to assess the rigor of the ChemBio evaluations. We concretely demonstrate our proposed best practices with "gold standard" examples, and also provide a three-page reporting template to enable AI developers to implement our recommendations more easily.