LGJan 26

Explainability Methods for Hardware Trojan Detection: A Systematic Comparison

arXiv:2601.18696v1h-index: 2
Originality Synthesis-oriented
AI Analysis

It addresses the need for interpretable explanations in hardware security for engineers, though it is incremental as it compares existing methods rather than introducing a new paradigm.

This work systematically compares three explainability methods for hardware trojan detection on the Trust-Hub benchmark, finding that property-based and case-based approaches offer domain-aligned interpretability, while achieving a 9-fold precision improvement (46.15% vs. 5.13%) over prior work.

Hardware trojan detection requires accurate identification and interpretable explanations for security engineers to validate and act on results. This work compares three explainability categories for gate-level trojan detection on the Trust-Hub benchmark: (1) domain-aware property-based analysis of 31 circuit-specific features from gate fanin patterns, flip-flop distances, and I/O connectivity; (2) case-based reasoning using k-nearest neighbors for precedent-based explanations; and (3) model-agnostic feature attribution (LIME, SHAP, gradient). Results show different advantages per approach. Property-based analysis provides explanations through circuit concepts like "high fanin complexity near outputs indicates potential triggers." Case-based reasoning achieves 97.4% correspondence between predictions and training exemplars, offering justifications grounded in precedent. LIME and SHAP provide feature attributions with strong inter-method correlation (r=0.94, p<0.001) but lack circuit-level context for validation. XGBoost classification achieves 46.15% precision and 52.17% recall on 11,392 test samples, a 9-fold precision improvement over prior work (Hasegawa et al.: 5.13%) while reducing false positive rates from 5.6% to 0.25%. Gradient-based attribution runs 481 times faster than SHAP but provides similar domain-opaque insights. This work demonstrates that property-based and case-based approaches offer domain alignment and precedent-based interpretability compared to generic feature rankings, with implications for XAI deployment where practitioners must validate ML predictions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes