61.0AIMay 9
DiagnosticIQ: A Benchmark for LLM-Based Industrial Maintenance Action Recommendation from Symbolic RulesDevin Yasith De Silva, Dhaval Patel, Christodoulos Constantinides et al.
Monitoring complex industrial assets relies on engineer-authored symbolic rules that trigger based on sensor conditions and prompt technicians to perform corrective actions. The bottleneck is not detection but response: translating rules into maintenance steps requires asset-specific knowledge gained through years of practice. We investigate whether LLMs can serve as decision support for this rule-to-action step and introduce \ours{}, a benchmark of 6{,}690 expert-validated multiple-choice questions from 118 rule-action pairs across 16 asset types. We contribute (i) a symbolic-to-MCQA pipeline normalizing rules to Disjunctive Normal Form with embedding-based distractor sampling, (ii) five variants probing distinct failure modes (Pro, Pert, Verbose, Aug, Rationale), and (iii) a benchmark of 29 LLMs and 4 embedding baselines. A human evaluation (9 practitioners, mean 45.0\%) confirms \ours{} requires specialist knowledge beyond operational experience. Three findings stand out. The frontier has closed: the top three LLMs lie within one Macro point, with Bradley-Terry Elo placing claude-opus-4-6 30 points above the next model. Yet \ours{}\,Pro exposes brittleness, with every model losing 13--60\% relative accuracy under distractor expansion. \ours{}\,Aug exposes pattern-matching: under condition inversion, frontier models still select the original answer 49--63\% of the time. The deployment bottleneck is not capability but calibration: frontier models handle template-style fault detection but break under structural perturbation.
7.6CRApr 24
A traffic analysis attack against Introduction Protocol and Onion ServicesNicolas Constantinides
Tor onion services rely on long-lived introduction circuits to support anonymous rendezvous between clients and services. Although Tor incorporates defenses against traffic analysis, the introduction protocol retains deterministic routing structure that can be exploited by an adversary. We present a practical intersection attack against Tor introduction circuits that over repeated interactions can identify each hop from the introduction point toward the onion service while requiring observation at only one relay per stage. The attack repeatedly probes the target service and intersects sets of destination IP addresses observed within narrowly bounded INTRODUCE1-RENDEZVOUS2 intervals, without assuming global visibility or access to packet payloads. Our traffic-analysis technique identifies with certainty the next relay in the path to target at each stage, thereby revealing a gap in Tor's privacy model, which is intended to resist traffic-analysis attacks in which an adversary uses traffic patterns to determine which points in the network to observe or attack. We evaluate the attack's feasibility through live-network experiments using a self-operated onion service and relays. To support data minimization, we implement a Tor-compatible plugin that computes intersections online over pseudonymized data retained only in volatile memory. Our experiments show reliable convergence in practice, with convergence rate influenced by relay consensus weight and time-varying background traffic. We further assess practicality under a partial-global adversary model and discuss the implications of geographic concentration in Tor relay selection weight across cooperating jurisdictions.