Privacy-Preserving Smart Surveillance with Cross-Dataset Violence Detection and Decentralized Evidence Governance
For surveillance system designers, this work addresses the tension between automated incident detection and privacy-preserving evidence governance, but the detection component is incremental.
This paper proposes a privacy-preserving surveillance framework that separates violence detection from evidence disclosure: a lightweight MobileNetV2-based classifier flags violent clips, and each recorded segment is encrypted so that its decryption key can be released only through threshold-based approval. The best model (MobileNetV2+BiLSTM) achieves 93.5% test accuracy and 0.980 ROC-AUC on the merged held-out set, though performance on the RWF-2000 slice is lower, which the authors attribute to dataset shift.
AI-enabled surveillance can accelerate public-safety response, yet most systems still leave recorded evidence under centralized administrative control. This paper proposes a privacy-preserving smart surveillance framework that separates incident detection from evidence disclosure. A lightweight MobileNetV2-based video classifier detects violent clips, while each recorded incident segment is immediately encrypted and made accessible only through threshold-based approval. The decryption key is split with Shamir's Secret Sharing, member shares are protected with public-key cryptography, and voting is supported by time-limited tokens, two-factor authentication, signatures, and audit logs. This study evaluates MobileNetV2+LSTM, MobileNetV2+BiLSTM, and MobileNetV2+temporal CNN heads on SCVD, RWF-2000, and Real-Life Violence Situations under seven in-domain and cross-dataset scenarios. The best all-source model, MobileNetV2+BiLSTM, reaches 93.5% test accuracy and a ROC-AUC of 0.980 on the merged held-out set, while lower RWF-2000 slice performance confirms persistent dataset shift.
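
The threshold-based key release described above can be illustrated with a minimal sketch of Shamir's Secret Sharing. This is not the paper's implementation: the function names `split_secret` and `recover_secret`, the choice of the prime field 2^521 - 1, and the 3-of-5 policy in the usage lines are assumptions made for illustration. The sketch treats the clip's decryption key as an integer, hides it as the constant term of a random polynomial of degree threshold - 1, and recovers it by Lagrange interpolation at x = 0. Protecting each share with a member's public key, as the abstract describes, is outside the scope of this sketch.

```python
import secrets
from typing import List, Tuple

# Assumption: a Mersenne prime large enough to hold a 256-bit key.
PRIME = 2**521 - 1

Share = Tuple[int, int]  # (x, y) point on the secret polynomial


def split_secret(secret: int, threshold: int, num_shares: int) -> List[Share]:
    """Split `secret` into `num_shares` points; any `threshold` recover it."""
    if not 1 <= threshold <= num_shares:
        raise ValueError("threshold must satisfy 1 <= threshold <= num_shares")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Share]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Usage: a hypothetical 3-of-5 approval policy over a random 256-bit key.
key = secrets.randbits(256)
shares = split_secret(key, threshold=3, num_shares=5)
assert recover_secret(shares[:3]) == key
```

Any subset of shares at least as large as the threshold reconstructs the key, while fewer shares reveal nothing about it, which is what lets evidence disclosure depend on a quorum rather than on a single administrator.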