Personal Privacy Protection via Irrelevant Faces Tracking and Pixelation in Video Live Streaming
This work tackles the problem of real-time, automatic face pixelation for privacy protection in video live streaming, which is a significant challenge for content creators and platforms.
This paper addresses the labor-intensive problem of privacy-protecting pixelation in video live streaming by developing FPVLS, a method for automatic personal privacy filtering. FPVLS uses a two-stage frame-to-video structure. Face detection and embedding networks first produce face vectors on individual frames. A Positioned Incremental Affinity Propagation (PIAP) clustering algorithm then generates raw trajectories, and a refinement stage combines a proposal network with an Empirical Likelihood Ratio (ELR) statistic to refine them. On a collected video live streaming dataset, the method achieves satisfactory accuracy and real-time efficiency while keeping over-pixelation in check.
To date, pixelation for privacy protection remains labor-intensive and largely unstudied. With the growing prevalence of video live streaming, an online face pixelation mechanism that works during streaming is urgently needed. In this paper, we develop a new method called Face Pixelation in Video Live Streaming (FPVLS) to provide automatic personal privacy filtering during unconstrained streaming activities. Simply applying multi-face trackers leads to problems with target drift, computational efficiency, and over-pixelation. Therefore, for fast and accurate pixelation of irrelevant people's faces, FPVLS is organized in a frame-to-video structure with two core stages. On individual frames, FPVLS uses image-based face detection and embedding networks to yield face vectors. In the raw trajectory generation stage, the proposed Positioned Incremental Affinity Propagation (PIAP) clustering algorithm leverages the face vectors and position information to quickly associate the same person's faces across frames. Raw trajectories accumulated frame by frame in this way are likely to be intermittent and unreliable at the video level. Hence, we further introduce a trajectory refinement stage that merges a proposal network with a two-sample test based on the Empirical Likelihood Ratio (ELR) statistic to refine the raw trajectories. A Gaussian filter is then applied along the refined trajectories for the final pixelation. On the video live streaming dataset we collected, FPVLS achieves satisfactory accuracy and real-time efficiency, and keeps over-pixelation in check.
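To make the idea of associating faces across frames from both face vectors and positions more concrete, the following is a minimal sketch. It is not the PIAP algorithm: it swaps in a simple greedy best-match rule in place of affinity propagation, and every name and parameter in it (`affinity`, `associate`, `pos_scale`, `pos_weight`, `threshold`, the dictionary layout of a face) is a hypothetical choice made for illustration only.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def affinity(face, track, pos_scale=100.0, pos_weight=0.5):
    """Blend appearance similarity with positional proximity.

    pos_scale (pixels) and pos_weight are illustrative assumptions,
    not values taken from the paper.
    """
    appearance = cosine_similarity(face["vec"], track["vec"])
    dist = np.linalg.norm(
        np.asarray(face["center"], dtype=float)
        - np.asarray(track["center"], dtype=float)
    )
    proximity = float(np.exp(-dist / pos_scale))  # 1.0 when the centers coincide
    return (1.0 - pos_weight) * appearance + pos_weight * proximity


def associate(frames, threshold=0.7):
    """Greedily build raw trajectories from per-frame face detections.

    frames: list of frames; each frame is a list of dicts with keys
            "vec" (face embedding) and "center" (x, y of the face box).
    Returns a list of tracks, each recording the frame indices it covers.
    """
    tracks = []
    for t, faces in enumerate(frames):
        taken = set()  # a track may absorb at most one face per frame
        for face in faces:
            candidates = [
                (affinity(face, trk), i)
                for i, trk in enumerate(tracks)
                if i not in taken
            ]
            best_score, best_i = max(candidates, default=(-1.0, -1))
            if best_score >= threshold:
                trk = tracks[best_i]
                # Update the track with the latest appearance and position.
                trk["vec"], trk["center"] = face["vec"], face["center"]
                trk["frames"].append(t)
                taken.add(best_i)
            else:
                # No sufficiently similar track: start a new trajectory.
                tracks.append(
                    {"vec": face["vec"], "center": face["center"], "frames": [t]}
                )
                taken.add(len(tracks) - 1)
    return tracks
```

A track whose person is missed by the detector for a few frames simply shows a gap in its `frames` list. Such gaps are the kind of intermittency that a later refinement stage would need to repair.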