VRM-Phase I VKW system description of long-short video customizable keyword wakeup challenge
This work addresses the need for standardized evaluation of keyword wakeup technology in speech processing for video applications, but its contribution is incremental: it organizes a challenge rather than introducing new methods.
The paper describes the Video Keyword Wakeup Challenge (VKW), which tested teams' ability to build keyword wakeup systems on a public dataset for Chinese long-short videos, requiring support for multiple and customizable keywords.
Keyword wakeup technology has long been an active research topic in speech processing, but much of the related work has been carried out on different datasets. We organized a Chinese long-short video keyword wakeup challenge (Video Keyword Wakeup Challenge, VKW) to test each participating team's ability to build a keyword wakeup system on a public dataset. All submitted systems must not only support the setting of multiple different keywords, but also support the wakeup of any customized keyword. This paper describes the basic setup of the VKW challenge and the experimental results of some participating teams.