CV · Apr 8, 2023

Word-level Persian Lipreading Dataset

arXiv:2304.04068v1 · 7 citations · h-index: 18
Originality: Synthesis-oriented
AI Analysis

This paper provides a domain-specific resource for Persian lipreading, but the contribution is incremental: it applies existing methods to new data.

The authors address the lack of a suitable dataset for Persian lipreading by building a new in-the-wild dataset of 244,000 videos from approximately 1,800 speakers. They report significantly better performance on this dataset when the AV-HuBERT model is used for feature extraction.

Lipreading has made impressive progress in recent years, driven by advances in deep learning. Nonetheless, the prerequisite for such advances is a suitable dataset. This paper provides a new in-the-wild dataset for Persian word-level lipreading containing 244,000 videos from approximately 1,800 speakers. We evaluated the state-of-the-art method in this field and used a novel approach for word-level lipreading. In this method, we used the AV-HuBERT model for feature extraction and obtained significantly better performance on our dataset.

