ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition
This work advances zero-resource speech research by adding a visually-grounded language modelling track to the Zero-Resource Speech Challenge. The contribution is incremental, since it builds on an existing challenge framework.
The paper introduces a new visually-grounded language modelling track for the Zero-Resource Speech Challenge 2021, details its motivation and participation rules, and presents two baseline systems for it.
We present the visually-grounded language modelling track that was introduced in the Zero-Resource Speech Challenge, 2021 edition, 2nd round. We motivate the new track and discuss participation rules in detail. We also present the two baseline systems that were developed for this track.