CL | Jul 15, 2022

Reasoning about Actions over Visual and Linguistic Modalities: A Survey

arXiv:2207.07568v1 | 12 citations | h-index: 30
Originality: Synthesis-oriented
AI Analysis

The survey provides a comprehensive overview for researchers in NLP and computer vision, but its contribution is incremental: it synthesizes existing work without presenting new results.

This paper surveys existing tasks, datasets, techniques, and models for reasoning about actions in vision and language domains, summarizing key takeaways, current challenges, and future directions.

'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals. As a result, most common sense (CS) knowledge for humans revolves around actions. While 'Reasoning about Actions & Change' (RAC) has been widely studied in the Knowledge Representation community, it has recently piqued the interest of NLP and computer vision researchers. This paper surveys existing tasks, benchmark datasets, various techniques and models, and their respective performance concerning advancements in RAC in the vision and language domain. Towards the end, we summarize our key takeaways, discuss the present challenges facing this research area, and outline potential directions for future research.

