AI · Oct 27, 2021

Towards artificial general intelligence via a multimodal foundation model

arXiv:2110.14378v2 · 316 citations
Originality: Incremental advance
AI Analysis

This work aims to advance artificial general intelligence (AGI) by enabling a model to handle multiple cognitive tasks, though it appears incremental as it builds on existing foundation model concepts.

The authors tackled the limitation of single-cognitive ability in AI by developing a multimodal foundation model pre-trained with self-supervised learning on weak semantic correlation data, achieving promising results on various downstream tasks and demonstrating strong imagination ability.

The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of humans. Despite tremendous success in AI research, most existing methods have only a single cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained on huge multimodal data, which can be quickly adapted to various downstream cognitive tasks. To achieve this goal, we propose to pre-train our foundation model by self-supervised learning on weak semantic correlation data crawled from the Internet, and we show that promising results can be obtained on a wide range of downstream tasks. In particular, with the model-interpretability tools we developed, we demonstrate that our foundation model now possesses strong imagination ability. We believe that our work makes a transformative stride towards AGI, from our common practice of "weak or narrow AI" to that of "strong or generalized AI".

Code Implementations: 1 repo
