AI | CL | LG | Apr 14, 2022

Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness

arXiv:2205.00001v3 | 7 citations | h-index: 41
Originality: Incremental advance
AI Analysis

This work aims to advance machine models of intelligence and consciousness by formalizing a multimodal inner language, though it appears incremental as it builds upon existing models like the Conscious Turing Machine.

The paper tackles the problem of developing a multimodal language called Brainish to enhance machine intelligence and consciousness, proposing a framework for learning it and demonstrating its capability on multimodal prediction and retrieval tasks across real-world datasets.

Having a rich multimodal inner language is an important component of human intelligence that enables several necessary core cognitive functions such as multimodal prediction, translation, and generation. Building upon the Conscious Turing Machine (CTM), a machine model for consciousness proposed by Blum and Blum (2021), we describe the desiderata of a multimodal language called Brainish, comprising words, images, audio, and sensations combined in representations that the CTM's processors use to communicate with each other. We define the syntax and semantics of Brainish before operationalizing this language through the lens of multimodal artificial intelligence, a vibrant research area studying the computational tools necessary for processing and relating information from heterogeneous signals. Our general framework for learning Brainish involves designing (1) unimodal encoders to segment and represent unimodal data, (2) a coordinated representation space that relates and composes unimodal features to derive holistic meaning across multimodal inputs, and (3) decoders to map multimodal representations into predictions (for fusion) or raw data (for translation or generation). We discuss how Brainish is crucial for communication and coordination in order to achieve consciousness in the CTM. We also implement a simple version of Brainish and evaluate its ability to demonstrate intelligence on multimodal prediction and retrieval tasks across several real-world image, text, and audio datasets. On this basis, we argue that such an inner language will be important for advances in machine models of intelligence and consciousness.
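The three-part framework in the abstract (unimodal encoders, a coordinated representation space, and decoders) can be illustrated with a toy retrieval example. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the three-word vocabulary, the mean-colour image encoder, and the hand-set identity projections are all hypothetical stand-ins for components that would be learned in practice.

```python
import math

# Hypothetical toy vocabulary; a real text encoder would be learned.
VOCAB = ["red", "green", "blue"]


def encode_text(caption):
    """Unimodal text encoder: count occurrences of each vocabulary word."""
    tokens = caption.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def encode_image(pixels):
    """Unimodal image encoder: mean RGB over a list of (r, g, b) pixels."""
    n = len(pixels)
    return [sum(p[channel] for p in pixels) / n for channel in range(3)]


def project(features, weights):
    """Map unimodal features into the shared space and L2-normalize them."""
    out = [sum(w * f for w, f in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(x * x for x in out))
    return [x / norm for x in out] if norm > 0 else out


# Assumption: identity projections stand in for learned alignment weights.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W_TEXT = IDENTITY
W_IMAGE = IDENTITY


def similarity(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_caption(pixels, captions):
    """Retrieval 'decoder': return the caption nearest the image in the shared space."""
    z_image = project(encode_image(pixels), W_IMAGE)
    scored = [
        (similarity(z_image, project(encode_text(c), W_TEXT)), c)
        for c in captions
    ]
    return max(scored)[1]


captions = ["a red ball", "a green field", "a blue sky"]
mostly_blue = [(10, 20, 200), (0, 30, 220)]
best = retrieve_caption(mostly_blue, captions)
print(best)
```

Because both modalities land in the same normalized space, cross-modal retrieval reduces to a nearest-neighbour search. A prediction or generation decoder would replace `retrieve_caption` while reusing the same encoders and projections.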

