AI | Feb 15, 2024

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Quentin Gallouédec, Edward Beeching, Clément Romac, Emmanuel Dellandréa

arXiv:2402.09844v3 | 19.619 citations | h-index: 7 | Has Code

Originality: Incremental advance

AI Analysis

JAT addresses the limitation of single-task models in reinforcement learning and offers a step towards versatile, cross-domain AI, though the advance appears incremental because it builds on existing transformer architectures.

The paper tackles the problem of developing a general model that operates across multiple domains by introducing Jack of All Trades (JAT), a transformer-based model that achieves strong performance on diverse reinforcement learning benchmarks and promising results on computer vision and natural language processing tasks using a single set of weights.

The search for a general model that can operate seamlessly across multiple domains remains a key goal in machine learning research. The prevailing methodology in Reinforcement Learning (RL) typically limits models to a single task within a unimodal framework, a limitation that contrasts with the broader vision of a versatile, multi-domain model. In this paper, we present Jack of All Trades (JAT), a transformer-based model with a unique design optimized for handling sequential decision-making tasks and multi-modal data types. The JAT model demonstrates its robust capabilities and versatility by achieving strong performance on very different RL benchmarks, along with promising results on Computer Vision (CV) and Natural Language Processing (NLP) tasks, all using a single set of weights. The JAT model marks a significant step towards more general, cross-domain AI model design, and notably, it is the first model of its kind to be fully open-sourced at https://huggingface.co/jat-project/jat, including a pioneering general-purpose dataset.
