CL, LG | Aug 22, 2024

Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

Jamba Team, Barak Lenz, Alan Arazi, Amir Bergman, Avshalom Manevich, Barak Peleg, Ben Aviram, Chen Almagor, Clara Fridman, Dan Padnos, Daniel Gissin, Daniel Jannai

arXiv:2408.12570v1 | 18.552 citations | h-index: 72 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficiency and scalability challenges for deploying large language models in long-context applications, though it is incremental as it builds on existing hybrid architectures.

The authors address the problem of scaling large language models to long-context tasks by introducing Jamba-1.5, a family of hybrid Transformer-Mamba mixture-of-experts models with an effective context length of 256K tokens. The models achieve high throughput and competitive benchmark performance, and the ExpertsInt8 quantization technique allows the larger model to run on a single machine with 8 80GB GPUs.

We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture. Jamba is a hybrid Transformer-Mamba mixture of experts architecture, providing high throughput and low memory usage across context lengths, while retaining the same or better quality as Transformer models. We release two model sizes: Jamba-1.5-Large, with 94B active parameters, and Jamba-1.5-Mini, with 12B active parameters. Both models are fine-tuned for a variety of conversational and instruction-following capabilities, and have an effective context length of 256K tokens, the largest amongst open-weight models. To support cost-effective inference, we introduce ExpertsInt8, a novel quantization technique that allows fitting Jamba-1.5-Large on a machine with 8 80GB GPUs when processing 256K-token contexts without loss of quality. When evaluated on a battery of academic and chatbot benchmarks, Jamba-1.5 models achieve excellent results while providing high throughput and outperforming other open-weight models on long-context benchmarks. The model weights for both sizes are publicly available under the Jamba Open Model License and we release ExpertsInt8 as open source.
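To see why weight quantization matters for the 8 x 80GB claim, a rough back-of-the-envelope estimate can help. The sketch below is not the authors' method or the ExpertsInt8 implementation; it only compares weight storage at 2 bytes per parameter (BF16) against a mix where most weights are stored at 1 byte (INT8). The total parameter count (398B, as reported in the full paper rather than in this abstract), the share of weights that gets quantized (0.9), and treating "80GB" as 80 x 10^9 bytes are all assumptions, and the helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(total_params: float, quantized_fraction: float,
                     quantized_bytes: float = 1.0,
                     full_bytes: float = 2.0) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    quantized_fraction is the share of parameters stored at
    quantized_bytes per parameter; the rest use full_bytes.
    """
    if not 0.0 <= quantized_fraction <= 1.0:
        raise ValueError("quantized_fraction must be in [0, 1]")
    quantized = total_params * quantized_fraction * quantized_bytes
    full = total_params * (1.0 - quantized_fraction) * full_bytes
    return (quantized + full) / 1e9


# Assumptions, not taken from the abstract above:
TOTAL_PARAMS = 398e9      # total (not active) parameters of Jamba-1.5-Large
QUANT_FRACTION = 0.9      # illustrative share of weights stored as INT8
GPU_BUDGET_GB = 8 * 80    # 8 GPUs, each counted as 80 GB

bf16_gb = weight_memory_gb(TOTAL_PARAMS, 0.0)
int8_gb = weight_memory_gb(TOTAL_PARAMS, QUANT_FRACTION)

print(f"BF16 weights:       {bf16_gb:.1f} GB (budget {GPU_BUDGET_GB} GB)")
print(f"Mostly-INT8 weights: {int8_gb:.1f} GB (budget {GPU_BUDGET_GB} GB)")
print(f"Left for cache and activations: {GPU_BUDGET_GB - int8_gb:.1f} GB")
```

Under these assumptions the unquantized weights alone exceed the combined GPU memory, while the mostly-INT8 variant leaves headroom for the long-context cache and activations. The estimate ignores quantization scales, framework overhead, and how memory is split across devices.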
