Lucas D. Lingle

LGSep 28, 2023Code

Transformer-VQ: Linear-Time Transformers via Vector Quantization

Lucas D. Lingle

We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time. Transformer-VQ's efficient attention is enabled by vector-quantized keys and a novel caching mechanism. In our large-scale experiments, Transformer-VQ is shown highly competitive in quality, obtaining 0.99 bpb on Enwik8, 26.6 ppl on PG-19, and 3.16 bpb on ImageNet64. In addition, the optimized implementation of Transformer-VQ is over 3x faster than a comparable quadratic-time transformer at sequence length 8k, is over 12x faster at 32k, and can scale to 131k with similar throughput. Code available: \url{https://github.com/transformer-vq/transformer_vq}

LGMar 3, 2021

Meta-Learning with Variational Bayes

Lucas D. Lingle

The field of meta-learning seeks to improve the ability of today's machine learning systems to adapt efficiently to small amounts of data. Typically this is accomplished by training a system with a parametrized update rule to improve a task-relevant objective based on supervision or a reward function. However, in many domains of practical interest, task data is unlabeled, or reward functions are unavailable. In this paper we introduce a new approach to address the more general problem of generative meta-learning, which we argue is an important prerequisite for obtaining human-level cognitive flexibility in artificial agents, and can benefit many practical applications along the way. Our contribution leverages the AEVB framework and mean-field variational Bayes, and creates fast-adapting latent-space generative models. At the heart of our contribution is a new result, showing that for a broad class of deep generative latent variable models, the relevant VB updates do not depend on any generative neural network. The theoretical merits of our approach are reflected in empirical experiments.

Lucas D. Lingle

2 Papers