NEMay 12, 2020

Deep Learning: Our Miraculous Year 1990-1991

arXiv:2005.05744v56 citations
Originality Incremental advance
AI Analysis

This work established core components of modern AI, impacting fields like generative AI and natural language processing.

The paper claims that foundational ideas for deep learning, including Generative Adversarial Networks, Transformers, and LSTM, were published in 1990-1991, leading to widespread adoption on billions of devices and high citation rates.

The Deep Learning Artificial Neural Networks (NNs) of our team have revolutionised Machine Learning & AI. Many of the basic ideas behind this revolution were published within the 12 months of our "Annus Mirabilis" 1990-1991 at our lab in TU Munich. Back then, few people were interested. But a quarter century later, NNs based on our "Miraculous Year" were on over 3 billion devices, and used many billions of times per day, consuming a significant fraction of the world's compute. In particular, in 1990-91, we laid foundations of Generative AI, publishing principles of (1) Generative Adversarial Networks for Artificial Curiosity and Creativity (now used for deepfakes), (2) Transformers (the T in ChatGPT - see the 1991 Unnormalized Linear Transformer), (3) Pre-training for deep NNs (see the P in ChatGPT), (4) NN distillation (key for DeepSeek), and (5) recurrent World Models for Reinforcement Learning and Planning in partially observable environments. The year 1991 also marks the emergence of the defining features of (6) LSTM, the most cited AI paper of the 20th century (based on deep residual learning and constant error flow through residual NN connections), and (7) the most cited paper of the 21st century, based on our LSTM-inspired Highway Net that was 10 times deeper than previous feedforward NNs. As of 2025, the two most frequently cited scientific articles of all time (with the most Google Scholar citations within 3 years - manuals excluded) are both directly based on our 1991 work.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes