ML, LG, AP · Jun 5, 2023

Using Sequences of Life-events to Predict Human Lives

Germans Savcisens, Tina Eliassi-Rad, Lars Kai Hansen, Laust Mortensen, Lau Lilleholt, Anna Rogers, Ingo Zettler, Sune Lehmann

arXiv:2306.03009v123.593 citations · h-index: 44 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of predicting complex life outcomes for individuals, potentially enabling personalized interventions, though it builds incrementally on existing NLP methods applied to new data.

The researchers tackled the problem of predicting human life outcomes by modeling lives as sequences of events, using comprehensive registry data for over six million individuals. They adapted transformer-based architectures from natural language processing to predict diverse outcomes like early mortality and personality nuances, outperforming state-of-the-art models by a wide margin.

Over the past decade, machine learning has revolutionized computers' ability to analyze text through flexible computational models. Due to their structural similarity to written language, transformer-based architectures have also shown promise as tools to make sense of a range of multivariate sequences, from protein structures, music, and electronic health records to weather forecasts. We can also represent human lives in a way that shares this structural similarity to language. From one perspective, lives are simply sequences of events: people are born, visit the pediatrician, start school, move to a new location, get married, and so on. Here, we exploit this similarity to adapt innovations from natural language processing to examine the evolution and predictability of human lives based on detailed event sequences. We do this by drawing on arguably the most comprehensive registry data in existence, available for an entire nation of more than six million individuals across decades. Our data include information about life-events related to health, education, occupation, income, address, and working hours, recorded with day-to-day resolution. We create embeddings of life-events in a single vector space, showing that this embedding space is robust and highly structured. Our models allow us to predict diverse outcomes ranging from early mortality to personality nuances, outperforming state-of-the-art models by a wide margin. Using methods for interpreting deep learning models, we probe the algorithm to understand the factors that enable our predictions. Our framework allows researchers to identify new potential mechanisms that impact life outcomes and associated possibilities for personalized interventions.
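To make the "lives as sequences of events" idea concrete, the following is a minimal sketch of how dated life-events could be turned into a token sequence suitable for a transformer-style model. It is an illustration only, not the authors' pipeline: the `LifeEvent` class, the `category:detail` token format, the `[PAD]`/`[UNK]` special tokens, and all function names are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifeEvent:
    """Hypothetical record for one dated life-event (not the paper's schema)."""
    day: date
    category: str  # e.g. "health", "education", "address"
    detail: str


def to_tokens(events):
    """Order events chronologically and render each as one token string."""
    ordered = sorted(events, key=lambda e: e.day)
    return [f"{e.category}:{e.detail}" for e in ordered]


def days_since_birth(events, birth):
    """Day-resolution ages, usable as a positional signal alongside the tokens."""
    return [(e.day - birth).days for e in sorted(events, key=lambda e: e.day)]


def build_vocab(sequences):
    """Assign an integer id to every distinct token; 0 and 1 are reserved."""
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to ids, truncate to max_len, and right-pad with [PAD]."""
    ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens][:max_len]
    return ids + [vocab["[PAD]"]] * (max_len - len(ids))


# Toy example with invented events, given out of chronological order.
person = [
    LifeEvent(date(2005, 8, 15), "education", "start_school"),
    LifeEvent(date(1999, 3, 1), "health", "pediatrician_visit"),
    LifeEvent(date(2012, 6, 1), "address", "move"),
]
tokens = to_tokens(person)
vocab = build_vocab([tokens])
ids = encode(tokens, vocab, max_len=5)
ages = days_since_birth(person, birth=date(1999, 1, 1))
```

The resulting integer ids are what an embedding layer would consume, in the same way word ids are used in a language model; the day counts are one simple way to carry the day-level timing the abstract mentions.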
