Uncovering Bias in Personal Informatics
This work addresses bias in personal informatics systems, which billions of people use for health monitoring; such bias has practical and ethical implications. The contribution is incremental, however, because it builds on existing bias research and applies it to a specific domain.
The study presents the first comprehensive empirical and analytical investigation of bias in personal informatics systems. It finds that biases exist in both the data generation and the machine learning processes, that minority groups such as users with health issues and female users are most affected, and that intersectional biases also occur.
Personal informatics (PI) systems, powered by smartphones and wearables, enable people to lead healthier lifestyles by providing meaningful and actionable insights that break down barriers between users and their health information. Today, such systems are used by billions of users for monitoring not only physical activity and sleep but also vital signs, women's health, and heart health, among others. Despite the widespread use of these systems, the processing of sensitive PI data may suffer from biases, which may entail practical and ethical implications. In this work, we present the first comprehensive empirical and analytical study of bias in PI systems, including biases in raw data and in the entire machine learning life cycle. We use the most detailed framework to date for exploring the different sources of bias and find that biases exist both in the data generation and the model learning and implementation streams. According to our results, the most affected minority groups are users with health issues, such as diabetes, joint issues, and hypertension, and female users, whose data biases are propagated or even amplified by learning models, while intersectional biases can also be observed.
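To make the notion of a data bias being "propagated or even amplified" by a model concrete, the following is a minimal sketch and not the framework used in the study. It assumes a binary outcome and compares the gap in positive rates between groups in the recorded labels with the same gap in the model's predictions. The function names, the record fields (`group`, `label`, `pred`), and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def positive_rates(records, field):
    """Share of positive values of `field` per group (hypothetical record schema)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for rec in records:
        counts[rec["group"]][0] += rec[field]
        counts[rec["group"]][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def rate_gap(rates):
    """Largest difference in positive rate between any two groups."""
    return max(rates.values()) - min(rates.values())


def amplification(records):
    """Return (gap in labels, gap in predictions, difference between the two).

    A positive difference means the model widens a gap already present
    in the data; zero means the gap is carried through unchanged.
    """
    data_gap = rate_gap(positive_rates(records, "label"))
    model_gap = rate_gap(positive_rates(records, "pred"))
    return data_gap, model_gap, model_gap - data_gap


# Made-up toy records, not data from the study.
records = [
    {"group": "female", "label": 1, "pred": 0},
    {"group": "female", "label": 0, "pred": 0},
    {"group": "female", "label": 1, "pred": 1},
    {"group": "female", "label": 0, "pred": 0},
    {"group": "male", "label": 1, "pred": 1},
    {"group": "male", "label": 1, "pred": 1},
    {"group": "male", "label": 0, "pred": 1},
    {"group": "male", "label": 1, "pred": 1},
]

data_gap, model_gap, delta = amplification(records)
print(f"label gap={data_gap:.2f}, prediction gap={model_gap:.2f}, change={delta:+.2f}")
```

On these toy records the gap grows from 0.25 in the labels to 0.75 in the predictions, which is the pattern the abstract describes as amplification. Intersectional analysis would follow the same idea with groups defined by combinations of attributes (for example, gender together with a health condition) rather than a single attribute.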