LG AI HC | Oct 30, 2024

SpiroActive: Active Learning for Efficient Data Acquisition for Spirometry

Ankita Kumari Jain, Nitish Sharma, Madhav Kanda, Nipun Batra

arXiv:2410.22950v12.62 citations | h-index: 18

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of efficient data acquisition for wearable spirometry, which could improve early diagnosis of respiratory illnesses like COPD, though it is incremental as it applies an existing active learning method to a specific domain.

The paper tackles the problem of high costs and limited accessibility in collecting ground truth spirometry data for wearable respiratory monitoring by proposing an active learning approach to strategically select samples, achieving comparable or better model performance with small subsets compared to using the full dataset.

Respiratory illnesses are a significant global health burden. Chronic obstructive pulmonary disease (COPD), the primary contributor, is the seventh leading cause of poor health and the third leading cause of death worldwide, causing 3.23 million deaths in 2019, which makes early identification and diagnosis essential for effective mitigation. Among the diagnostic tools employed, spirometry plays a crucial role in detecting respiratory abnormalities. However, conventional clinical spirometry often entails considerable cost and practical limitations, such as the need for specialized equipment, trained personnel, and a dedicated clinical setting, which make it less accessible. To address these challenges, wearable spirometry technologies have emerged as promising alternatives, offering accurate, cost-effective, and convenient solutions. Developing machine learning models for wearable spirometry relies heavily on high-quality ground truth spirometry data, which is laborious and expensive to collect. In this research, we propose using active learning, a sub-field of machine learning, to mitigate the challenges associated with data collection and labeling. By strategically selecting which samples to measure with the ground truth spirometer, we can reduce the need for resource-intensive data collection. We present evidence that models trained on small subsets obtained through active learning achieve comparable or better results than models trained on the complete dataset.
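To make the idea of strategically selecting samples concrete, here is a minimal sketch of a pool-based active learning loop. The abstract does not say which query strategy or model the authors use, so everything below is an assumption for illustration: the strategy is query-by-committee (bootstrap ridge regressors, picking the pool point with the highest prediction variance), the data are synthetic rather than spirometry measurements, and all function names (`fit_ridge`, `committee_variance`, `active_learning_loop`) are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression; returns the weight vector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def committee_variance(X_lab, y_lab, X_pool, rng, n_members=10):
    """Disagreement (prediction variance) of a bootstrap committee on each pool point."""
    preds = []
    for _ in range(n_members):
        # Each committee member is trained on a bootstrap resample of the labeled set.
        idx = rng.integers(0, len(X_lab), size=len(X_lab))
        w = fit_ridge(X_lab[idx], y_lab[idx])
        preds.append(X_pool @ w)
    return np.var(np.stack(preds), axis=0)

def active_learning_loop(X, y, n_initial=5, n_queries=20, seed=0):
    """Start from a few random labels, then repeatedly query the most contested pool point."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(X), size=n_initial, replace=False)]
    labeled_set = set(labeled)
    pool = [i for i in range(len(X)) if i not in labeled_set]
    for _ in range(n_queries):
        scores = committee_variance(X[labeled], y[labeled], X[pool], rng)
        # "Querying the oracle" here just means revealing y for the chosen index.
        pick = pool.pop(int(np.argmax(scores)))
        labeled.append(pick)
    return labeled, fit_ridge(X[labeled], y[labeled])

# Synthetic stand-in for (wearable features, ground truth measurement) pairs.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 4))
true_w = np.array([1.5, -2.0, 0.5, 3.0])
y = X @ true_w + data_rng.normal(scale=0.1, size=200)

labeled, w = active_learning_loop(X, y)
print(f"labeled {len(labeled)} of {len(X)} samples")
print("estimated weights:", np.round(w, 2))
```

In this toy setting only 25 of the 200 pool points are ever labeled. In a real study, the step that reveals `y` would correspond to taking a ground truth spirometer reading, which is the costly operation the selection strategy tries to minimize.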
