SensorQA: A Question Answering Benchmark for Daily-Life Monitoring
SensorQA addresses the lack of datasets for human-interpretable sensor data analysis, though the contribution is incremental, since it builds on existing QA and sensor monitoring research.
The authors tackle the problem of enabling end users to extract insights from sensor data by introducing SensorQA, the first human-created question-answering dataset for daily-life monitoring. The dataset includes 5.6K queries, and benchmarks on it show a performance gap in current AI models.
With the rapid growth in sensor data, effectively interpreting and interfacing with these data in a human-understandable way has become crucial. While existing research primarily focuses on learning classification models, fewer studies have explored how end users can actively extract useful insights from sensor data, and such work is often hindered by the lack of a proper dataset. To address this gap, we introduce SensorQA, the first human-created question-answering (QA) dataset for long-term time-series sensor data for daily-life monitoring. SensorQA is created by human workers and includes 5.6K diverse and practical queries that reflect genuine human interests, paired with accurate answers derived from sensor data. We further establish benchmarks for state-of-the-art AI models on this dataset and evaluate their performance on typical edge devices. Our results reveal a gap between current models and optimal QA performance and efficiency, highlighting the need for new contributions. The dataset and code are available at: https://github.com/benjamin-reichman/SensorQA.