AI · Feb 2

DomusFM: A Foundation Model for Smart-Home Sensor Data

arXiv:2602.01910v1 · h-index: 17
Originality: Incremental advance
AI Analysis

DomusFM addresses data scarcity and deployability in real-world smart-home systems, offering a practical option for applications such as healthcare monitoring and assistive technologies. The advance is incremental, since it builds on existing foundation-model concepts.

The paper introduces DomusFM, a foundation model for smart-home sensor data that uses self-supervised dual contrastive learning to learn generalizable representations. In leave-one-dataset-out evaluation across seven public datasets, it outperforms state-of-the-art baselines on downstream tasks even when only 5% of the labeled training data is available for fine-tuning.

Smart-home sensor data holds significant potential for several applications, including healthcare monitoring and assistive technologies. Existing approaches, however, face critical limitations. Supervised models require impractical amounts of labeled data. Foundation models for activity recognition focus only on inertial sensors, failing to address the unique characteristics of smart-home binary sensor events: their sparse, discrete nature combined with rich semantic associations. LLM-based approaches, while tested in this domain, still raise several issues regarding the need for natural language descriptions or prompting, and reliance on either external services or expensive hardware, making them infeasible in real-life scenarios due to privacy and cost concerns. We introduce DomusFM, the first foundation model specifically designed and pretrained for smart-home sensor data. DomusFM employs a self-supervised dual contrastive learning paradigm to capture both token-level semantic attributes and sequence-level temporal dependencies. By integrating semantic embeddings from a lightweight language model and specialized encoders for temporal patterns and binary states, DomusFM learns generalizable representations that transfer across environments and tasks related to activity and event analysis. Through leave-one-dataset-out evaluation across seven public smart-home datasets, we demonstrate that DomusFM outperforms state-of-the-art baselines on different downstream tasks, achieving superior performance even with only 5% of labeled training data available for fine-tuning. Our approach addresses data scarcity while maintaining practical deployability for real-world smart-home systems.
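The abstract does not spell out the dual contrastive objective, so the following is only a minimal sketch of how a token-level and a sequence-level contrastive term could be combined. It assumes an InfoNCE-style loss over paired embeddings; the function names `info_nce` and `dual_contrastive_loss`, the weighting `alpha`, and the `temperature` value are all hypothetical and not taken from the paper.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `anchors` should match row i of `positives`;
    every other row in the batch acts as a negative. (Assumed loss form.)"""
    # L2-normalise so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    # Subtract the row-wise max for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))

def dual_contrastive_loss(tok_a, tok_p, seq_a, seq_p, alpha=0.5, temperature=0.1):
    """Weighted sum of a token-level and a sequence-level contrastive term.
    `alpha` is a hypothetical mixing weight, not a value from the paper."""
    token_term = info_nce(tok_a, tok_p, temperature)
    sequence_term = info_nce(seq_a, seq_p, temperature)
    return alpha * token_term + (1.0 - alpha) * sequence_term

# Toy embeddings standing in for encoder outputs (random, for illustration only).
rng = np.random.default_rng(0)
tok = rng.normal(size=(8, 16))   # 8 sensor-event tokens, 16-dim embeddings
seq = rng.normal(size=(4, 16))   # 4 event sequences, 16-dim embeddings

# Positives that are slightly perturbed copies of the anchors.
aligned = dual_contrastive_loss(
    tok, tok + 0.01 * rng.normal(size=tok.shape),
    seq, seq + 0.01 * rng.normal(size=seq.shape),
)
# Positives paired with the wrong anchors (row order reversed).
shuffled = dual_contrastive_loss(tok, tok[::-1], seq, seq[::-1])

print(f"aligned pairs:  {aligned:.4f}")
print(f"shuffled pairs: {shuffled:.4f}")
```

In this toy setup the loss is lower when each anchor is paired with its own perturbed copy than when the pairing is reversed, which is the behaviour a contrastive objective is meant to reward. How DomusFM actually constructs positive pairs at the token and sequence levels is not stated in the abstract.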

