Sensei: Self-Supervised Sensor Name Segmentation
This work automates sensor name segmentation for smart building application developers, reducing the manual annotation effort required for each building.
This paper addresses the challenge of automatically segmenting sensor names in smart buildings. These names are often vendor-specific and require significant manual effort to annotate. The authors propose Sensei, a self-supervised framework that uses a neural language model to learn sensor naming structures and induce self-supervision for segmentation, outperforming baseline methods across five real-world buildings.
A sensor name, typically an alphanumeric string, encodes the key context (e.g., function and location) of a sensor needed for deploying smart building applications. Sensor names, however, are curated in a building vendor-specific manner using different structures and vocabularies that are often esoteric. They thus require tremendous manual effort to annotate on a per-building basis, even just to segment them into meaningful chunks. In this paper, we propose a fully automated self-supervised framework, Sensei, which can learn to segment sensor names without any human annotation. Specifically, we employ a neural language model to capture the underlying sensor naming structure and then induce self-supervision based on information from the language model to build the segmentation model. Extensive experiments on five real-world buildings comprising thousands of sensors demonstrate the superiority of Sensei over baseline methods.
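The general idea of inducing segmentation labels from a language model can be illustrated with a small sketch. This is not the Sensei implementation: the abstract does not specify the model or the label-induction rule, so the sketch swaps in a smoothed character-bigram count model for the neural language model and assumes a simple rule (a transition with low predicted probability is treated as a chunk boundary). The sensor names, the function names, and the `threshold` value are all hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram_lm(names):
    """Count character bigrams. A stand-in for the neural language model."""
    counts = defaultdict(Counter)
    vocab = set()
    for name in names:
        vocab.update(name)
        for prev, nxt in zip(name, name[1:]):
            counts[prev][nxt] += 1
    return counts, vocab

def transition_prob(counts, vocab, prev, nxt):
    """Add-one smoothed P(nxt | prev), with one extra slot for unseen characters."""
    total = sum(counts[prev].values())
    return (counts[prev][nxt] + 1) / (total + len(vocab) + 1)

def pseudo_boundaries(counts, vocab, name, threshold):
    """Assumed rule: label i is 1 (boundary after name[i]) if the transition is unlikely."""
    return [
        1 if transition_prob(counts, vocab, a, b) < threshold else 0
        for a, b in zip(name, name[1:])
    ]

def segment(name, boundaries):
    """Split a name after every position whose boundary label is 1."""
    chunks, start = [], 0
    for i, is_boundary in enumerate(boundaries):
        if is_boundary:
            chunks.append(name[start:i + 1])
            start = i + 1
    chunks.append(name[start:])
    return chunks

# Hypothetical sensor names, invented for illustration only.
NAMES = ["AHU1.ZONE2.TEMP", "AHU1.ZONE3.TEMP", "AHU2.ZONE1.FLOW", "AHU2.ZONE4.TEMP"]

counts, vocab = train_bigram_lm(NAMES)
labels = pseudo_boundaries(counts, vocab, NAMES[0], threshold=0.15)  # threshold is arbitrary
print(segment(NAMES[0], labels))
```

In a self-supervised setup, pseudo-labels of this kind would serve as training targets for a separate segmentation model rather than as the final output. How well a fixed threshold works depends heavily on the naming data, so it should be treated as a tunable assumption.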