LG SY · Mar 10, 2022

Conditional Synthetic Data Generation for Personal Thermal Comfort Models

Berkeley

arXiv:2203.05242v2 · 3.33 citations · h-index: 56

Originality Synthesis-oriented

AI Analysis

This work addresses data scarcity and privacy concerns in building energy management systems, but its contribution is incremental: it applies an existing method to a specific domain.

The paper tackles class imbalance in personal thermal comfort data, where 'Prefer No Change' samples dominate. It proposes a conditional synthetic data generator that produces samples for the low-frequency classes, and shows that the generated data mimics the real distribution.

Personal thermal comfort models aim to predict an individual's thermal comfort response rather than the average response of a large group. Recently, machine learning algorithms have shown enormous potential as candidates for personal thermal comfort models. However, under the normal settings of a building, personal thermal comfort data obtained via experiments are often heavily class-imbalanced: there is a disproportionately high number of data samples for the "Prefer No Change" class compared with the "Prefer Warmer" and "Prefer Cooler" classes. Machine learning algorithms trained on such class-imbalanced data perform sub-optimally when deployed in the real world. To develop robust machine learning-based applications using this class-imbalanced data, as well as to enable privacy-preserving data sharing, we propose to implement a state-of-the-art conditional synthetic data generator to generate synthetic data corresponding to the low-frequency classes. Via experiments, we show that the generated synthetic data has a distribution that mimics the real data distribution. The proposed method can be extended to other smart building datasets and use cases.
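The class-rebalancing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's generator: the per-class Gaussian sampler below is a deliberately simple stand-in for a conditional generative model, and the function names, the single temperature feature, and all data values are hypothetical and chosen only for illustration.

```python
from collections import Counter
import random
import statistics


def synthetic_counts_needed(labels):
    """Number of samples to generate per class so each class matches the majority count."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def sample_conditional(features_by_class, cls, n, seed=0):
    """Toy stand-in for a conditional generator (assumption, not the paper's model).

    Fits a Gaussian to one feature of the requested class and samples from it.
    """
    rng = random.Random(seed)
    values = features_by_class[cls]
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # needs at least two real samples per class
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical imbalanced label set: "Prefer No Change" dominates.
labels = ["Prefer No Change"] * 8 + ["Prefer Warmer"] * 2 + ["Prefer Cooler"] * 3

# Hypothetical air temperatures (degrees C) recorded for the minority classes.
features_by_class = {
    "Prefer Warmer": [20.5, 21.0],
    "Prefer Cooler": [26.0, 26.5, 27.2],
}

needed = synthetic_counts_needed(labels)
synthetic_warmer = sample_conditional(
    features_by_class, "Prefer Warmer", needed["Prefer Warmer"]
)
```

Conditioning on the class label is what lets the generator be asked only for the under-represented classes, so the majority class is left untouched while the minority classes are topped up to the same count.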
