CY Apr 21
Catalyzing Informed Residential Energy Retrofit Decisions via Domain-Specific LLM
Lei Shu, Dong Zhao, Jianli Chen et al.
Residential energy retrofit initiation is often stalled by an expertise gap: homeowners lack the technical literacy required for structured building energy assessments and are left in low-information environments with fragmented sources. To bridge this gap, this study reports a domain-specific large language model (LLM) designed to catalyze informed decision-making based solely on homeowner-accessible, natural-language descriptions, e.g., building age, size, and location. The model is created by applying parameter-efficient low-rank adaptation (LoRA) fine-tuning to a large corpus grounded in physics-based energy simulations and techno-economic calculations from 536,416 U.S. residential building prototypes. Nine major retrofit categories are evaluated, including envelope upgrades, HVAC systems, and renewable energy installations. Validation against physics-grounded benchmarks shows that the LLM consistently identifies high-quality retrofit options, achieving top-3 hit rates of 98.9% for maximum CO2 reduction and 93.3% for the shortest discounted payback period. Moreover, the model is robust to incomplete input, maintaining stable performance even when only 60% of the basic dwelling description is specified. By significantly lowering the information "activation energy" for non-expert users while maintaining scientific rigor, this physics-based AI model offers a scalable pathway for parallelized, user-centered decision-making, accelerating cumulative energy savings and emission reductions at community and national scales.
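The top-3 hit rate reported above can be illustrated with a minimal sketch. The abstract does not define the metric precisely, so this assumes a "hit" means the benchmark-best retrofit option appears among the model's first three recommendations. The function name `top_k_hit_rate`, the option labels, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def top_k_hit_rate(
    predictions: Sequence[Sequence[str]],
    best_options: Sequence[str],
    k: int = 3,
) -> float:
    """Fraction of cases whose benchmark-best option is in the first k predictions.

    Assumed definition (not confirmed by the source): one benchmark-best
    option per building, compared against the model's ranked list.
    """
    if len(predictions) != len(best_options):
        raise ValueError("predictions and best_options must have equal length")
    if not predictions:
        raise ValueError("need at least one case")
    hits = sum(
        1 for ranked, best in zip(predictions, best_options) if best in ranked[:k]
    )
    return hits / len(predictions)


# Hypothetical toy data: ranked recommendations for four buildings.
predicted = [
    ["heat_pump", "attic_insulation", "solar_pv"],
    ["air_sealing", "windows", "heat_pump"],
    ["solar_pv", "windows", "air_sealing"],
    ["windows", "air_sealing", "attic_insulation"],
]
# Hypothetical benchmark-best option for each building.
benchmark_best = ["solar_pv", "heat_pump", "attic_insulation", "windows"]

rate = top_k_hit_rate(predicted, benchmark_best, k=3)
print(f"top-3 hit rate: {rate:.2f}")
```

With this toy data, three of the four benchmark-best options fall inside the top three, so the rate is 0.75; only one is ranked first, so the top-1 rate would be 0.25.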
CR Jan 7, 2020
Effective Scaling of Blockchain Beyond Consensus Innovations and Moore's Law
Yinqiu Liu, Kai Qian, Jianli Chen et al.
As an emerging technology, blockchain has achieved great success in numerous application scenarios, from intelligent healthcare to smart cities. However, a long-standing bottleneck hindering its further development is the massive resource consumption of its distributed storage and computation methods, which leaves blockchain with insufficient performance and poor scalability. Here, we analyze recent blockchain techniques and demonstrate that the potential of widely adopted consensus-based scaling is seriously limited, especially now that Moore's law-based hardware scaling is nearing its end. We do so by developing an open-source benchmarking tool, called Prism, to investigate the key factors behind low resource efficiency; we then discuss various topology and hardware innovations that could help scale up blockchain. To the best of our knowledge, this is the first in-depth study to explore next-generation scaling strategies through large-scale, comprehensive benchmarking.
CL Feb 21, 2025
Constructing a Norm for Children's Scientific Drawing: Distribution Features Based on Semantic Similarity of Large Language Models
Yi Zhang, Fan Wei, Jingyi Li et al.
Using children's drawings to examine their conceptual understanding has proven to be an effective method, but previous research has two major problems: (1) the content of the drawings depends heavily on the task, so the ecological validity of the conclusions is low; (2) the interpretation of drawings relies too much on the subjective impressions of the researchers. To address these issues, this study uses a large language model (LLM) to identify 1420 children's scientific drawings (covering 9 scientific themes/concepts) and uses the word2vec algorithm to calculate their semantic similarity. The study explores whether children produce consistent drawing representations of the same theme and attempts to establish a norm for children's scientific drawings, providing a baseline reference for follow-up research on children's drawings. The results show that the representations of most drawings are consistent, with most semantic similarity values above 0.8. At the same time, the consistency of the representations was found to be independent of the accuracy of the LLM's recognition, indicating the existence of consistency bias. In the subsequent exploration of influencing factors, we used the Kendall rank correlation coefficient to investigate the effects of "sample size", "abstract degree", and "focus points" on drawings, and used word frequency statistics to explore whether children represented abstract themes/concepts by reproducing what was taught in class. Accuracy of the LLM's recognition was found to be the most sensitive indicator, with measures such as sample size and semantic similarity related to it. The consistency between classroom experiments and teaching purpose is also an important factor: many students focus more on the experiments themselves than on what they explain.
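A minimal sketch of the consistency check described above, assuming that each drawing's LLM-generated description is mapped to an embedding vector and that consistency within a theme is measured as the mean pairwise cosine similarity. The aggregation method is an assumption; the abstract states only that word2vec semantic similarity was computed and that most values exceed 0.8. The short vectors below are hypothetical stand-ins for real word2vec embeddings.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(vectors: Sequence[Sequence[float]]) -> float:
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Hypothetical 2-D stand-ins for the embeddings of three drawings on one theme.
theme_vectors = [[1.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

mean_similarity = mean_pairwise_similarity(theme_vectors)
is_consistent = mean_similarity > 0.8  # threshold mentioned in the abstract
print(f"mean similarity: {mean_similarity:.3f}, consistent: {is_consistent}")
```

In practice the vectors would come from a trained word2vec model rather than being written by hand; only the similarity arithmetic is shown here.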
LG May 27, 2021
Times Series Forecasting for Urban Building Energy Consumption Based on Graph Convolutional Network
Yuqing Hu, Xiaoyuan Cheng, Suhang Wang et al.
The world is increasingly urbanizing, and the building industry accounts for more than 40% of energy consumption in the United States. To improve urban sustainability, many cities adopt ambitious energy-saving strategies through retrofitting existing buildings and constructing new communities. In this context, an accurate urban building energy model (UBEM) is the foundation for designing energy-efficient communities. However, current UBEMs are limited in their ability to capture inter-building interdependencies because these interdependencies are dynamic and non-linear. Existing models either ignore or oversimplify them, which can substantially affect the accuracy of urban energy modeling. To fill this research gap, this study proposes a novel data-driven UBEM that combines solar-based building interdependency with a spatial-temporal graph convolutional network (ST-GCN) algorithm. Specifically, we took a university campus located in downtown Atlanta as an example and predicted its hourly energy consumption. Furthermore, we tested the feasibility of the proposed model by comparing the performance of the ST-GCN model with that of other common time-series machine learning models. The results indicate that the ST-GCN model outperforms the others overall. In addition, the physical knowledge embedded in the model is readily interpretable. The discussion suggests that data-driven models that integrate engineering or physical knowledge can significantly improve urban building energy simulation.
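To make the graph-convolution idea concrete, the sketch below shows a single spatial graph-convolution step using the widely used symmetric normalization with self-loops, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W). It is not the authors' ST-GCN: the temporal convolution is omitted, and the adjacency matrix, features, and weights are hypothetical. In the study's setting, the adjacency entries would encode solar-based interdependency between buildings.

```python
import numpy as np


def normalize_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1)  # strictly positive thanks to the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(
    adj_norm: np.ndarray, features: np.ndarray, weights: np.ndarray
) -> np.ndarray:
    """One graph-convolution step: aggregate neighbors, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weights, 0.0)


# Hypothetical graph of three buildings in a row: 0 -- 1 -- 2.
adjacency = np.array(
    [
        [0.0, 1.0, 0.0],
        [1.0, 0.0, 1.0],
        [0.0, 1.0, 0.0],
    ]
)
features = np.array([[1.0], [2.0], [3.0]])  # e.g. one energy reading per building
weights = np.array([[1.0]])  # hypothetical 1x1 weight; learned in practice

adj_norm = normalize_adjacency(adjacency)
hidden = graph_conv(adj_norm, features, weights)
print(hidden)
```

Each building's output mixes its own reading with those of its neighbors, which is how a graph layer lets one building's energy use inform the prediction for another.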