TY - JOUR
T1 - Semi-supervised learning based framework for urban level building electricity consumption prediction
AU - Jin, Xiaoyu
AU - Xiao, Fu
AU - Zhang, Chong
AU - Chen, Zhijie
N1 - Funding Information:
The authors gratefully acknowledge the support of this research by the National Key Research and Development Program of China (2021YFE0107400), Hong Kong Scholars Program (XJ2019044) and the Research Grant Council of the Hong Kong SAR (152133/19E).
Publisher Copyright:
© 2022
PY - 2022/12/15
Y1 - 2022/12/15
AB - The spatial feature of building energy consumption in a city is essential for urban level energy planning and policy making. With the increasing availability of urban level building energy benchmarking datasets, machine learning has shown a powerful capability of making data-driven predictions on urban level building energy consumption. However, the building energy benchmarking datasets usually only cover large buildings, which are not sufficient representations of all buildings in a city. Besides building energy benchmarking datasets, many other urban level open datasets are also valuable to building energy prediction, but they do not contain building energy use data, in other words, they are unlabeled. This study proposes a novel framework based on semi-supervised learning to make effective use of the unlabeled datasets to develop more generic urban level data-driven building energy prediction models, and energy mapping with higher space resolution. The framework consists of preliminary labeling, selection of pseudo labeled samples and predictive modelling. Several machine learning algorithms are proposed and compared for generating pseudo labels of building electricity consumption for unlabeled datasets of small and medium-sized buildings. A selection process consisting of convergence testing and screening is designed to select pseudo labeled samples with high credibility to enlarge the labeled dataset. A novel two-level performance evaluation method is proposed to evaluate the performance of the framework at both urban level and district level to enhance the spatial resolution of the predictions. The framework is implemented to model and map the individual electricity consumptions of all buildings in two years in the districts of New York City using multiple open datasets. The results show significant improvement in terms of prediction accuracy at both levels. In addition, the applicability of the model to various buildings in the city is remarkably enhanced.
KW - Building electricity consumption
KW - Credibility measurement
KW - Open data
KW - Semisupervised learning
KW - Urban building energy modeling
UR - http://www.scopus.com/inward/record.url?scp=85142444995&partnerID=8YFLogxK
U2 - 10.1016/j.apenergy.2022.120210
DO - 10.1016/j.apenergy.2022.120210
M3 - Journal article
AN - SCOPUS:85142444995
SN - 0306-2619
VL - 328
JO - Applied Energy
JF - Applied Energy
M1 - 120210
ER -