TY - JOUR
T1 - GEIN
T2 - An interpretable benchmarking framework towards all building types based on machine learning
AU - Jin, Xiaoyu
AU - Xiao, Fu
AU - Zhang, Chong
AU - Li, Ao
N1 - Funding Information:
The authors gratefully acknowledge the support of this research by the National Key Research and Development Program of China (2021YFE0107400), Hong Kong Scholars Program (XJ2019044) and the Research Grant Council of the Hong Kong SAR (152133/19E).
Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2022/4/1
Y1 - 2022/4/1
N2 - Building energy performance benchmarking is adopted by many countries as an effective tool for reducing energy consumption at the city or country level. Machine learning holds considerable promise for quickly and accurately predicting energy consumption from massive data, which makes it suitable for large-scale performance assessment. However, many datasets suffer from a severe imbalance across building types. Because samples are lacking for some building types, unfavorable results, such as low prediction accuracy, are sometimes produced. Meanwhile, the poor interpretability of machine learning models makes it difficult to promote benchmarking frameworks based on machine learning. Therefore, this study proposed a novel machine learning based building performance benchmarking framework with improved generalization and interpretability. A reliable and convenient data augmentation approach was established to overcome the data imbalance problem while avoiding overfitting. Superior results were obtained in case studies using three city-level open-source building datasets from two different countries. A complete rating framework was also proposed, with proper explanations of results at the sample level. The performance of this rating framework was verified by comparison with other data-driven benchmarking frameworks. Moreover, the importance of variables was quantified and ranked, which can serve as a significant reference for data collectors and publishers. The results demonstrated that data augmentation can effectively solve the problem of data imbalance, which enables machine learning based benchmarking to be applied to all types of buildings. The proposed GEIN benchmarking framework can also effectively address the issue of interpretability.
AB - Building energy performance benchmarking is adopted by many countries as an effective tool for reducing energy consumption at the city or country level. Machine learning holds considerable promise for quickly and accurately predicting energy consumption from massive data, which makes it suitable for large-scale performance assessment. However, many datasets suffer from a severe imbalance across building types. Because samples are lacking for some building types, unfavorable results, such as low prediction accuracy, are sometimes produced. Meanwhile, the poor interpretability of machine learning models makes it difficult to promote benchmarking frameworks based on machine learning. Therefore, this study proposed a novel machine learning based building performance benchmarking framework with improved generalization and interpretability. A reliable and convenient data augmentation approach was established to overcome the data imbalance problem while avoiding overfitting. Superior results were obtained in case studies using three city-level open-source building datasets from two different countries. A complete rating framework was also proposed, with proper explanations of results at the sample level. The performance of this rating framework was verified by comparison with other data-driven benchmarking frameworks. Moreover, the importance of variables was quantified and ranked, which can serve as a significant reference for data collectors and publishers. The results demonstrated that data augmentation can effectively solve the problem of data imbalance, which enables machine learning based benchmarking to be applied to all types of buildings. The proposed GEIN benchmarking framework can also effectively address the issue of interpretability.
KW - Data augmentation
KW - EUI prediction
KW - GEIN
KW - Interpretable building energy benchmarking
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85124602035&partnerID=8YFLogxK
U2 - 10.1016/j.enbuild.2022.111909
DO - 10.1016/j.enbuild.2022.111909
M3 - Journal article
AN - SCOPUS:85124602035
SN - 0378-7788
VL - 260
JO - Energy and Buildings
JF - Energy and Buildings
M1 - 111909
ER -