Predicting building-related carbon emissions: A test of machine learning models

Emmanuel B. Boateng, Emmanuella A. Twumasi, Amos Darko, Mershack O. Tetteh, Albert P.C. Chan

Research output: Chapter in book / Conference proceeding › Chapter in an edited book (as author) › Academic research › peer-review


This chapter evaluates and compares the performance of six machine-learning (ML) algorithms in predicting China’s building-related carbon emissions. The models take five input parameters that influence building-related CO₂ emissions: urbanisation, R&D, population size, GDP, and energy use. The study used quarterly data for 1971Q1–2014Q4 (176 observations) to develop, calibrate, and validate the models. Each model was developed on 140 observations and validated on the remaining 36. To tune the models on a comparable basis, a 10-fold cross-validation approach was used to select the optimal hyperparameters and their associated arguments. The results indicate that the random forest (RF) model attained the highest coefficient of determination (R²), 99.88%, followed by k-nearest neighbour (KNN) (99.87%), extreme gradient boosting (XGBoost) (99.77%), decision tree (DT) (99.63%), adaptive boosting (AdaBoost) (99.56%), and support vector regression (SVR) (97.67%). Overall, RF is the most accurate ML algorithm for predicting building-related CO₂ emissions, whereas DT is the most time-efficient. The KNN model is highly recommended when practitioners need accurate predictions quickly. RF, KNN, and DT models could be added to the toolkits of environmental policymakers to provide high-quality forecasts and patterns of building-related CO₂ emissions accurately and in real time.
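The workflow summarised in the abstract (176 quarterly observations, a 140/36 development/validation split, 10-fold cross-validation for hyperparameter selection, and R² as the comparison metric) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The data are synthetic, the feature names and weights are hypothetical, only a hand-written KNN regressor is tuned (over an arbitrary grid of k values), and the split is a seeded random one because the abstract does not say how the 36 validation observations were drawn.

```python
import random

# Hypothetical column names for the five inputs named in the abstract.
FEATURES = ["urbanisation", "r_and_d", "population", "gdp", "energy_use"]

def quarter_labels(start_year, end_year):
    """Labels such as '1971Q1' for every quarter in the inclusive year range."""
    return [f"{y}Q{q}" for y in range(start_year, end_year + 1) for q in range(1, 5)]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    return sum(target for _, target in dists[:k]) / k

def k_fold_indices(n, n_folds, rng):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def cv_score(X, y, k, folds):
    """Mean R² over the folds for a KNN regressor with k neighbours."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        fit_idx = [i for i in range(len(X)) if i not in held_out]
        fit_X = [X[i] for i in fit_idx]
        fit_y = [y[i] for i in fit_idx]
        preds = [knn_predict(fit_X, fit_y, X[i], k) for i in fold]
        scores.append(r_squared([y[i] for i in fold], preds))
    return sum(scores) / len(scores)

rng = random.Random(0)
labels = quarter_labels(1971, 2014)  # 44 years x 4 quarters = 176 observations

# Synthetic stand-in data: NOT the chapter's dataset. Weights are arbitrary.
X = [[rng.random() for _ in FEATURES] for _ in labels]
weights = [0.5, 0.2, 1.0, 1.5, 2.0]
y = [sum(w * v for w, v in zip(weights, row)) + rng.gauss(0, 0.05) for row in X]

# Assumed seeded random 140/36 split (the abstract does not specify the scheme).
order = list(range(len(labels)))
rng.shuffle(order)
train_idx, valid_idx = order[:140], order[140:]
train_X = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]
valid_X = [X[i] for i in valid_idx]
valid_y = [y[i] for i in valid_idx]

# 10-fold cross-validation on the development set to choose k.
folds = k_fold_indices(len(train_X), 10, rng)
candidates = [1, 3, 5, 7, 9]  # arbitrary illustrative grid
best_k = max(candidates, key=lambda k: cv_score(train_X, train_y, k, folds))

# Final check on the 36 held-out observations.
valid_pred = [knn_predict(train_X, train_y, x, best_k) for x in valid_X]
valid_r2 = r_squared(valid_y, valid_pred)
print(f"best k = {best_k}, validation R^2 = {valid_r2:.4f}")
```

In practice the same pattern would be repeated for each of the six model families with a library implementation, and the resulting validation R² values compared; the numbers printed here reflect only the synthetic data and say nothing about the chapter's results.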

Original language: English
Title of host publication: Studies in Computational Intelligence
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 20
Publication status: Published - Sep 2020

Publication series

Name: Studies in Computational Intelligence
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503


Keywords

  • Adaptive boosting
  • Building emissions
  • Decision tree
  • Extreme gradient boosting
  • K-nearest neighbour
  • Machine learning
  • Predicting
  • Random forest
  • Support vector regression

ASJC Scopus subject areas

  • Artificial Intelligence
