Insights into ensemble learning-based data-driven model for safety-related property of chemical substances

Zihao Wang, Huaqiang Wen, Yang Su, Weifeng Shen (Corresponding Author), Jingzheng Ren, Yingjie Ma, Jie Li

Research output: Journal article publication, Journal article, Academic research, peer-review

20 Citations (Scopus)


Risk assessment based on the characteristics of chemicals can prevent accidents caused by flammable and combustible liquids and gases in the process industries. However, its application is limited by the lack of safety-related property data for many chemicals of interest, which drives the demand for accurate predictive models to evaluate the inherent safety implications of chemicals. In this research, stacking-based ensemble learning is comprehensively investigated for safety-related properties to assist risk assessment. Using molecular structure-based features, individual and ensemble models are built with heterogeneous machine learning (ML) methods and compared. The systematic ensemble learning workflow is demonstrated through a case study on the flash points of chemical substances. Several representative ML methods are considered, including multiple linear regression, extreme learning machine, feedforward neural network, and support vector machine. The ensemble models exhibit higher predictive accuracy than the standard individual ML models, indicating the effectiveness of ensemble learning in improving model performance. Moreover, external evaluations against existing models, as well as internal analyses across functional group-based organic compound families and structural feature-based data-driven categories, are carried out to assess model reliability. Ensemble learning is thus demonstrated to be an effective approach for high-performance predictive modeling in safety-related risk assessment.
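As a rough illustration of the stacking idea described in the abstract, the sketch below combines two of the named base learners (multiple linear regression and an extreme learning machine) through a linear meta-learner trained on out-of-fold predictions. It is not the authors' workflow: the data are synthetic stand-ins for molecular features and flash points, the feedforward neural network and support vector machine are omitted for brevity, and all function names (`fit_linear`, `fit_elm`, `out_of_fold`, `rmse`) and hyperparameters are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear(X, y):
    """Least-squares linear model with intercept; returns a predict function."""
    A = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ w

def fit_elm(X, y, n_hidden=40, seed=0):
    """Extreme learning machine: fixed random hidden layer, least-squares output layer."""
    r = np.random.default_rng(seed)
    W = r.normal(size=(X.shape[1], n_hidden))
    b = r.normal(size=n_hidden)
    hidden = lambda Z: np.tanh(Z @ W + b)
    predict_out = fit_linear(hidden(X), y)
    return lambda Z: predict_out(hidden(Z))

def out_of_fold(fit, X, y, k=5):
    """Predict each training row with a model that was fitted without that row's fold."""
    oof = np.empty(len(X))
    for held_out in np.array_split(np.arange(len(X)), k):
        train = np.setdiff1d(np.arange(len(X)), held_out)
        oof[held_out] = fit(X[train], y[train])(X[held_out])
    return oof

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Synthetic, hypothetical data: 5 "molecular features" and a mildly nonlinear target.
X = rng.normal(size=(400, 5))
y = 2.0 * X[:, 0] - X[:, 1] + np.sin(2.0 * X[:, 2]) + 0.1 * rng.normal(size=400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

base_fitters = {"mlr": fit_linear, "elm": fit_elm}

# Level 1: out-of-fold predictions of each base learner become meta-features.
meta_train = np.column_stack(
    [out_of_fold(fit, X_train, y_train) for fit in base_fitters.values()]
)
# Level 2: a linear meta-learner combines the base predictions.
meta_model = fit_linear(meta_train, y_train)

# Refit the base learners on the full training set before predicting on test data.
base_models = {name: fit(X_train, y_train) for name, fit in base_fitters.items()}
meta_test = np.column_stack([model(X_test) for model in base_models.values()])
stack_pred = meta_model(meta_test)

scores = {name: rmse(model(X_test), y_test) for name, model in base_models.items()}
scores["stack"] = rmse(stack_pred, y_test)
print(scores)
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners were fitted on, keeps it from simply rewarding whichever base model overfits the training set most.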

Original language: English
Article number: 117219
Number of pages: 10
Journal: Chemical Engineering Science
Issue number: Part A
Publication status: Published - 2 Feb 2022


Keywords

  • Flash point
  • Machine learning
  • Molecular feature
  • Predictive modeling

ASJC Scopus subject areas

  • General Chemistry
  • General Chemical Engineering
  • Industrial and Manufacturing Engineering

