An accurate and interpretable deep learning model for environmental properties prediction using hybrid molecular representations

Jun Zhang, Qin Wang, Yang Su, Saimeng Jin, Jingzheng Ren, Mario Eden, Weifeng Shen

Research output: Journal article publication, Journal article, Academic research, peer-review

20 Citations (Scopus)


Lipophilicity, as quantified by the decimal logarithm of the octanol–water partition coefficient (log KOW), is an essential environmental property. Quantitative structure–property relationship (QSPR) studies based on deep neural networks (DNNs) have received increasing attention because of their excellent predictive performance. However, the black-box nature of DNNs limits their use in applications where interpretability is essential. Hence, this study aims to develop an accurate and interpretable deep neural network (AI-DNN) model for log KOW prediction. A hybrid molecular representation was employed to ensure the accuracy of the proposed AI-DNN model. This hybrid representation integrates the molecular representations learned by directed message passing neural networks (D-MPNNs) with the fixed molecule-level features of CDK descriptors, and can therefore capture both the local and the global features of the overall molecule. The performance analysis shows that the proposed QSPR model exhibits promising predictive accuracy and discriminative power for structural isomers and stereoisomers. Moreover, the Monte Carlo Tree Search (MCTS) approach was used to interpret the proposed AI-DNN model by identifying the molecular substructures that contribute to lipophilicity. This interpretability can be applied in critical fields where there is a high demand for interpretable deep networks, such as green solvent design and drug discovery.
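The hybrid representation described in the abstract can be illustrated with a minimal sketch: a learned, graph-derived vector is concatenated with a fixed molecule-level descriptor vector before prediction. Everything below is an assumption made for illustration and is not the authors' implementation. The function names are hypothetical, the message passing is a simplified atom-level scheme rather than the directed-bond D-MPNN, the descriptor values are arbitrary placeholders rather than CDK output, and a single untrained linear layer stands in for the feed-forward network.

```python
import math

def toy_message_passing(atom_feats, neighbors, steps=2):
    """Simplified atom-level message passing (NOT the directed-bond D-MPNN).

    atom_feats: one feature list per atom.
    neighbors:  for each atom, the list of indices of its bonded atoms.
    """
    h = [[float(x) for x in feats] for feats in atom_feats]
    for _ in range(steps):
        new_h = []
        for i, own in enumerate(h):
            # Message = element-wise sum of the neighbouring atoms' states.
            msg = [sum(h[j][k] for j in neighbors[i]) for k in range(len(own))]
            new_h.append([math.tanh(o + m) for o, m in zip(own, msg)])
        h = new_h
    # Sum readout: pool atom states into one molecule-level vector.
    return [sum(column) for column in zip(*h)]

def hybrid_representation(atom_feats, neighbors, descriptors):
    """Concatenate the graph-derived vector with fixed molecule-level features."""
    learned = toy_message_passing(atom_feats, neighbors)
    return learned + [float(d) for d in descriptors]

def predict(hybrid, weights, bias=0.0):
    """One linear layer standing in for the feed-forward predictor."""
    return sum(w * x for w, x in zip(weights, hybrid)) + bias

# Hypothetical three-atom chain with two features per atom.
atom_feats = [[1, 0], [1, 0], [0, 1]]
neighbors = [[1], [0, 2], [1]]
descriptors = [0.5, 1.0]  # arbitrary placeholders, not real descriptor values

hybrid = hybrid_representation(atom_feats, neighbors, descriptors)
weights = [0.1, -0.2, 0.3, 0.4]  # untrained, illustrative weights
y = predict(hybrid, weights)
```

The first two entries of `hybrid` come from the graph (local structure) and the last two are the descriptor values passed through unchanged (global features), which is the sense in which the representation is hybrid.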
Original language: English
Article number: e17634
Number of pages: 13
Journal: AICHE Journal
Issue number: 6
Publication status: Published - Jun 2022


  • QSPR
  • deep learning network
  • interpretability
  • lipophilicity
  • message-passing neural network

ASJC Scopus subject areas

  • Biotechnology
  • Environmental Engineering
  • General Chemical Engineering

