On-site text classification and knowledge mining for large-scale projects construction by integrated intelligent approach

Dan Tian, Mingchao Li, Jonathan Shi, Yang Shen, Shuai Han

Research output: Journal article publication, Journal article, Academic research, peer-review

42 Citations (Scopus)


A large-scale project produces a large volume of text data during construction, commonly archived as various management reports. Having the right information at the right time can help the project team understand the project status and manage the construction process more efficiently. However, this text information is presented in unstructured or semi-structured formats, and extracting useful information from such a large text warehouse is a challenge. A manual process is costly and often cannot deliver the right information to the right person at the right time. This research proposes an integrated intelligent approach based on natural language processing (NLP) technology, which involves three main stages. In the first stage, a text classification model based on a Convolutional Neural Network (CNN) is developed to classify construction on-site reports by analyzing and extracting report text features. In the second stage, the classified report texts are analyzed with term frequency-inverse document frequency (TF-IDF) improved by mutual information to identify and mine construction knowledge. In the third stage, a relation network based on the co-occurrence matrix of the knowledge is presented for visualization and better understanding of the construction on-site information. Actual construction reports are used to verify the feasibility of this approach. The study provides a new approach for handling construction on-site text data, which can enhance management efficiency and support practical knowledge discovery for project management.
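As a rough illustration of the second and third stages described in the abstract, the following minimal sketch computes plain TF-IDF weights over a few tokenized reports, keeps the top-weighted terms as keywords, and counts keyword co-occurrences that could serve as edges of a relation network. It is not the authors' implementation: the toy reports, the function names (`tf_idf`, `top_terms`, `cooccurrence`), and the choice of `k` are all assumptions made for illustration, and the mutual-information improvement and the CNN classifier are omitted because the abstract does not specify them.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy reports, already tokenized; not data from the study.
reports = [
    ["concrete", "pouring", "dam", "delay"],
    ["concrete", "curing", "dam"],
    ["excavation", "tunnel", "delay"],
]

def tf_idf(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

def top_terms(weights, k):
    """Pick the k highest-weighted terms of each document as keywords."""
    # Ties are broken alphabetically so the result is deterministic.
    return [sorted(w, key=lambda t: (-w[t], t))[:k] for w in weights]

def cooccurrence(keyword_lists):
    """Count how often each unordered keyword pair shares a document."""
    pairs = Counter()
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs

weights = tf_idf(reports)
keywords = top_terms(weights, k=3)
edges = cooccurrence(keywords)
```

Here `edges` maps each alphabetically ordered keyword pair to the number of reports in which both appear; pairs with higher counts would be drawn as stronger links in a relation network.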

Original language: English
Article number: 101355
Journal: Advanced Engineering Informatics
Publication status: Published - Aug 2021


Keywords

  • CNN
  • Knowledge mining
  • Large-scale projects construction
  • Mutual information
  • Text classification
  • TF-IDF

ASJC Scopus subject areas

  • Information Systems
  • Artificial Intelligence

