Machine-learning paradigms for selecting ecologically significant input variables

Nitin Muttil, Kwok Wing Chau

Research output: Journal article publication, Journal article, Academic research, peer-review

146 Citations (Scopus)

Abstract

Harmful algal blooms, which are now considered a serious environmental problem, occur in coastal waters in many parts of the world. They cause acute ecological damage and ensuing economic losses due to fish kills and shellfish poisoning, as well as public health threats posed by toxic blooms. Recently, data-driven models including machine-learning (ML) techniques have been employed to mimic the dynamics of algal blooms. One of the most important steps in the application of an ML technique is the selection of significant model input variables. In the present paper, we use two extensively used ML techniques, artificial neural networks (ANN) and genetic programming (GP), for selecting the significant input variables. The efficacy of these techniques is first demonstrated on a test problem with known dependence, and they are then applied to a real-world case study of water quality data from Tolo Harbour, Hong Kong. These ML techniques overcome some of the limitations of the currently used techniques for input variable selection, a review of which is also presented. The interpretation of the weights of the trained ANN and of the GP-evolved equations demonstrates their ability to identify the ecologically significant variables precisely. The significant variables suggested by the ML techniques also indicate that chlorophyll-a (Chl-a) itself is the most significant input in predicting the algal blooms, suggesting an auto-regressive nature or persistence in the algal bloom dynamics, which may be related to the long flushing time in the semi-enclosed coastal waters. The study also confirms the previous understanding that algal blooms in the coastal waters of Hong Kong often occur with a life cycle of the order of 1-2 weeks.
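The idea of reading input significance from trained weights on a test problem with known dependence can be illustrated with a minimal sketch. This is not the authors' ANN or GP setup: it assumes a single linear neuron trained by batch gradient descent, a hypothetical synthetic target y = 2*x0 - x1 + noise in which x2 is deliberately irrelevant, and helper names (`make_test_problem`, `train_linear_neuron`, `rank_inputs`) invented here for illustration.

```python
import random


def make_test_problem(n=200, seed=0):
    """Synthetic data with a known dependence (an assumption for this sketch):
    y = 2*x0 - 1*x1 + small noise, while x2 has no influence on y."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(3)]
        X.append(x)
        y.append(2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.05))
    return X, y


def train_linear_neuron(X, y, lr=0.1, epochs=500):
    """Fit a single linear neuron (weights w, bias b) by batch gradient
    descent on the mean squared error."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_inputs(w):
    """Order input indices from most to least significant by |weight|."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


X, y = make_test_problem()
w, b = train_linear_neuron(X, y)
ranking = rank_inputs(w)
print("weights:", [round(wj, 3) for wj in w])
print("inputs ranked by |weight|:", ranking)
```

Comparing weight magnitudes in this way is only meaningful when the inputs share a comparable scale, as they do here. For a network with hidden layers, the contributions of each input would have to be traced through all layers rather than read from a single weight vector.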
Original language: English
Pages (from-to): 735-744
Number of pages: 10
Journal: Engineering Applications of Artificial Intelligence
Volume: 20
Issue number: 6
Publication status: Published - 1 Sep 2007

Keywords

  • Artificial neural networks
  • Data-driven models
  • Genetic programming
  • Harmful algal blooms
  • Hong Kong
  • Machine-learning techniques
  • Red tides
  • Tolo Harbour
  • Water quality modelling

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering
