Modelling of diesel engine performance using advanced machine learning methods under scarce and exponential data set

Ka In Wong, Pak Kin Wong, Chun Shun Cheung, Chi Man Vong

Research output: Journal article publication › Journal article › Academic research › peer-review

37 Citations (Scopus)


Traditional methods for creating diesel engine models include analytical methods, such as multi-zone models, and intelligence-based methods, such as artificial neural network (ANN) models. However, analytical models require excessive assumptions, while ANN models have drawbacks such as a tendency to overfit and difficulty in determining the optimal network structure. In this paper, several emerging machine learning techniques, namely least squares support vector machine (LS-SVM), relevance vector machine (RVM), basic extreme learning machine (ELM) and kernel based ELM, are newly applied to the modelling of diesel engine performance. Experiments were carried out to collect sample data for model training and verification. Because of limited experimental conditions, only 24 sample data sets were acquired, resulting in data scarcity. Six-fold cross-validation is therefore adopted to address this issue. Some of the sample data also suffer from data exponentiality, where the engine performance output grows exponentially with engine speed and engine torque. This seriously degrades the prediction accuracy, so a logarithmic transformation of the dependent variables is used to pre-process the data. In addition, a hybrid of leave-one-out cross-validation and Bayesian inference is proposed, for the first time, for selecting the hyperparameters of kernel based ELM. The advanced machine learning techniques are compared with each other and with two traditional types of ANN models, namely back propagation neural network (BPNN) and radial basis function neural network (RBFNN). The models are evaluated on time complexity, space complexity and prediction accuracy. The results show that kernel based ELM with the logarithmic transformation and hybrid inference is far better than basic ELM, LS-SVM, RVM, BPNN and RBFNN in terms of prediction accuracy and training time.
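The pipeline described in the abstract (log-transform the target, fit a kernel based ELM, score it with six-fold cross-validation on 24 samples) can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the values of `C` and `gamma`, the mean absolute percentage error metric, the function names and the synthetic data are all assumptions made for illustration, and the hybrid leave-one-out/Bayesian hyperparameter selection is not reproduced. The fit uses the standard closed-form kernel ELM solution, beta = (I/C + Omega)^(-1) t, where Omega is the kernel matrix of the training inputs.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kelm_fit(X, t, C, gamma):
    """Closed-form kernel ELM weights: solve (I/C + Omega) beta = t."""
    omega = rbf_kernel(X, X, gamma)
    return np.linalg.solve(omega + np.eye(len(X)) / C, t)


def kelm_predict(X_train, beta, X_new, gamma):
    """Predict targets for X_new from the fitted weights."""
    return rbf_kernel(X_new, X_train, gamma) @ beta


def six_fold_log_cv(X, y, C=100.0, gamma=1.0, n_folds=6, seed=0):
    """Mean absolute percentage error over the folds, fitting on log(y)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        # Train on the log of the output so exponential growth becomes linear.
        beta = kelm_fit(X[train_idx], np.log(y[train_idx]), C, gamma)
        log_pred = kelm_predict(X[train_idx], beta, X[test_idx], gamma)
        # Map predictions back to the original scale before scoring.
        pred = np.exp(log_pred)
        errors.append(np.mean(np.abs(pred - y[test_idx]) / y[test_idx]))
    return float(np.mean(errors))


# Synthetic stand-in for the 24 engine samples (assumed, not the paper's data):
# two inputs scaled to [0, 1] and an output that grows exponentially with both.
data_rng = np.random.default_rng(1)
X = data_rng.uniform(0.0, 1.0, size=(24, 2))
y = np.exp(1.5 * X[:, 0] + 1.0 * X[:, 1])

mape = six_fold_log_cv(X, y)
print(f"six-fold MAPE on the original scale: {mape:.4f}")
```

With 24 samples and six folds, each model is trained on 20 samples and tested on the remaining 4, so every sample is used for verification exactly once. The output must be strictly positive for the logarithm to be defined.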
Original language: English
Pages (from-to): 4428-4441
Number of pages: 14
Journal: Applied Soft Computing Journal
Issue number: 11
Publication status: Published - 1 Jan 2013


Keywords

  • Data exponentiality
  • Data scarcity
  • Diesel engine modelling
  • Engine performance
  • Hybrid inference
  • Kernel based extreme learning machine

ASJC Scopus subject areas

  • Software

