An efficient approach for compound identification based on the frequency features of mass spectra

Zhan Li Sun, Kin Man Lam, Jun Zhang

Research output: Journal article publication > Journal article > Academic research > peer-review

1 Citation (Scopus)

Abstract

Similarity-measure-based spectrum matching is an effective approach to chemical compound identification. When both the query library and the reference library grow large, most existing spectrum-matching methods incur a heavy computational burden. In this paper, an effective and efficient compound-identification approach is proposed based on the frequency features of mass spectra. Considering the sparsity of mass spectra, a nonzero feature-selection strategy is proposed to decrease their feature dimensionality. To further improve efficiency, a correlation-based filtering strategy is presented that selects the most correlated reference spectra to create a reduced reference library. Based on the reduced features and the reduced reference library, frequency-feature-based composite similarity measures are computed to estimate the Chemical Abstracts Service (CAS) registry numbers of the mass spectra in a query library. Owing to the reduction in both the feature dimensionality and the reference library, the computation time of the proposed method is only about 6%-11% of that of the existing methods, while the identification performance remains competitive. Experimental results demonstrate the feasibility and efficiency of the proposed method.
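
The three stages named in the abstract (nonzero feature selection, correlation-based filtering, and a frequency-feature-based composite similarity) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: spectra are taken to be intensity vectors on a shared m/z grid, the nonzero selection keeps the bins where the query is nonzero, the filter keeps the `top_k` references by Pearson correlation, and the composite score is an equally weighted sum of the cosine similarity of the intensities and of the real parts of their discrete Fourier transforms. The names `identify`, `top_k`, and `weight` are hypothetical.

```python
import cmath
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pearson(a, b):
    """Pearson correlation, computed as the cosine of the mean-centred vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return cosine([x - ma for x in a], [y - mb for y in b])


def dft_real(x):
    """Real part of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(x)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(x)).real
        for k in range(n)
    ]


def identify(query, library, top_k=2, weight=0.5):
    """Return the label (e.g. a CAS number) of the best-matching reference.

    `library` maps labels to intensity vectors on the same m/z grid as `query`.
    """
    # Stage 1 (assumed rule): keep only the bins where the query is nonzero.
    keep = [i for i, v in enumerate(query) if v != 0]
    q = [query[i] for i in keep]
    reduced = {label: [ref[i] for i in keep] for label, ref in library.items()}

    # Stage 2: keep the top_k references most correlated with the query.
    ranked = sorted(reduced, key=lambda lb: pearson(q, reduced[lb]), reverse=True)
    candidates = ranked[:top_k]

    # Stage 3 (assumed form): weighted sum of intensity and DFT-real-part cosines.
    q_freq = dft_real(q)

    def composite(label):
        r = reduced[label]
        return weight * cosine(q, r) + (1 - weight) * cosine(q_freq, dft_real(r))

    return max(candidates, key=composite)
```

Calling `identify(query, library)` with a dictionary of reference spectra returns the label of the highest-scoring candidate. Because the composite score is computed only on the selected bins and only for the filtered candidates, the cost per query shrinks with both reductions, which is the effect the abstract attributes the speed-up to.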
Original language: English
Pages (from-to): 117-123
Number of pages: 7
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 142
Publication status: Published - 5 Mar 2015

Keywords

  • Discrete Fourier transform
  • Similarity measure
  • Spectrum matching

ASJC Scopus subject areas

  • Analytical Chemistry
  • Software
  • Computer Science Applications
  • Process Chemistry and Technology
  • Spectroscopy
