Efficient extraction of non-negative latent factors from high-dimensional and sparse matrices in industrial applications

Xin Luo, Mingsheng Shang, Shuai Li

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

18 Citations (Scopus)

Abstract

High-dimensional and sparse (HiDS) matrices are commonly encountered in many big data-related industrial applications like recommender systems. When acquiring useful patterns from them, non-negative matrix factorization (NMF) models have proven to be highly effective because of their fine representativeness of non-negative data. However, current NMF techniques suffer from a) inefficiency in addressing HiDS matrices, and b) constrained training schemes that lack flexibility, extensibility and adaptability. To address these issues, this work proposes to factorize industrial-size sparse matrices via a novel Inherently Non-negative Latent Factor (INLF) model. It connects the output factors and decision variables via a single-element-dependent sigmoid function, thereby removing the non-negativity constraints from its training process without impacting the solution accuracy. Hence, its training process is unconstrained, highly flexible and compatible with general learning schemes. Experimental results on five HiDS matrices generated by industrial applications indicate that INLF is able to acquire non-negative latent factors from them more efficiently than any existing method.
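The core idea in the abstract, mapping unconstrained decision variables to non-negative factors through an element-wise sigmoid, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `train_sigmoid_lf`, the plain stochastic gradient descent update, the squared-error loss, the absence of regularization, and all hyperparameters are assumptions made for illustration only. Because each sigmoid output lies in (0, 1), the sketch also assumes the observed values have been scaled into the range (0, rank).

```python
import math
import random


def sigmoid(z):
    """Element-wise map from an unconstrained real to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_sigmoid_lf(entries, n_rows, n_cols, rank=2, lr=0.1, epochs=500, seed=0):
    """Hypothetical sketch: fit non-negative factors on observed cells only.

    entries: list of (row, col, value) triples for the known cells of a
    sparse matrix; unknown cells are never visited.
    """
    rng = random.Random(seed)
    # Decision variables: unconstrained, free to take any real value.
    X = [[rng.uniform(-1, 1) for _ in range(rank)] for _ in range(n_rows)]
    Y = [[rng.uniform(-1, 1) for _ in range(rank)] for _ in range(n_cols)]

    for _ in range(epochs):
        for u, i, r in entries:
            # Output factors: sigmoid of the decision variables, hence > 0.
            p = [sigmoid(x) for x in X[u]]
            q = [sigmoid(y) for y in Y[i]]
            err = r - sum(pk * qk for pk, qk in zip(p, q))
            for k in range(rank):
                # Chain rule through the sigmoid: ds/dz = s * (1 - s).
                X[u][k] += lr * err * q[k] * p[k] * (1.0 - p[k])
                Y[i][k] += lr * err * p[k] * q[k] * (1.0 - q[k])

    # Non-negative latent factors recovered from the trained variables.
    P = [[sigmoid(x) for x in row] for row in X]
    Q = [[sigmoid(y) for y in row] for row in Y]
    return P, Q


if __name__ == "__main__":
    # Four known cells of a 3 x 2 matrix, already scaled into (0, 1).
    observed = [(0, 0, 0.9), (0, 1, 0.2), (1, 0, 0.3), (2, 1, 0.8)]
    P, Q = train_sigmoid_lf(observed, n_rows=3, n_cols=2)
    print(P)
    print(Q)
```

No projection or clipping step appears in the loop: the updates act on `X` and `Y` without constraints, and non-negativity of `P` and `Q` follows from the sigmoid alone. The per-epoch cost grows with the number of observed entries rather than with the full matrix size.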

Original language: English
Title of host publication: Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016
Editors: Francesco Bonchi, Xindong Wu, Ricardo Baeza-Yates, Josep Domingo-Ferrer, Zhi-Hua Zhou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 311-319
Number of pages: 9
ISBN (Electronic): 9781509054725
DOIs
Publication status: Published - 31 Jan 2017
Event: 16th IEEE International Conference on Data Mining, ICDM 2016 - Barcelona, Catalonia, Spain
Duration: 12 Dec 2016 to 15 Dec 2016

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 16th IEEE International Conference on Data Mining, ICDM 2016
Country: Spain
City: Barcelona, Catalonia
Period: 12/12/16 to 15/12/16

Keywords

  • Big data
  • High-dimensional and sparse matrices
  • Inherently non-negative
  • Latent factor
  • Missing-data estimation
  • Non-negative factorization
  • Unconstrained

ASJC Scopus subject areas

  • Engineering (all)
