Data Quality-Aware Mixed-Precision Quantization via Hybrid Reinforcement Learning

Yingchun Wang, Song Guo, Jingcai Guo, Yuanhong Zhang, Weizhan Zhang, Qinghua Zheng, Jie Zhang

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

Mixed-precision quantization methods mostly predetermine the model's bit-width settings before actual training because the bit-width sampling process is non-differentiable, which yields suboptimal performance. Worse still, the conventional static, quality-consistent training setting, i.e., the assumption that all data are of the same quality across training and inference, overlooks data quality changes in real-world applications, which may lead to poor robustness of the quantized models. In this article, we propose a novel data quality-aware mixed-precision quantization framework, dubbed DQMQ, to dynamically adapt quantization bit-widths to different data qualities. The adaptation is based on a bit-width decision policy that can be learned jointly with the quantization training. Concretely, DQMQ is modeled as a hybrid reinforcement learning (RL) task that combines model-based policy optimization with supervised quantization training. By relaxing the discrete bit-width sampling to a continuous probability distribution that is encoded with few learnable parameters, DQMQ is differentiable and can be directly optimized end-to-end with a hybrid optimization target that considers both task performance and quantization benefits. Trained on mixed-quality image datasets, DQMQ can implicitly select the most suitable bit-width for each layer when facing uneven input qualities. Extensive experiments on various benchmark datasets and networks demonstrate the superiority of DQMQ over existing fixed/mixed-precision quantization methods.
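
The relaxation described in the abstract can be illustrated with a minimal sketch. This is not the DQMQ implementation: it assumes a generic softmax relaxation over a hypothetical candidate set of bit-widths (2, 4, 8), a simple symmetric uniform quantizer, and a hypothetical trade-off weight `lam`; none of these names or values come from the article. The sketch only shows how a discrete bit-width choice can become a smooth function of a few learnable logits, and how a hybrid target could combine a task loss with a bit-width cost.

```python
import numpy as np

# Hypothetical candidate bit-widths; the article does not specify them here.
CANDIDATE_BITS = np.array([2, 4, 8])


def uniform_quantize(w, bits):
    """Symmetric uniform quantization of w to the given bit-width."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = np.max(np.abs(w))
    scale = max_abs / levels if max_abs > 0 else 1.0
    return np.round(w / scale) * scale


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()


def relaxed_quantize(w, logits):
    """Replace the discrete bit-width choice with a probability-weighted mix.

    Returns the mixed (soft-quantized) weights and the expected bit-width,
    both of which vary smoothly with the logits.
    """
    probs = softmax(logits)
    candidates = np.stack([uniform_quantize(w, int(b)) for b in CANDIDATE_BITS])
    mixed = np.tensordot(probs, candidates, axes=1)
    expected_bits = float(probs @ CANDIDATE_BITS)
    return mixed, expected_bits


def hybrid_objective(task_loss, expected_bits, lam=0.01):
    """Assumed hybrid target: task loss plus a penalty on expected bit-width."""
    return task_loss + lam * expected_bits


if __name__ == "__main__":
    weights = np.array([0.5, -1.0, 0.25, 0.8])
    layer_logits = np.zeros(len(CANDIDATE_BITS))  # uniform preference
    mixed, bits = relaxed_quantize(weights, layer_logits)
    print(mixed, bits, hybrid_objective(1.0, bits))
```

In a full training loop the logits would be updated by gradient descent in an automatic-differentiation framework, and in a data quality-aware setting they would presumably be produced per input rather than stored as fixed per-layer values; both points are assumptions beyond what the abstract states.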

Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication status: Published - Jun 2024

Keywords

  • Accuracy
  • Bit-width decision
  • Computational modeling
  • Data integrity
  • Data models
  • data quality
  • model compression
  • network quantization
  • Optimization
  • Quantization (signal)
  • reinforcement learning (RL)
  • Training

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence
