Document image template matching based on component block list

Hanchuan Peng, Fuhui Long, Zheru Chi, Wan Chi Siu

Research output: Journal article publication › Journal article › Academic research › peer-review

15 Citations (Scopus)

Abstract

Document image matching is a key technique for document image registration and retrieval. In this paper, a new matching method based on the document component block list (CBL) is proposed. A document image is first parsed into a number of component blocks, which are defined as non-adherent rectangular areas of substantial document content. These blocks are then organized as a list, on which several matching operations are defined. The template image that is most similar to the query document image is selected as the matching result. Our method makes effective use of both the local information of each page component block and the global information of the document page layout. We evaluate the method on a large-scale database of document template images. It achieves good matching accuracy and is robust to image distortion, filled-in text, and noise.
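The abstract does not specify the matching operations defined on the CBL, so the following is only a minimal sketch of the general pipeline it describes: blocks as rectangles, an ordered block list, and selection of the most similar template. The `Block` class, the top-to-bottom ordering, the intersection-over-union score, and the greedy one-to-one pairing are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Block:
    """Hypothetical component block: an axis-aligned rectangle on the page."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two blocks (assumed local similarity)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    inter = max(0.0, w) * max(0.0, h)
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0


def make_block_list(blocks: List[Block]) -> List[Block]:
    """Organize blocks as a list, ordered top-to-bottom then left-to-right."""
    return sorted(blocks, key=lambda b: (b.y0, b.x0))


def list_similarity(query: List[Block], template: List[Block]) -> float:
    """Greedy one-to-one pairing of blocks, scored by IoU.

    Dividing by the longer list length penalizes unmatched blocks, which
    is a crude stand-in for comparing the global page layout.
    """
    unused = list(template)
    total = 0.0
    for q in query:
        if not unused:
            break
        best = max(unused, key=lambda t: iou(q, t))
        score = iou(q, best)
        if score > 0:
            total += score
            unused.remove(best)
    return total / max(len(query), len(template), 1)


def best_template(query: List[Block], templates: Dict[str, List[Block]]) -> str:
    """Return the name of the template whose block list is most similar."""
    return max(templates, key=lambda name: list_similarity(query, templates[name]))
```

In this sketch a query page would be parsed into blocks by some separate layout-analysis step, passed through `make_block_list`, and then compared against each stored template list with `best_template`. A greedy pairing is used only to keep the example short; an optimal assignment or a sequence alignment over the ordered lists would be an equally plausible reading of "matching operations".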
Original language: English
Pages (from-to): 1033-1042
Number of pages: 10
Journal: Pattern Recognition Letters
Volume: 22
Issue number: 9
Publication status: Published - 1 Jul 2001

Keywords

  • Document database
  • Document processing
  • Image matching
  • Image registration
  • Image retrieval

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
