Abstract
Document image matching is a key technique for document image registration and retrieval. This paper proposes a new matching method based on the document component block list (CBL). A document image is first parsed into a number of component blocks, defined as non-adherent rectangular areas of substantial document content. These blocks are then organized as a list, on which several matching operations are defined. The template image most similar to the query document image is selected as the matching result. The method makes effective use of both the local information of each page component block and the global information of the document page layout. We evaluate the method on a large-scale database of document template images. It shows good matching accuracy and good robustness to image distortion, filled-in text, and noise.
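The pipeline summarized in the abstract (blocks, a block list, a similarity over lists, selection of the best template) can be illustrated with a minimal sketch. The abstract does not specify the paper's matching operations, so everything here is an assumption made for illustration: the `Block` type, the use of intersection-over-union as the local block similarity, the greedy one-to-one pairing, and the normalization by the longer list are hypothetical stand-ins, not the authors' CBL operations. Block extraction from the image is also out of scope; the blocks are taken as given in page-normalized coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical component block: an axis-aligned rectangle in
    page-normalized coordinates (0..1), so pages of different sizes compare."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def iou(a: Block, b: Block) -> float:
    """Local similarity of two blocks (assumed here: intersection over union)."""
    inter_w = max(0.0, min(a.x1, b.x1) - max(a.x0, b.x0))
    inter_h = max(0.0, min(a.y1, b.y1) - max(a.y0, b.y0))
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def cbl_similarity(query: List[Block], template: List[Block]) -> float:
    """Global similarity of two block lists.

    Pairs blocks greedily, best overlap first, each block used at most once,
    then divides by the longer list length so unmatched blocks lower the score.
    """
    if not query or not template:
        return 0.0
    candidates = sorted(
        ((iou(q, t), qi, ti)
         for qi, q in enumerate(query)
         for ti, t in enumerate(template)),
        reverse=True,
    )
    used_q, used_t = set(), set()
    total = 0.0
    for score, qi, ti in candidates:
        if score == 0.0:
            break  # remaining pairs do not overlap at all
        if qi in used_q or ti in used_t:
            continue
        used_q.add(qi)
        used_t.add(ti)
        total += score
    return total / max(len(query), len(template))


def best_template(query: List[Block],
                  templates: Dict[str, List[Block]]) -> Tuple[str, float]:
    """Return the name and score of the most similar template.

    Raises ValueError if `templates` is empty.
    """
    return max(
        ((name, cbl_similarity(query, blocks)) for name, blocks in templates.items()),
        key=lambda pair: pair[1],
    )
```

In this sketch the per-block overlap plays the role of the local information and the one-to-one pairing across the whole list plays the role of the global layout information. A query would be matched with `best_template(query_blocks, {"invoice": [...], "letter": [...]})`, where the template names are placeholders.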
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 1033-1042 |
Number of pages | 10 |
Journal | Pattern Recognition Letters |
Volume | 22 |
Issue number | 9 |
Publication status | Published - 1 Jul 2001 |
Keywords
- Document database
- Document processing
- Image matching
- Image registration
- Image retrieval
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence