Page segmentation and content classification for automatic document image processing

S. K. Yip, Zheru Chi

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

7 Citations (Scopus)

Abstract

Page segmentation and image content classification are important steps in automatic document image processing, including mixed-type document image compression, form and check reading, and mail sorting. In this paper, we first propose an enhanced page segmentation approach based on background thinning. We then present a hierarchical approach for classifying each segmented sub-image into one of two categories: text or picture. The approach combines a cross-correlation method, the Kolmogorov complexity measure, and a neural network classifier in order to achieve both efficiency and high accuracy. Our approach has been tested on a number of mixed-type document images with good results.
Original language: English
Title of host publication: Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2001
Pages: 279-282
Number of pages: 4
Publication status: Published - 1 Dec 2001
Event: 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2001 - Hong Kong, Hong Kong
Duration: 2 May 2001 to 4 May 2001

Conference

Conference: 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2001
Country/Territory: Hong Kong
City: Hong Kong
Period: 2/05/01 to 4/05/01

ASJC Scopus subject areas

  • General Computer Science
