Segmentation and classification of mixed text/graphics/image documents

Kuo Chin Fan, Chi Hwa Liu, Yuan Kai Wang

Research output: Contribution to journal, Article, peer-review

50 Scopus citations

Abstract

This paper presents a feature-based document analysis system that uses domain knowledge to segment and classify mixed text/graphics/image documents. The approach first applies a run-length smearing operation, followed by a stripe merging procedure, to segment the blocks embedded in a document. Classification is then performed using domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge proves effective in speeding up segmentation and reducing classification error. The experimental study demonstrates the feasibility of the technique for segmenting and classifying mixed text/graphics/image documents.
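
The run-length smearing step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact smearing rule or thresholds, so the version below is the commonly described horizontal variant: short background runs lying between two ink pixels are filled in. The function names `smear_row` and `smear_image`, the `threshold` parameter, and the choice to leave border runs untouched are all assumptions made for illustration, not the authors' implementation.

```python
def smear_row(row, threshold):
    """Horizontally smear one binary row (1 = ink, 0 = background).

    Background runs of length <= threshold that lie between two ink
    pixels are filled with ink. Runs touching the row border are left
    unchanged (an assumed convention; published variants differ).
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None:
                gap = i - last_ink - 1  # background pixels between the two ink pixels
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def smear_image(image, threshold):
    """Apply horizontal smearing to every row of a binary image (list of rows)."""
    return [smear_row(row, threshold) for row in image]


if __name__ == "__main__":
    row = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
    # The 2-pixel gap is bridged; the 4-pixel gap and the trailing run are not.
    print(smear_row(row, threshold=2))
```

With a suitable threshold, characters on a text line fuse into solid runs, which is what lets a later step treat each line or block as a single region. The stripe merging procedure and the classification rules are specific to the paper and are not sketched here.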

Original language: English
Pages (from-to): 1201-1209
Number of pages: 9
Journal: Pattern Recognition Letters
Volume: 15
Issue number: 12
State: Published - Dec 1994

Keywords

  • Block classification
  • Connectivity histogram
  • Document segmentation
  • Projection feature
