Automated system for text detection in individual video images

Citation of Original Publication

Du, Yingzi, Chein-I Chang, and Paul D. Thouin. “Automated System for Text Detection in Individual Video Images.” Journal of Electronic Imaging 12, no. 3 (July 2003): 410–22. https://doi.org/10.1117/1.1584050.

Rights

This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
Public Domain

Abstract

Text detection in video images is a challenging research problem because of poor spatial resolution and complex backgrounds that may contain a wide variety of colors. An automated system for text detection in video images is presented. It uses four modules to carry out a series of processes that extract text regions from video images. The first module, called the multistage pulse code modulation (MPCM) module, locates potential text regions in color video images. It converts a video image to a coded image, with each pixel assigned a priority code ranging from 7 down to 0, and then produces a binary thresholded image that segments potential text regions from the background. The second module, called the text region detection module, applies a sequence of spatial filters to remove noisy regions and eliminate regions that are unlikely to contain text. The third module, called the text box finding module, merges the remaining text regions into boxes that are likely to contain text. Finally, the fourth module, called the optical character recognition (OCR) module, eliminates text boxes that produce no OCR output. An extensive set of experiments demonstrates that the proposed system is effective in detecting text in a wide variety of video images.
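The abstract does not give implementation details for the MPCM coding, the spatial filters, or the OCR engine, so the Python sketch below only mirrors the four-module structure with simple stand-ins: an 8-level gradient-magnitude quantization in place of MPCM priority coding, size and aspect-ratio tests in place of the paper's spatial filters, a nearest-neighbor box merge, and a placeholder OCR callback. All function names and parameters (priority_code_image, min_code, gap, and so on) are hypothetical and are not taken from the paper.

"""Hypothetical sketch of the four-module pipeline outlined in the abstract.
The MPCM coding, the exact spatial filters, and the OCR engine are not
specified here, so simple stand-ins are used throughout."""
import numpy as np
from scipy import ndimage


def priority_code_image(gray, levels=8):
    """Module 1 stand-in: assign each pixel a code from levels-1 down to 0.
    A gradient-magnitude quantization is used as a proxy for MPCM coding."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    # Quantize gradient magnitude into `levels` bins: strong edges -> high code.
    edges = np.unique(np.quantile(mag, np.linspace(0.0, 1.0, levels + 1)[1:-1]))
    return np.digitize(mag, edges)          # values in 0 .. levels-1


def threshold_coded_image(coded, min_code=6):
    """Module 1, second stage: keep only high-priority (likely text) pixels."""
    return coded >= min_code


def detect_text_regions(binary, min_area=50, max_aspect=20.0):
    """Module 2 stand-in: drop noisy regions and geometrically unlikely text."""
    labeled, _ = ndimage.label(binary)
    boxes = []
    for sl in ndimage.find_objects(labeled):
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        if h * w < min_area:
            continue                          # too small: treat as noise
        if max(h, w) / max(min(h, w), 1) > max_aspect:
            continue                          # extreme aspect ratio: unlikely text
        boxes.append((sl[0].start, sl[1].start, sl[0].stop, sl[1].stop))
    return boxes


def merge_text_boxes(boxes, gap=10):
    """Module 3 stand-in: merge nearby regions into larger candidate text boxes."""
    merged = []
    for box in sorted(boxes):
        for i, m in enumerate(merged):
            # Merge when the two boxes overlap or lie within `gap` pixels.
            if not (box[0] > m[2] + gap or box[2] < m[0] - gap or
                    box[1] > m[3] + gap or box[3] < m[1] - gap):
                merged[i] = (min(m[0], box[0]), min(m[1], box[1]),
                             max(m[2], box[2]), max(m[3], box[3]))
                break
        else:
            merged.append(box)
    return merged


def verify_with_ocr(image, boxes, ocr):
    """Module 4: keep only boxes for which the OCR callback returns any text."""
    return [(r0, c0, r1, c1) for (r0, c0, r1, c1) in boxes
            if ocr(image[r0:r1, c0:c1]).strip()]


def detect_text(gray, ocr=lambda patch: "text"):
    """Run the four stand-in modules in sequence on a grayscale frame."""
    coded = priority_code_image(gray)
    binary = threshold_coded_image(coded)
    boxes = merge_text_boxes(detect_text_regions(binary))
    return verify_with_ocr(gray, boxes, ocr)

In an actual system the `ocr` argument would wrap a real OCR engine rather than the placeholder lambda above, and the threshold, area, and merging parameters would need tuning to the video source.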