Learning features for predicting OCR accuracy

Title: Learning features for predicting OCR accuracy
Publication Type: Conference Papers
Year of Publication: 2012
Authors: Ye P, Doermann D
Conference Name: International Conference on Pattern Recognition (ICPR)

In this paper, we present a new method for assessing the quality of degraded document images using unsupervised feature learning. The goal is to build a computational model that automatically predicts the OCR accuracy of a degraded document image without a reference image. Current approaches to this problem typically rely on hand-crafted features whose design is based on heuristic rules that may not generalize. In contrast, we explore an unsupervised feature learning framework to learn effective and efficient features for predicting OCR accuracy. Our experimental results on a set of historical newspaper images show that the proposed method outperforms a baseline method that combines features from previous works.
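
To make the general idea concrete, here is a minimal sketch of one way such a pipeline could look. It is not the method from the paper, whose details the abstract does not give. It assumes a k-means codebook learned from raw image patches, a histogram of nearest-codeword assignments as the image-level feature, and a 1-nearest-neighbour regressor as a stand-in for whatever predictor is actually used. All function names, the toy images, and the accuracy values are hypothetical.

```python
import random

def extract_patches(image, size=2):
    """Collect every size x size patch (flattened) from a 2D list of pixels."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(size) for j in range(size)]
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    ]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; the k centroids act as the learned codebook."""
    sampler = random.Random(seed)
    centroids = [list(p) for p in sampler.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def encode(image, codebook, size=2):
    """Image-level feature: normalized histogram of nearest-codeword counts."""
    patches = extract_patches(image, size)
    hist = [0.0] * len(codebook)
    for p in patches:
        nearest = min(range(len(codebook)), key=lambda i: sq_dist(p, codebook[i]))
        hist[nearest] += 1.0
    return [count / len(patches) for count in hist]

def predict(feature, train_features, train_scores):
    """1-nearest-neighbour regression (assumed stand-in for any regressor)."""
    best = min(range(len(train_features)),
               key=lambda i: sq_dist(feature, train_features[i]))
    return train_scores[best]

# Toy data: one blocky "clean" image and one random "noisy" image.
rng = random.Random(0)
clean = [[float((r // 2 + c // 2) % 2) for c in range(6)] for r in range(6)]
noisy = [[rng.random() for _ in range(6)] for _ in range(6)]
images = [clean, noisy]
scores = [0.95, 0.60]  # made-up OCR accuracies, for illustration only

# Unsupervised step: learn the codebook from patches, with no labels involved.
all_patches = [p for img in images for p in extract_patches(img)]
codebook = kmeans(all_patches, k=4)

# Supervised step: map learned features to the known accuracies.
features = [encode(img, codebook) for img in images]
print(predict(encode(clean, codebook), features, scores))
```

The point of the split is that the codebook is learned without any OCR labels, so no hand-designed quality heuristics are needed; labels enter only when the learned features are mapped to accuracy values.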