Simultaneous Optimisation of Image Quality Improvement and Text Content Extraction from Scanned Documents
- Resource Type
- Conference
- Authors
- Mujumdar, Shashank; Gupta, Nitin; Jain, Abhinav; Burdick, Douglas
- Source
- 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1169-1174, Sep 2019
- Subject
- Computing and Processing
Optical character recognition software
Image resolution
Image quality
Measurement
Training
Standards
Optimization
Optical Character Recognition
Image Quality Improvement
Text Extraction
Convolutional Neural Networks
Image Super Resolution
- Language
- ISSN
- 2379-2140
Convolutional neural networks have achieved breakthrough performance on single image super resolution (SISR) for natural images. These state-of-the-art (SOA) networks have been adapted to single text image super resolution and have been shown to boost optical character recognition (OCR) performance. However, these approaches rely on variations of the standard mean squared error (MSE) loss to train the SR network to improve text image quality, which does not guarantee optimal OCR performance. In this paper, we propose to incorporate OCR performance into the loss function during network training. This yields high-resolution text images whose OCR performance is comparable to that of the ground truth high-resolution text images and surpasses that of the SOA baselines. We define novel, intuitive metrics to capture the improvement in OCR performance and provide extensive experiments that qualitatively and quantitatively assess the improvement of our proposed approach over the SOA baselines on the standard UNLV dataset.
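The idea of folding OCR performance into the training loss can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything below is an assumption made for illustration: the OCR term is taken to be a character error rate (edit distance divided by ground-truth length), the two terms are combined as a weighted sum, and the names `combined_loss` and `ocr_weight` are hypothetical. A character error rate is also not differentiable, so a real training setup would likely need a differentiable surrogate or another way to feed the OCR signal back to the network.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two flattened images of equal size."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def char_error_rate(ocr_text: str, truth_text: str) -> float:
    """Edit distance normalised by ground-truth length (assumed OCR metric)."""
    if not truth_text:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(ocr_text, truth_text) / len(truth_text)


def combined_loss(sr_pixels: Sequence[float],
                  hr_pixels: Sequence[float],
                  ocr_text: str,
                  truth_text: str,
                  ocr_weight: float = 0.5) -> float:
    """Hypothetical weighted sum of an image-quality term and an OCR term."""
    return mse(sr_pixels, hr_pixels) + ocr_weight * char_error_rate(ocr_text, truth_text)
```

Here `sr_pixels` stands for the super-resolved output, `hr_pixels` for the ground truth high-resolution image, and `ocr_text` for whatever an OCR engine reads from the super-resolved output. With `ocr_weight` set to zero the loss reduces to plain MSE, which is the baseline setting the abstract argues does not guarantee optimal OCR performance.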