Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks
- Resource Type
- Conference
- Authors
- Boillet, Melodie; Kermorvant, Christopher; Paquet, Thierry
- Source
- 2020 25th International Conference on Pattern Recognition (ICPR), pp. 2134-2141, Jan. 2021
- Subject
- Computing and Processing
Signal Processing and Analysis
Training
Measurement
Image segmentation
Analytical models
Text analysis
Layout
Neural networks
Document Layout Analysis
Historical document
Fully Convolutional Network
Deep Learning
- Language
In this paper, we introduce a fully convolutional network for the document layout analysis task. While state-of-the-art methods use models pre-trained on natural scene images, our method, Doc-UFCN, relies on a U-shaped model trained from scratch to detect objects in historical documents. We treat line segmentation, and more generally layout analysis, as a pixel-wise classification task, so our model outputs a pixel-level labeling of the input images. We show that Doc-UFCN outperforms state-of-the-art methods on various datasets and that components pre-trained on natural scene images are not required to reach good results. In addition, we show that pre-training on multiple document datasets can improve performance. We evaluate the models using various metrics to provide a fair and complete comparison between the methods.
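The pixel-wise classification framing in the abstract can be illustrated with a minimal sketch. This is not the Doc-UFCN implementation: the function names `label_pixels` and `iou`, the toy score maps, and the two-class setup are all hypothetical. Intersection over Union is used here only as one common pixel-level metric; this record does not say which metrics the paper reports.

```python
from typing import List

Grid = List[List[int]]


def label_pixels(probs: List[List[List[float]]]) -> Grid:
    """Assign each pixel the class with the highest score.

    probs[c][y][x] is the (hypothetical) network score of class c at pixel (y, x).
    """
    n_classes = len(probs)
    height, width = len(probs[0]), len(probs[0][0])
    return [
        [max(range(n_classes), key=lambda c: probs[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def iou(pred: Grid, truth: Grid, cls: int) -> float:
    """Intersection over Union for one class between two label maps."""
    inter = union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += (p == cls) and (t == cls)
            union += (p == cls) or (t == cls)
    # If the class appears in neither map, treat the maps as in full agreement.
    return inter / union if union else 1.0


# Toy 2x3 image with two assumed classes: 0 = background, 1 = text line.
probs = [
    [[0.9, 0.2, 0.1], [0.8, 0.7, 0.6]],  # background scores
    [[0.1, 0.8, 0.9], [0.2, 0.3, 0.4]],  # text-line scores
]
truth = [[0, 1, 1], [0, 0, 1]]

pred = label_pixels(probs)
print(pred)
print(iou(pred, truth, cls=1))
```

In a real pipeline the score maps would come from the network's final layer, and connected regions of the text-line class would then be grouped into individual line objects.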