Scene Text Recognition with Multi-Encoders

Resource Type: Conference
Authors: Yao Wang; Jong-Eun Ha
Source: 제어로봇시스템학회 국제학술대회 논문집 (Institute of Control, Robotics and Systems, International Conference Proceedings), 2022-11, 2022(11): 1615-1620
Subject: Scene text recognition; Transformer; Deep learning; Convolutional neural network
Language: Korean
ISSN: 2005-4750

Online Access

Full Text (DBPIA)

Abstract

Although text recognition has evolved significantly over the years, current models still struggle with irregular text images, which involve complex backgrounds, curved text, diverse fonts, distortions, and similar difficulties. CNN-based text recognition networks perform well but still face these challenges. Recently, transformer-based feature extractors have shown clear advantages for global feature extraction on images. This is especially useful for irregular text images, where self-attention can establish connections between all parts of the image and thereby reduce the influence of the irregular distribution of characters. Therefore, this paper proposes MESTR (Multi-Encoders Scene Text Recognition), which combines a CNN-based[1][2][6] feature extractor with a transformer-based feature extractor. MESTR extracts local and global features of text images at the same time and then integrates the global features into the local features. During training, we use CTC[6] as guide training in the decoder part, as a compensating training strategy for the attentional decoder. Experimental results demonstrate that the proposed MESTR achieves competitive results on all seven benchmarks. We also provide ablation experiments to show the effectiveness of the improved parts of the text recognition model.
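The abstract does not say how the CTC guide term is combined with the attentional decoder's loss. As a rough illustration only, the sketch below computes the CTC negative log-likelihood with the standard forward algorithm in pure Python and adds it to an attention loss as a weighted sum. The function names, the blank index `BLANK = 0`, and the weight `ctc_weight` are assumptions made for this example and are not taken from the paper.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol (not specified in the abstract)


def ctc_collapse(path):
    """Map a frame-level path to a label sequence: merge repeats, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != BLANK:
            out.append(symbol)
        prev = symbol
    return out


def ctc_neg_log_likelihood(probs, target):
    """Return -log P(target | probs) using the CTC forward algorithm.

    probs[t][k] is the probability of symbol k at frame t.
    target is a list of non-blank symbol indices.
    """
    # Extended target with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [BLANK]
    for label in target:
        ext += [label, BLANK]
    size = len(ext)

    # Frame 0: a path may start with the leading blank or the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


def joint_loss(attn_loss, ctc_loss, ctc_weight=0.5):
    """Hypothetical combination: attention loss plus a weighted CTC guide term."""
    return attn_loss + ctc_weight * ctc_loss


# Usage: 4 frames, 3 symbols (blank, 'a' = 1, 'b' = 2), target "ab".
frame_probs = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
    [0.2, 0.2, 0.6],
]
ctc_term = ctc_neg_log_likelihood(frame_probs, [1, 2])
print(joint_loss(attn_loss=1.2, ctc_loss=ctc_term))
```

In a real training loop the frame probabilities would come from a softmax over the encoder output and the attention loss from the decoder's cross-entropy; the value 1.2 above is a placeholder, not a measured result.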
