Title: |
End-to-end Handwritten Paragraph Text Recognition Using a Vertical Attention Network |
Authors: |
Coquenet, Denis; Chatelain, Clément; Paquet, Thierry |
Source: |
IEEE Transactions on Pattern Analysis and Machine Intelligence 2022 |
Publication Year: |
2020 |
Collection: |
Computer Science |
Subject Terms: |
Computer Science - Computer Vision and Pattern Recognition |
More Details: |
Unconstrained handwritten text recognition remains challenging for computer vision systems. Paragraph text recognition is traditionally achieved by two models: the first one for line segmentation and the second one for text line recognition. We propose a unified end-to-end model using hybrid attention to tackle this task. This model is designed to iteratively process a paragraph image line by line. It can be split into three modules. An encoder generates feature maps from the whole paragraph image. Then, an attention module recurrently generates a vertical weighted mask that focuses on the features of the current text line. In this way, it performs a kind of implicit line segmentation. For each set of text line features, a decoder module recognizes the associated character sequence, leading to the recognition of the whole paragraph. We achieve state-of-the-art character error rates at paragraph level on three popular datasets: 1.91% for RIMES, 4.45% for IAM and 3.59% for READ 2016. Our code and trained model weights are available at https://github.com/FactoDeepLearning/VerticalAttentionOCR. |
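The vertical weighted mask described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `collapse_rows`, the toy feature map, and the hand-picked row scores are all assumptions made for illustration. In the actual model the attention module produces the mask; here fixed scores stand in for it. The sketch only shows how a softmax over the height axis collapses a paragraph-level feature map into one line-level feature sequence per iteration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def collapse_rows(features, row_scores):
    """Weight each feature-map row by softmax(row_scores) and sum over height.

    features:   H x W x C nested lists (stand-in for the encoder output).
    row_scores: H floats, one unnormalized score per row (hypothetical input).
    Returns (line_features as W x C, weights as H floats).
    """
    weights = softmax(row_scores)
    width, channels = len(features[0]), len(features[0][0])
    line = [[0.0] * channels for _ in range(width)]
    for w_row, row in zip(weights, features):
        for x in range(width):
            for c in range(channels):
                line[x][c] += w_row * row[x][c]
    return line, weights


# Toy feature map: 4 rows, 3 columns, 2 channels; every cell in row r holds r.
features = [[[float(r)] * 2 for _ in range(3)] for r in range(4)]

# Hand-picked scores (assumption): one sharp peak per text line,
# first on row 0, then on row 2.
per_line_scores = [[20.0, 0.0, 0.0, 0.0], [0.0, 0.0, 20.0, 0.0]]

line_features = []
for scores in per_line_scores:
    line, weights = collapse_rows(features, scores)
    # A decoder would turn each W x C sequence into a character sequence.
    line_features.append(line)
```

Because each mask sums to 1 over the rows, every iteration yields a single width-by-channel sequence dominated by the rows of one text line, which is what makes the segmentation implicit rather than an explicit cropping step.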
Document Type: |
Working Paper |
DOI: |
10.1109/TPAMI.2022.3144899 |
Access URL: |
http://arxiv.org/abs/2012.03868 |
Accession Number: |
edsarx.2012.03868 |
Database: |
arXiv |