A curated list of resources for text detection/recognition (optical character recognition) with deep learning methods.

Overview

awesome-deep-text-detection-recognition

A curated list of awesome deep learning based papers on text detection and recognition.

Text Detection

  • Papers are sorted by published date.
  • IC is short for ICDAR.
  • Score is F1-score for localization task.
    • (L) stands for the score on the leaderboard.
    • (L) is provided when the leaderboard score differs from the one reported in the paper.
  • *CODE means official code and CODE(M) means that trained model is provided.
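As a rough illustration of the score listed here, the F1-score is the harmonic mean of precision and recall over detected text boxes. The sketch below assumes that detections have already been matched against ground-truth boxes; the function name and its arguments are hypothetical, and the matching rules themselves are defined by the official ICDAR evaluation protocols, which this sketch does not reproduce.

```python
def f1_score(num_matched: int, num_detected: int, num_ground_truth: int) -> float:
    """Harmonic mean of precision and recall, computed from match counts.

    num_matched: detections counted as correct (matching is assumed done elsewhere).
    num_detected: total boxes predicted by the detector.
    num_ground_truth: total annotated boxes.
    """
    if num_detected == 0 or num_ground_truth == 0:
        return 0.0
    precision = num_matched / num_detected
    recall = num_matched / num_ground_truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # 80 correct matches out of 100 detections and 100 ground-truth boxes.
    print(round(f1_score(80, 100, 100), 4))
```

With 80 correct matches out of 100 detections and 100 ground-truth boxes, precision and recall are both 0.8, so the F1-score is also about 0.8.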
Conf. Date Title IC13 IC15 Resources
'14-ECCV 14/10/07 Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees
'15-CVPR 15/06/01 Symmetry-based text line detection in natural scenes 0.8043 PRJ CODE
'16-TIP 15/10/12 Text-Attentional Convolutional Neural Networks for Scene Text Detection 0.8165
'15-ICCV 15/12/13 Text Flow: A Unified Text Detection System in Natural Scene Images 0.8025
'16-arXiv 16/03/31 Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network 0.86
'16-CVPR 16/04/14 Multi-Oriented Text Detection with Fully Convolutional Networks 0.83 0.54 *TORCH(M)
'16-CVPR 16/04/22 Synthetic Data for Text Localisation in Natural Images 0.847 (L)0.8359 CODE DB
'16-arXiv 16/06/29 Scene Text Detection Via Holistic, Multi-Channel Prediction 0.8433 0.6477
'16-ECCV 16/09/12 Detecting Text in Natural Image with Connectionist Text Proposal Network 0.8215 0.6085 *CAFFE(M) CAFFE TF(M) TF DEMO BLOG(CH)
'17-AAAI 16/11/21 TextBoxes: A fast text detector with a single deep neural network 0.85 (L)0.8767 *CAFFE(M) TF BLOG(KR)
'18-TM 17/03/03 Arbitrary-Oriented Scene Text Detection via Rotation Proposals 0.9125 0.8020 *CAFFE
'17-CVPR 17/03/04 Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection 0.7064
'17-CVPR 17/03/19 Detecting Oriented Text in Natural Images by Linking Segments 0.853 0.75 (L)0.7636 *TF(M) TF(M) SLIDE VIDEO
'17-arXiv 17/03/24 Deep Direct Regression for Multi-Oriented Scene Text Detection 0.86 0.81
'17-arXiv 17/04/03 Cascaded Segmentation-Detection Networks for Word-Level Text Spotting 0.86 0.71
'17-CVPR 17/04/11 EAST: An Efficient and Accurate Scene Text Detector 0.8072 (L)0.8038 TF(M) TF PYTORCH(M) PYTORCH DEMO KERAS(M) VIDEO
'17-ICIP 17/05/15 WordFence: Text Detection in Natural Images with Border Awareness 0.86
'17-arXiv 17/06/30 R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection 0.8773 0.8254 TF(M) CAFFE(M)
'17-CVPR 17/07/21 Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting In The Wild 0.85 0.63
'17-arXiv 17/08/17 Deep Scene Text Detection with Connected Component Proposals 0.919
'17-ICCV 17/08/22 WordSup: Exploiting Word Annotations for Character based Text Detection 0.9064 0.7816
'17-ICCV 17/09/01 Single Shot Text Detector with Regional Attention 0.8704 0.7691 *CAFFE(M) PYTORCH VIDEO
'17-arXiv 17/09/11 Fused Text Segmentation Networks for Multi-oriented Scene Text Detection 0.8414
'17-ICCV 17/10/13 WeText: Scene Text Detection under Weak Supervision 0.869 (L)0.8313
'17-ICCV 17/10/22 Self-organized Text Detection with Minimal Post-processing via Border Learning 0.84 *KERAS(M)
'17-ICDAR 17/11/11 Deep Residual Text Detection Network for Scene Text 0.9117 (L)0.8925
'18-AAAI 17/11/12 Feature Enhancement Network: A Refined Scene Text Detector 0.9161
'17-arXiv 17/11/30 ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene 0.759
'18-AAAI 18/01/04 PixelLink: Detecting Scene Text via Instance Segmentation 0.881 0.8519 *TF(M) TF
'18-CVPR 18/01/05 FOTS: Fast Oriented Text Spotting with a Unified Network 0.925 0.8984 PYTORCH PYTORCH VIDEO
'18-TIP 18/01/09 TextBoxes++: A Single-Shot Oriented Scene Text Detector 0.88 0.829 (L)0.8475 *CAFFE(M)
'18-CVPR 18/02/27 Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation 0.88 0.843 *PYTORCH(M)
'18-CVPR 18/03/09 An end-to-end TextSpotter with Explicit Alignment and Attention 0.9 0.87 *CAFFE(M)
'18-CVPR 18/03/14 Rotation-Sensitive Regression for Oriented Scene Text Detection 0.89 0.838 *CAFFE(M)
'18-arXiv 18/04/08 Detecting Multi-Oriented Text with Corner-based Region Proposals 0.876 0.845 *CAFFE(M)
'18-arXiv 18/04/24 An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches 0.92 0.86
'18-IJCAI 18/05/03 IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection 0.9047
'18-arXiv 18/06/07 Shape Robust Text Detection with Progressive Scale Expansion Network 0.8721 PRJ
'18-ECCV 18/07/04 TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes 0.826 PYTORCH
'18-ECCV 18/07/06 Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes 0.917 0.86
'18-ECCV 18/07/10 Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping 0.892
'19-AAAI 18/11/21 Scene Text Detection with Supervised Pyramid Context Network 0.921 0.872
'19-TIP 18/12/04 TextField: Learning A Deep Direction Field for Irregular Scene Text Detection 0.824 *CAFFE(M)
'19-CVPR 19/03/21 Towards Robust Curve Text Detection with Conditional Spatial Expansion
'19-CVPR 19/03/28 Shape Robust Text Detection with Progressive Scale Expansion Network 0.857 TF(M)
'19-CVPR 19/04/03 Character Region Awareness for Text Detection 0.952 0.869 *PYTORCH(M) VIDEO PYTORCH TF(M) KERAS BLOG_CH BLOG_KR BLOG_KR BLOG_KR
'19-CVPR 19/04/13 Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes 0.877
'19-CVPR 19/06/16 Learning Shape-Aware Embedding for Scene Text Detection 0.877
'19-CVPR 19/06/16 Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation 0.917 0.876
'19-ICCV 19/08/16 Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network 0.829
'19-ICCV 19/09/02 Geometry Normalization Networks for Accurate Scene Text Detection 0.8852
'20-AAAI 19/11/20 Real-time Scene Text Detection with Differentiable Binarization 0.847

Text Recognition

  • Papers are sorted by published date.
  • IC is short for ICDAR.
  • Score is word-accuracy for recognition task.
    • For results on the IC03, IC13, and IC15 datasets, papers evaluated on different numbers of samples,
      but we did not distinguish between them.
  • *CODE means official code and CODE(M) means that trained model is provided.
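As a rough illustration of the score listed here, word accuracy is the fraction of cropped word images whose predicted string exactly matches the ground-truth string. The sketch below is an assumption-laden simplification: the function name is hypothetical, and the case-insensitive default is only one common convention. Papers differ in case handling, character filtering, and lexicon use, so their numbers are not reproduced by this code.

```python
from typing import Sequence


def word_accuracy(
    predictions: Sequence[str],
    ground_truths: Sequence[str],
    case_sensitive: bool = False,
) -> float:
    """Fraction of predictions that exactly match their ground-truth word."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0
    correct = 0
    for pred, truth in zip(predictions, ground_truths):
        if not case_sensitive:
            # Assumed convention: compare after lowercasing both strings.
            pred, truth = pred.lower(), truth.lower()
        if pred == truth:
            correct += 1
    return correct / len(ground_truths)


if __name__ == "__main__":
    preds = ["hello", "world", "text"]
    truths = ["Hello", "word", "text"]
    print(word_accuracy(preds, truths))
    print(word_accuracy(preds, truths, case_sensitive=True))
```

In this example two of three words match when case is ignored, and only one matches when case is respected, which shows how much the evaluation convention alone can move the score.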
Conf. Date Title SVT IIIT5k IC03 IC13 Resources
'15-ICLR 14/12/18 Deep structured output learning for unconstrained text recognition 0.717 0.896 0.818 TF SLIDE VIDEO
'16-IJCV 15/05/07 Reading text in the wild with convolutional neural networks 0.807 0.933 0.908 KERAS
'16-AAAI 15/06/14 Reading Scene Text in Deep Convolutional Sequences
'17-TPAMI 15/07/21 An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition 0.808 0.782 0.894 0.867 TORCH(M) TF TF TF TF PYTORCH PYTORCH(M) BLOG(KR)
'16-CVPR 16/03/09 Recursive Recurrent Nets with Attention Modeling for OCR in the Wild 0.807 0.784 0.887 0.9
'16-CVPR 16/03/12 Robust scene text recognition with automatic rectification 0.819 0.819 0.901 0.886 PYTORCH PYTORCH
'16-CVPR 16/06/27 CNN-N-Gram for Handwriting Word Recognition 0.8362 VIDEO
'16-BMVC 16/09/19 STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition 0.836 0.833 0.899 0.891
'17-arXiv 17/07/27 STN-OCR: A single Neural Network for Text Detection and Text Recognition 0.798 0.86 0.903 *MXNET(M) PRJ BLOG
'17-IJCAI 17/08/19 Learning to Read Irregular Text with Attention Mechanisms
'17-arXiv 17/09/06 Scene Text Recognition with Sliding Convolutional Character Models 0.765 0.816 0.845 0.852
'17-ICCV 17/09/07 Focusing Attention: Towards Accurate Text Recognition in Natural Images 0.859 0.874 0.942 0.933
'18-CVPR 17/11/12 AON: Towards Arbitrarily-Oriented Text Recognition 0.828 0.87 0.915 TF
'17-NIPS 17/12/04 Gated Recurrent Convolution Neural Network for OCR 0.815 0.808 0.978 *TORCH(M)
'18-AAAI 18/01/04 Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition 0.844 0.836 0.915 0.908
'18-AAAI 18/01/04 SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network 0.87 0.931 0.929
'18-CVPR 18/05/09 Edit Probability for Scene Text Recognition 0.875 0.883 0.946 0.944
'18-TPAMI 18/06/25 ASTER: An Attentional Scene Text Recognizer with Flexible Rectification 0.936 0.934 0.945 0.918 *TF(M) PYTORCH
'18-ECCV 18/09/08 Synthetically Supervised Feature Learning for Scene Text Recognition 0.871 0.894 0.947 0.94
'19-AAAI 18/09/18 Scene Text Recognition from Two-Dimensional Perspective 0.821 0.92 0.914
'19-AAAI 18/11/02 Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition 0.845 0.915 0.91 *TORCH(M)
'19-CVPR 18/12/14 ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification 0.902 0.933 0.913 PRJ
'19-PR 19/01/10 MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition 0.883 0.912 0.950 0.924 *PYTORCH(M)
'19-ICCV 19/04/03 What is wrong with scene text recognition model comparisons? dataset and model analysis 0.875 0.949 0.936 *PYTORCH(M) BLOG_KR
'19-CVPR 19/04/18 Aggregation Cross-Entropy for Sequence Recognition 0.826 0.823 0.921 0.897 *PYTORCH
'19-CVPR 19/06/16 Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition 0.845 0.838 0.921 0.918
'19-ICCV 19/08/06 Symmetry-constrained Rectification Network for Scene Text Recognition 0.889 0.944 0.95 0.939
'20-AAAI 19/12/21 Decoupled Attention Network for Text Recognition 0.892 0.943 0.95 0.939 *PYTORCH(M)
'20-AAAI 19/12/28 TextScanner: Reading Characters in Order for Robust Scene Text Recognition 0.895 0.926 0.925
'20-AAAI 20/02/04 GTC: Guided Training of CTC 0.929 0.955 0.952 0.943

End-to-End Text Recognition

  • Papers are sorted by published date.
  • IC is short for ICDAR.
  • Score is F1-score for generic task.
  • *CODE means official code and CODE(M) means that trained model is provided.
Conf. Date Title IC03 IC13 IC15 Resources
'12-ICPR 12/11/11 End-to-end text recognition with convolutional neural networks 0.67 *CODE
'14-ECCV 14/09/06 Deep Features for Text Spotting 0.75 PRJ MATLAB
'16-IJCV 15/05/07 Reading Text in the Wild with Convolutional Neural Networks 0.70 0.77 KERAS
'15-TPAMI 15/10/30 Real-time Lexicon-free Scene Text Localization and Recognition 0.542 0.156
'16-arXiv 16/04/10 TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild 0.6843 0.4718 (L)0.533 *CAFFE(M)
'17-AAAI 16/11/21 TextBoxes: A fast text detector with a single deep neural network 0.84 TF *CAFFE(M) BLOG_KR
'17-ICCV 17/07/13 Towards End-to-end Text Spotting with Convolution Recurrent Neural Network 0.8459 VIDEO
'17-ICCV 17/10/22 Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework 0.77 0.47 VIDEO *CAFFE(M)
'18-CVPR 18/01/05 FOTS: Fast Oriented Text Spotting with a Unified Network 0.8477 0.6533 VIDEO TF(M)
'18-TIP 18/01/09 TextBoxes++: A Single-Shot Oriented Scene Text Detector 0.8465 0.519 *CAFFE(M)
'18-CVPR 18/03/09 An end-to-end TextSpotter with Explicit Alignment and Attention 0.86 0.63 *CAFFE(M)
'18-TPAMI 18/06/25 ASTER: An Attentional Scene Text Recognizer with Flexible Rectification 0.64 *TF(M)
'18-ECCV 18/07/06 Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes 0.865 0.624
'19-ICCV 19/08/24 Towards Unconstrained End-to-End Text Spotting 0.6994 BLOG_KR
'19-ICCV 19/10/17 Convolutional Character Networks 0.7108 *PYTORCH(M)
'19-ICCV 19/10/27 TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting 0.6537
'20-AAAI 19/11/21 All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting 0.841 0.641
'20-AAAI 20/02/12 Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting 0.858 0.651

Others

  • Papers are sorted by published date.
  • *CODE means official code and CODE(M) means that trained model is provided.
Conf. Date Title Description Resources
'14-NIPS 14/06/09 Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition Dataset PRJ
'17-ECCV 17/02/13 End-to-End Interpretation of the French Street Name Signs Dataset Dataset (FSNS) *TF(M)
'17-arXiv 17/04/11 Attention-based Extraction of Structured Information from Street View Imagery FSNS *TF(M) TF TF LUA BLOG_KR
'17-CVPR 17/07/21 Unambiguous Text Localization and Retrieval for Cluttered Scenes Text Retrieval
'17-AAAI 17/10/22 Detection and Recognition of Text Embedded in Online Images via Neural Context Models Dataset PRJ
'18-CVPR 17/11/17 Separating Style and Content for Generalized Style Transfer Font Style
'17-arXiv 17/12/06 Detecting Curve Text in the Wild: New Dataset and New Solution Dataset (CTW 1500) PRJ
'18-AAAI 17/12/14 SEE: Towards Semi-Supervised End-to-End Scene Text Recognition FSNS PRJ *CHAINER(M)
'17-CVPR 18/06/07 Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks Document Layout PRJ
'18-CVPR 18/06/19 DocUNet: Document Image Unwarping via A Stacked U-Net Document Dewarping PRJ
'18-CVPR 18/06/19 Document Enhancement using Visibility Detection Document Enhancement PRJ
'18-IJCAI 18/06/22 Multi-Task Handwritten Document Layout Analysis Document Layout
'18-ECCV 18/07/09 Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes Dataset PRJ
'19-AAAI 18/12/03 EnsNet: Ensconce Text in the Wild Text Removal DB
'19-CVPR 18/12/14 Spatial Fusion GAN for Image Synthesis Dataset DB
'19-AAAI 19/01/27 Hierarchical Encoder with Auxiliary Supervision for Table-to-text Generation: Learning Better Representation for Tables TableToText
'19-AAAI 19/01/27 A Radical-aware Attention-based Model for Chinese Text Classification Chinese Character Classification
'19-CVPR 19/02/25 Handwriting Recognition in Low-resource Scripts using Adversarial Learning Handwriting Recognition TF
'19-CVPR 19/03/27 Tightness-aware Evaluation Protocol for Scene Text Detection Evaluation CODE
'19-ICCV 19/05/31 Scene Text Visual Question Answering Dataset ICDAR_DB
'19-CVPR 19/06/16 DynTypo: Example-based Dynamic Text Effects Transfer Text Effects PRJ VIDEO
'19-CVPR 19/06/16 Typography with Decor: Intelligent Text Style Transfer Text Effects *PYTORCH(M)
'19-CVPR 19/06/16 An Alternative Deep Feature Approach to Line Level Keyword Spotting Keyword Spotting
'19-ICCV 19/07/23 GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition Domain Adaptation
'19-ICCV 19/09/17 Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning Dataset ICDAR_DB
'19-ICCV 19/10/02 Large-scale Tag-based Font Retrieval with Generative Feature Learning Font Retrieval
'19-ICCV 19/10/27 TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts Place Recognition DB
'19-ICCV 19/10/27 DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks Document Dewarping *PYTORCH(M)

Other lists

Tutorial Materials

Acknowledgment

  • This work is done by the OCR team at Clova AI, powered by NAVER-LINE. NAVER-LINE is a leading Asian internet company that develops Clova, a cloud-based AI assistant platform.
  • This repository is scheduled to be updated regularly in accordance with schedules of major AI conferences.