Scene text recognition

Overview

AttentionOCR for Arbitrary-Shaped Scene Text Recognition

Introduction

This TensorFlow-based scene text spotting algorithm ranked No. 1 in the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (Latin Only, Latin and Chinese). The algorithm was also used in the ICDAR2019 Robust Reading Challenge on Large-scale Street View Text with Partial Labeling and the ICDAR2019 Robust Reading Challenge on Reading Chinese Text on Signboard.

The scene text detection algorithm is modified from Tensorpack FasterRCNN; this repository open-sources only the scene text recognition code. I have uploaded the ICDAR2019 ArT competition model to Docker Hub; please refer to the Docker section. For more details, please refer to our arXiv technical report.

Our text recognition algorithm recognizes both Latin and non-Latin characters and supports horizontal and vertical text in a single model, which makes it convenient for multilingual arbitrary-shaped text recognition.

Note that the competition model in the Docker container, as described in our technical report, differs slightly from the recognition model trained from this updated repository.

Dependencies

python 3
tensorflow-gpu 1.14
tensorpack 0.9.8
pycocotools

Usage

First, download and extract the text datasets into a base text directory. Please refer to dataset.py for dataset preprocessing and for handling multiple datasets.

Multiple Datasets

$(base_dir)/lsvt
$(base_dir)/art
$(base_dir)/rects
$(base_dir)/icdar2017rctw

You can also synthesize text recognition data for data augmentation; please refer to TextRecognitionDataGenerator. This is helpful for long text recognition and for the attention-based language model, because you can synthesize text images directly from an NLP corpus. You will then need to rewrite dataset.py to load the synthetic text dataset.

$(base_dir)/synthetic_text
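
Before training, it can help to confirm that the dataset folders listed above are in place. The following is a minimal sketch, not part of this repository: the function name missing_datasets and the constant EXPECTED_DATASETS are hypothetical, and the check only looks at folder names, not at the files that dataset.py actually reads.

```python
from pathlib import Path

# Folder names taken from the "Multiple Datasets" list above.
# synthetic_text is optional, so it is not included by default.
EXPECTED_DATASETS = ("lsvt", "art", "rects", "icdar2017rctw")


def missing_datasets(base_dir, names=EXPECTED_DATASETS):
    """Return the dataset folder names that do not exist under base_dir.

    Hypothetical helper for illustration only; it checks directory
    presence and nothing about the contents.
    """
    base = Path(base_dir)
    return [name for name in names if not (base / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Usage: python check_datasets.py /path/to/base_dir
    base_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_datasets(base_dir)
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```

To include the synthetic data in the check, pass names=EXPECTED_DATASETS + ("synthetic_text",).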

Train

First, download the pretrained Inception v4 checkpoint and put it in the ./pretrain folder. Then set the GPU list in config.py to the GPUs you want to use and run:

python train.py

You can visualize training progress via TensorBoard:

tensorboard --logdir='./checkpoint'

By default, training uses ICDAR2019-LSVT, ICDAR2019-ArT, and ICDAR2019-ReCTS; you can replace these with your own training data.

Evaluation

python eval.py --checkpoint_path=$(Your model path)

By default, evaluation uses ICDAR2017RCTW with the Normalized Edit Distance metric (specifically, 1-N.E.D); you can replace it with your own evaluation data.
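
For readers unfamiliar with the metric, here is a minimal sketch of how a 1-N.E.D score can be computed. It assumes the commonly used definition, in which each edit distance is divided by the length of the longer of the two strings and the result is averaged over all samples. The function names are hypothetical, and eval.py may differ in details such as text normalization.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def one_minus_ned(predictions, ground_truths):
    """Return 1 - mean(edit_distance / max(len(pred), len(gt))).

    Assumed definition for illustration; a pair of two empty strings
    contributes a normalized distance of 0.
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths differ in length")
    if not predictions:
        raise ValueError("at least one sample is required")
    total = 0.0
    for pred, gt in zip(predictions, ground_truths):
        longest = max(len(pred), len(gt))
        if longest > 0:
            total += edit_distance(pred, gt) / longest
    return 1.0 - total / len(predictions)


if __name__ == "__main__":
    score = one_minus_ned(["hello", "wor1d"], ["hello", "world"])
    print(f"1-N.E.D = {score:.4f}")
```

A score of 1.0 means every prediction matches its ground truth exactly; lower scores indicate more character-level errors.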

Export

Export a checkpoint to a TensorFlow pb model for inference.

python export.py --pb_path=$(Your tensorflow pb model save path) --checkpoint_path=$(Your trained model path)

Test

Load the TensorFlow pb model for text recognition.

python test.py --pb_path=$(Your tensorflow pb model save path) --img_folder=$(Your test img folder)

By default, testing uses ICDAR2019-ArT; you can replace it with your own test data.

Visualization

Scene text detection and recognition result:

Scene text recognition attention maps:

To learn more about the attention mechanism, please refer to Attention Mechanism in Deep Learning.

Docker

I have uploaded the ICDAR2019 scene text recognition model, including text detection and recognition, to Docker Hub.

After installing nvidia-docker, run:

docker pull zhang0jhon/demo:ocr
docker run -it -p 5000:5000 --gpus all zhang0jhon/demo:ocr bash
cd /ocr/ocr
python flaskapp.py

Then you can test with your own data via a browser:

$(localhost or remote server ip address):5000
