This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Last update: Dec 30, 2022

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Thanks to the author (@whai362) for the awesome work!

Installation

Any TensorFlow version > 1.0 should be OK.
Python 2 or Python 3 will work.

Download

A model trained on ICDAR 2015 (training set) + ICDAR 2017 MLT (training set):

Baidu Yun (extraction code: pffd)

Google Drive

This model is not as good as the one in the paper; it is only a reference. You can fine-tune it, or optimize further based on this code.

Dataset	Precision (%)	Recall (%)	F-measure (%)
ICDAR 2015 (val)	74.61	80.93	77.64
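
The F-measure in the table is the harmonic mean of precision and recall. A quick check in Python, using only the numbers reported above (the function name `f_measure` is illustrative and not part of this repository):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall reported for ICDAR 2015 (val).
score = f_measure(74.61, 80.93)
print(round(score, 2))
```

The result agrees with the 77.64 reported in the table.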

Train

To train the model, provide the dataset path. Inside that path, each image needs its own ground-truth (gt) text file, and each gt text file must have the same name as its image file.

Then run train.py, for example:
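
Before training, it can help to confirm that every image has a matching gt file. The following is a minimal sketch, not code from this repository; it assumes gt files use a `.txt` extension and images use one of the listed extensions, and `find_missing_gt` is a hypothetical helper name:

```python
import os

# Assumed image extensions; adjust to match your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_missing_gt(data_dir, gt_ext=".txt"):
    """Return image file names in data_dir that lack a same-named gt file."""
    names = os.listdir(data_dir)
    present = set(names)
    missing = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        # An image "img_1.jpg" is expected to have "img_1.txt" next to it.
        if ext.lower() in IMAGE_EXTS and stem + gt_ext not in present:
            missing.append(name)
    return missing
```

Calling `find_missing_gt("./data/ocr/icdar2015/")` should return an empty list when the dataset is laid out as described.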

python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/

If you have more than one GPU, you can pass several GPU ids to gpu_list (for example --gpu_list=0,1,2,3).
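
As a rough illustration of how the two flags interact, the sketch below splits a `gpu_list` string into ids and computes the number of images per training step. It assumes one model replica per listed GPU, which the flag name `batch_size_per_gpu` suggests but which is not confirmed here; both function names are hypothetical:

```python
def parse_gpu_list(gpu_list):
    """Split a comma-separated id string such as "0,1,2,3" into integer ids."""
    return [int(tok) for tok in gpu_list.split(",") if tok.strip()]


def effective_batch_size(gpu_list, batch_size_per_gpu):
    """Images per training step, assuming one replica per listed GPU."""
    return len(parse_gpu_list(gpu_list)) * batch_size_per_gpu


# Four GPUs with 8 images each.
print(effective_batch_size("0,1,2,3", 8))
```

Under that assumption, keeping `--batch_size_per_gpu=8` while moving from one GPU to four raises the images per step from 8 to 32.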

Note:

Right now, only the ICDAR 2017 data format is supported as input, e.g. (116,1179,206,1179,206,1207,116,1207,"###"), but you can modify data_provider.py to support polygon-format input.
Polygon shrinking is already supported via the pyclipper module.
This re-implementation is just for fun, but I'll continue to improve the code.
The PSE algorithm is re-implemented in C++ (with Python 2, just run it; with Python 3, replace python-config with python3-config in the makefile).
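
To make the annotation format and the shrinking step concrete, here is a minimal sketch that does not reproduce data_provider.py. It parses one annotation line of the form shown above and computes the shrink offset d = Area × (1 − r²) / Perimeter given in the PSENet paper. The helper names are hypothetical, and whether this repository uses exactly this formula or these ratios is an assumption. In practice the offset would be handed to pyclipper as a negative value to shrink the polygon; that call is left out so the sketch needs only the standard library.

```python
import csv
import math


def parse_gt_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,"text"' into (points, text)."""
    # Strip a possible byte-order mark and surrounding whitespace.
    fields = next(csv.reader([line.strip().lstrip("\ufeff")]))
    coords = [int(v) for v in fields[:8]]
    points = list(zip(coords[0::2], coords[1::2]))
    # An unquoted transcription may itself contain commas, so rejoin the rest.
    text = ",".join(fields[8:])
    return points, text


def polygon_area(points):
    """Polygon area via the shoelace formula."""
    total = 0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed polygon."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def shrink_offset(points, ratio):
    """Offset d = Area * (1 - ratio^2) / Perimeter (PSENet paper)."""
    return polygon_area(points) * (1.0 - ratio ** 2) / polygon_perimeter(points)


points, text = parse_gt_line('116,1179,206,1179,206,1207,116,1207,"###"')
print(points, text)
print(shrink_offset(points, 0.5))  # ratio 0.5 is an illustrative value
```

For the example box (90 by 28 pixels) and a ratio of 0.5, the offset comes out at roughly 8 pixels.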

Test

Run eval.py, for example:

python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/

A text file and a result image will then be written to the output path.

Examples

About issues

If you run into a problem, check the existing issues first, or open a new issue.

Reference

Acknowledgement

@rkshuai found a bug in the feature concatenation in model.py.

If this repository helps you, please star it. Thanks.

Owner

Michael liu
