SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Introduction

This is a PyTorch implementation of "SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training".

The paper proposes a novel text detection system termed SelfText Beyond Polygon (SBP), with Bounding Box Supervision (BBS) and Dynamic Self Training (DST), which trains a polygon-based text detector with only a limited set of upright bounding box annotations. As shown in the Figure, SBP achieves performance comparable to strong (polygon-level) supervision while greatly reducing data annotation costs.

For more details, please refer to our arXiv paper.

Environments

python 3
torch = 1.1.0
torchvision
Pillow
numpy

ToDo List

Dataset

Supported:

model zoo

Supported text detection:

Bounding Box Supervision(BBS)

Train

The training strategy includes three steps:

(1) train SASN with synthetic data;
(2) generate pseudo labels on real data with SASN, based on the bounding box annotations;
(3) train the detectors (EAST and PSENet) with the pseudo labels.
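As a rough illustration of step (2), the sketch below turns upright bounding boxes into a pixel-level pseudo label by running a segmentation model on each box crop and thresholding its output. It is only a sketch under stated assumptions, not the repository's code: `pseudo_mask_from_boxes`, `predict_fn`, `dummy_predict` and the 0.5 threshold are hypothetical names and values, and `predict_fn` merely stands in for a trained SASN.

```python
import numpy as np


def pseudo_mask_from_boxes(image, boxes, predict_fn, threshold=0.5):
    """Build a full-image binary text mask from upright boxes.

    image: H x W (x C) array.
    boxes: iterable of (x_min, y_min, x_max, y_max) in pixel coordinates.
    predict_fn: hypothetical stand-in for a trained SASN; takes a crop and
        returns a per-pixel text probability map of the crop's height and width.
    """
    height, width = image.shape[:2]
    mask = np.zeros((height, width), dtype=np.uint8)
    for x_min, y_min, x_max, y_max in boxes:
        # Clip the box to the image so slicing stays valid.
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 <= x0 or y1 <= y0:
            continue  # skip empty or fully out-of-image boxes
        crop = image[y0:y1, x0:x1]
        prob = np.asarray(predict_fn(crop))
        # Pixels at or above the threshold become text; labels never leave the box.
        region = (prob >= threshold).astype(np.uint8)
        mask[y0:y1, x0:x1] = np.maximum(mask[y0:y1, x0:x1], region)
    return mask


def dummy_predict(crop):
    # Toy predictor for the demo only: marks the central horizontal band as text.
    h, w = crop.shape[:2]
    prob = np.zeros((h, w), dtype=np.float32)
    prob[h // 4 : h - h // 4, :] = 0.9
    return prob


image = np.zeros((40, 60, 3), dtype=np.uint8)
mask = pseudo_mask_from_boxes(image, [(10, 8, 30, 24)], dummy_predict)
print(mask.shape, int(mask.sum()))
```

In a real pipeline the resulting mask would still need to be converted to whatever label format the detector expects (for example polygons or shrunk kernels); that conversion is not shown here.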

Training SASN with SynthText or Curved SynthText

(TBD)

Generating pseudo labels on real data with SASN

(TBD)

Training EAST or PSENet with the pseudo labels

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Dynamic Self Training

Train

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Experiments

Bounding Box Supervision

The performance of EAST on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	ICDAR15	-	65.8	63.8	64.8
EAST	ICDAR15	-	76.9	77.1	77.0
EAST_pseudo(SynthText)	ICDAR15	-	77.8	78.2	78.0
EAST_box	ICDAR15	SynthText	70.8	72.0	71.4
EAST	ICDAR15	SynthText	82.0	82.4	82.2
EAST_pseudo(SynthText)	ICDAR15	SynthText	81.3	82.2	81.8
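The f-score column appears to be the usual harmonic mean of precision and recall; the README does not state this, so treat it as an assumption. A quick check against the first three rows of the table above (the `f_score` helper is a hypothetical name, not part of this repository):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, precision, recall, reported f-score) copied from the ICDAR15 table.
rows = [
    ("EAST_box", 65.8, 63.8, 64.8),
    ("EAST", 76.9, 77.1, 77.0),
    ("EAST_pseudo(SynthText)", 77.8, 78.2, 78.0),
]
for name, p, r, reported in rows:
    print(name, round(f_score(p, r), 1), reported)
```

For these rows the recomputed value matches the reported one after rounding to one decimal place.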

The performance of EAST on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	MSRA-TD500	-	40.49	31.05	35.15
EAST	MSRA-TD500	-	71.76	69.05	70.38
EAST_pseudo(SynthText)	MSRA-TD500	-	71.27	67.54	69.36
EAST_box	MSRA-TD500	SynthText	48.34	42.37	45.16
EAST	MSRA-TD500	SynthText	77.91	76.45	77.17
EAST_pseudo(SynthText)	MSRA-TD500	SynthText	77.42	73.85	75.59

The performance of PSENet on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	ICDAR15	-	70.17	69.09	69.63
PSENet	ICDAR15	-	81.6	79.5	80.5
PSENet_pseudo(SynthText)	ICDAR15	-	82.9	77.6	80.2
PSENet_box	ICDAR15	SynthText	72.65	74.29	73.46
PSENet	ICDAR15	SynthText	86.42	83.54	84.96
PSENet_pseudo(SynthText)	ICDAR15	SynthText	86.77	83.34	85.02

The performance of PSENet on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	MSRA-TD500	-	47.17	36.90	41.41
PSENet	MSRA-TD500	-	80.86	77.72	79.13
PSENet_pseudo(SynthText)	MSRA-TD500	-	80.32	77.26	78.86
PSENet_box	MSRA-TD500	SynthText	47.45	39.49	43.11
PSENet	MSRA-TD500	SynthText	84.11	84.97	84.54
PSENet_pseudo(SynthText)	MSRA-TD500	SynthText	84.03	84.03	84.03

The performance of PSENet on Total Text

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	Total Text	-	46.5	43.6	45.0
PSENet	Total Text	-	80.4	76.5	78.4
PSENet_pseudo(SynthText)	Total Text	-	80.33	73.54	76.78
PSENet_pseudo(Curved SynthText)	Total Text	-	81.68	74.61	78.0
PSENet_box	Total Text	SynthText	51.94	47.45	49.59
PSENet	Total Text	SynthText	83.4	78.1	80.7
PSENet_pseudo(SynthText)	Total Text	SynthText	81.57	75.54	78.44
PSENet_pseudo(Curved SynthText)	Total Text	SynthText	82.51	77.57	80.0

The visualization of bounding-box annotation and the pseudo labels generated by BBS on Total-Text

links

https://github.com/SakuraRiven/EAST

https://github.com/WenmuZhou/PSENet.pytorch

License

For academic use, this project is licensed under the Apache License - see the LICENSE file for details. For commercial use, please contact the authors.

Citations

Please consider citing our paper in your publications if the project helps your research.

Email: [email protected]


Owner

weijiawu
