PSENet - Shape Robust Text Detection with Progressive Scale Expansion Network.

Related tags

Computer VisionPSENet
Overview

News

#f03c15 Python3 implementations of PSENet [1], PAN [2] and PAN++ [3] are released at https://github.com/whai362/pan_pp.pytorch.

[1] W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao. Shape robust text detection with progressive scale expansion network. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., pages 9336–9345, 2019.
[2] W. Wang, E. Xie, X. Song, Y. Zang, W. Wang, T. Lu, G. Yu, and C. Shen. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proc. IEEE Int. Conf. Comp. Vis., pages 8440–8449, 2019.
[3] Paper is in preparation.

Shape Robust Text Detection with Progressive Scale Expansion Network

Requirements

  • Python 2.7
  • PyTorch v0.4.1+
  • pyclipper
  • Polygon2
  • OpenCV 3.4 (for c++ version pse)
  • opencv-python 3.4

Introduction

Progressive Scale Expansion Network (PSENet) is a text detector which is able to well detect the arbitrary-shape text in natural scene.

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train_ic15.py

Testing

CUDA_VISIBLE_DEVICES=0 python test_ic15.py --scale 1 --resume [path of model]

Eval script for ICDAR 2015 and SCUT-CTW1500

cd eval
sh eval_ic15.sh
sh eval_ctw1500.sh

Performance (new version paper)

ICDAR 2015

Method Extra Data Precision (%) Recall (%) F-measure (%) FPS (1080Ti) Model
PSENet-1s (ResNet50) - 81.49 79.68 80.57 1.6 baiduyun(extract code: rxti); OneDrive
PSENet-1s (ResNet50) pretrain on IC17 MLT 86.92 84.5 85.69 1.6 baiduyun(extract code: aieo); OneDrive
PSENet-4s (ResNet50) pretrain on IC17 MLT 86.1 83.77 84.92 3.8 baiduyun(extract code: aieo); OneDrive

SCUT-CTW1500

Method Extra Data Precision (%) Recall (%) F-measure (%) FPS (1080Ti) Model
PSENet-1s (ResNet50) - 80.57 75.55 78.0 3.9 baiduyun(extract code: ksv7); OneDrive
PSENet-1s (ResNet50) pretrain on IC17 MLT 84.84 79.73 82.2 3.9 baiduyun(extract code: z7ac); OneDrive
PSENet-4s (ResNet50) pretrain on IC17 MLT 82.09 77.84 79.9 8.4 baiduyun(extract code: z7ac); OneDrive

Performance (old version paper)

ICDAR 2015 (training with ICDAR 2017 MLT)

Method Precision (%) Recall (%) F-measure (%)
PSENet-4s (ResNet152) 87.98 83.87 85.88
PSENet-2s (ResNet152) 89.30 85.22 87.21
PSENet-1s (ResNet152) 88.71 85.51 87.08

ICDAR 2017 MLT

Method Precision (%) Recall (%) F-measure (%)
PSENet-4s (ResNet152) 75.98 67.56 71.52
PSENet-2s (ResNet152) 76.97 68.35 72.40
PSENet-1s (ResNet152) 77.01 68.40 72.45

SCUT-CTW1500

Method Precision (%) Recall (%) F-measure (%)
PSENet-4s (ResNet152) 80.49 78.13 79.29
PSENet-2s (ResNet152) 81.95 79.30 80.60
PSENet-1s (ResNet152) 82.50 79.89 81.17

ICPR MTWI 2018 Challenge 2

Method Precision (%) Recall (%) F-measure (%)
PSENet-1s (ResNet152) 78.5 72.1 75.2

Results

Figure 3: The results on ICDAR 2015, ICDAR 2017 MLT and SCUT-CTW1500

Paper Link

[new version paper] https://arxiv.org/abs/1903.12473

[old version paper] https://arxiv.org/abs/1806.02559

Other Implements

[tensorflow version (thanks @liuheng92)] https://github.com/liuheng92/tensorflow_PSENet

Citation

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection With Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}
A simple component to display annotated text in Streamlit apps.

Annotated Text Component for Streamlit A simple component to display annotated text in Streamlit apps. For example: Installation First install Streaml

Thiago Teixeira 312 Dec 30, 2022
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.

hocr-tools About About the code Installation System-wide with pip System-wide from source virtualenv Available Programs hocr-check -- check the hOCR f

OCRopus 285 Dec 08, 2022
Face Detection with DLIB

Face Detection with DLIB In this project, we have detected our face with dlib and opencv libraries. Setup This Project Install DLIB & OpenCV You can i

Can 2 Jan 16, 2022
Shape Detection - It's a shape detection project with OpenCV and Python.

Shape Detection It's a shape detection project with OpenCV and Python. Setup pip install opencv-python for doing AI things. pip install simpleaudio fo

1 Nov 26, 2022
Repository for Scene Text Detection with Supervised Pyramid Context Network with tensorflow.

Scene-Text-Detection-with-SPCNET Unofficial repository for [Scene Text Detection with Supervised Pyramid Context Network][https://arxiv.org/abs/1811.0

121 Oct 15, 2021
PAGE XML format collection for document image page content and more

PAGE-XML PAGE XML format collection for document image page content and more For an introduction, please see the following publication: http://www.pri

PRImA Research Lab 46 Nov 14, 2022
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 06, 2021
A tool to enhance your old/damaged pictures built using python & opencv.

Breathe Life into your Old Pictures Table of Contents About The Project Getting Started Prerequisites Usage Contact Acknowledgments About The Project

Shah Anwaar Khalid 5 Dec 16, 2021
This is a passport scanning web service to help you scan, identify and validate your passport created with a simple and flexible design and ready to be integrated right into your system!

Passport-Recogniton-System This is a passport scanning web service to help you scan, identify and validate your passport created with a simple and fle

Mo'men Ashraf Muhamed 7 Jan 04, 2023
Fine tuning keras-ocr python package with custom synthetic dataset from scratch

OCR-Pipeline-with-Keras The keras-ocr package generally consists of two parts: a Detector and a Recognizer: Detector is responsible for creating bound

Eugene 1 Jan 05, 2022
Text language identification using Wikipedia data

Text language identification using Wikipedia data The aim of this project is to provide high-quality language detection over all the web's languages.

Vsevolod Dyomkin 28 Jul 09, 2022
A list of hyperspectral image super-solution resources collected by Junjun Jiang

A list of hyperspectral image super-resolution resources collected by Junjun Jiang. If you find that important resources are not included, please feel free to contact me.

Junjun Jiang 301 Jan 05, 2023
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
Um RPG de texto orientado a objetos.

RPG de texto Um RPG de texto orientado a objetos, sem história. Um RPG (Role-playing game) baseado em texto em que você pode viajar para alguns locais

Vinicius 3 Oct 05, 2022
Convert PDF/Image to TXT using EasyOcr - the best OCR engine available!

PDFImage2TXT - DOWNLOAD INSTALLER HERE What can you do with it? Convert scanned PDFs to TXT. Convert scanned Documents to TXT. No coding required!! In

Hans Alemão 2 Feb 22, 2022
Usando o Amazon Textract como OCR para Extração de Dados no DynamoDB

dio-live-textract2 Repositório de código para o live coding do dia 05/10/2021 sobre extração de dados estruturados e gravação em banco de dados a part

hugoportela 0 Jan 19, 2022
Train custom VR face tracking parameters

Pal Buddy Guy: The anipal's best friend This is a small script to improve upon the tracking capabilities of the Vive Pro Eye and facial tracker. You c

7 Dec 12, 2021
CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

CRAFT-Reimplementation Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 . Reimple

453 Dec 28, 2022
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

Quick Info this library tries to solve language detection of very short words and phrases, even shorter than tweets makes use of both statistical and

Peter M. Stahl 532 Dec 28, 2022
An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive

Kushal Shingote 1 Feb 12, 2022