PSENet - Shape Robust Text Detection with Progressive Scale Expansion Network.

News

Python 3 implementations of PSENet [1], PAN [2] and PAN++ [3] are released at https://github.com/whai362/pan_pp.pytorch.

[1] W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao. Shape robust text detection with progressive scale expansion network. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., pages 9336–9345, 2019.
[2] W. Wang, E. Xie, X. Song, Y. Zang, W. Wang, T. Lu, G. Yu, and C. Shen. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proc. IEEE Int. Conf. Comp. Vis., pages 8440–8449, 2019.
[3] Paper is in preparation.

Shape Robust Text Detection with Progressive Scale Expansion Network

Requirements

  • Python 2.7
  • PyTorch v0.4.1+
  • pyclipper
  • Polygon2
  • OpenCV 3.4 (for the C++ version of PSE)
  • opencv-python 3.4

Introduction

Progressive Scale Expansion Network (PSENet) is a text detector designed to robustly detect arbitrarily shaped text in natural scene images.
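As described in the paper [1], the network predicts several "kernels" of increasing size for each text instance. The post-processing step then starts from the connected components of the smallest kernel and grows them outward through each larger kernel with a breadth-first search, resolving conflicts on a first-come-first-served basis. The following is a minimal pure-Python sketch of that idea only. It is not the code in this repository, the function names `label_components` and `progressive_scale_expansion` are hypothetical, and it assumes the kernels are already binarized grids ordered from smallest to largest.

```python
from collections import deque

# 4-connected neighbourhood offsets (dy, dx)
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected components of a binary grid; background stays 0."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Grow instance labels from the smallest kernel to the largest.

    kernels: list of binary grids, ordered from smallest to largest
    (an assumption of this sketch).
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Seed the search with every pixel that already has a label.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                # First come, first served: a labelled pixel is never overwritten.
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]
                    queue.append((ny, nx))
    return labels


# Two nearby instances that merge into one blob in the largest kernel.
small = [[0, 0, 0, 0, 0, 0],
         [0, 1, 0, 0, 1, 0],
         [0, 0, 0, 0, 0, 0]]
large = [[0, 0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0, 0]]
result = progressive_scale_expansion([small, large])
print(result[1])
```

In this toy input the largest kernel alone is a single connected blob, but seeding from the smallest kernel keeps the two instances separate after expansion.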

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train_ic15.py

Testing

CUDA_VISIBLE_DEVICES=0 python test_ic15.py --scale 1 --resume [path of model]

Evaluation scripts for ICDAR 2015 and SCUT-CTW1500

cd eval
sh eval_ic15.sh
sh eval_ctw1500.sh

Performance (new version paper)

ICDAR 2015

| Method | Extra Data | Precision (%) | Recall (%) | F-measure (%) | FPS (1080Ti) | Model |
| --- | --- | --- | --- | --- | --- | --- |
| PSENet-1s (ResNet50) | - | 81.49 | 79.68 | 80.57 | 1.6 | baiduyun (extract code: rxti); OneDrive |
| PSENet-1s (ResNet50) | pretrained on IC17 MLT | 86.92 | 84.5 | 85.69 | 1.6 | baiduyun (extract code: aieo); OneDrive |
| PSENet-4s (ResNet50) | pretrained on IC17 MLT | 86.1 | 83.77 | 84.92 | 3.8 | baiduyun (extract code: aieo); OneDrive |
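The F-measure column is consistent with the usual harmonic mean of precision and recall. A small sketch to check a row by hand, assuming that definition (the helper name `f_measure` is hypothetical and not part of this repository):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# First ICDAR 2015 row above: precision 81.49, recall 79.68
score = round(f_measure(81.49, 79.68), 2)
print(score)  # 80.57
```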

SCUT-CTW1500

| Method | Extra Data | Precision (%) | Recall (%) | F-measure (%) | FPS (1080Ti) | Model |
| --- | --- | --- | --- | --- | --- | --- |
| PSENet-1s (ResNet50) | - | 80.57 | 75.55 | 78.0 | 3.9 | baiduyun (extract code: ksv7); OneDrive |
| PSENet-1s (ResNet50) | pretrained on IC17 MLT | 84.84 | 79.73 | 82.2 | 3.9 | baiduyun (extract code: z7ac); OneDrive |
| PSENet-4s (ResNet50) | pretrained on IC17 MLT | 82.09 | 77.84 | 79.9 | 8.4 | baiduyun (extract code: z7ac); OneDrive |

Performance (old version paper)

ICDAR 2015 (training with ICDAR 2017 MLT)

| Method | Precision (%) | Recall (%) | F-measure (%) |
| --- | --- | --- | --- |
| PSENet-4s (ResNet152) | 87.98 | 83.87 | 85.88 |
| PSENet-2s (ResNet152) | 89.30 | 85.22 | 87.21 |
| PSENet-1s (ResNet152) | 88.71 | 85.51 | 87.08 |

ICDAR 2017 MLT

| Method | Precision (%) | Recall (%) | F-measure (%) |
| --- | --- | --- | --- |
| PSENet-4s (ResNet152) | 75.98 | 67.56 | 71.52 |
| PSENet-2s (ResNet152) | 76.97 | 68.35 | 72.40 |
| PSENet-1s (ResNet152) | 77.01 | 68.40 | 72.45 |

SCUT-CTW1500

| Method | Precision (%) | Recall (%) | F-measure (%) |
| --- | --- | --- | --- |
| PSENet-4s (ResNet152) | 80.49 | 78.13 | 79.29 |
| PSENet-2s (ResNet152) | 81.95 | 79.30 | 80.60 |
| PSENet-1s (ResNet152) | 82.50 | 79.89 | 81.17 |

ICPR MTWI 2018 Challenge 2

| Method | Precision (%) | Recall (%) | F-measure (%) |
| --- | --- | --- | --- |
| PSENet-1s (ResNet152) | 78.5 | 72.1 | 75.2 |

Results

Figure 3: The results on ICDAR 2015, ICDAR 2017 MLT and SCUT-CTW1500

Paper Link

[new version paper] https://arxiv.org/abs/1903.12473

[old version paper] https://arxiv.org/abs/1806.02559

Other Implementations

[TensorFlow version (thanks @liuheng92)] https://github.com/liuheng92/tensorflow_PSENet

Citation

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection With Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}