PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description

This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector.

  • Only the RBOX part is implemented.
  • Dice loss is used instead of class-balanced cross-entropy loss. Some of the code is based on argman/EAST and songdejia/EAST.
  • The provided pre-trained model achieves an F-score of 82.79 on ICDAR 2015 Challenge 4 using only the 1000 training images. See here for the detailed results.

Model         Loss   Recall (%)   Precision (%)   F-score (%)
Original      CE     72.75        80.46           76.41
Re-Implement  Dice   81.27        84.36           82.79
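
For readers unfamiliar with the two quantities mentioned above, here is a minimal pure-Python sketch of a dice loss and of the F-score arithmetic behind the table. The function names, the flat-list inputs, and the eps smoothing term are assumptions made for illustration; the loss used in this repo operates on PyTorch tensors and may differ in detail.

```python
def dice_loss(pred, gt, eps=1e-5):
    """Dice loss between a predicted score map and a ground-truth map.

    Both inputs are flat sequences of values in [0, 1] (an assumption of
    this sketch; a real implementation would use tensor operations).
    """
    inter = sum(p * g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt) + eps  # eps avoids division by zero
    return 1.0 - 2.0 * inter / total


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Rows of the table above: (precision, recall) in percent.
    print(f"{f_score(80.46, 72.75):.2f}")  # 76.41
    print(f"{f_score(84.36, 81.27):.2f}")  # 82.79
```

The F-scores in the table follow directly from the listed precision and recall values. The dice loss is 0 for a perfect overlap and 1 when prediction and ground truth do not overlap at all.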

Prerequisites

Tested only with:

  • Anaconda3
  • Python 3.7.1
  • PyTorch 1.0.1
  • Shapely 1.6.4
  • opencv-python 4.0.0.21
  • lanms 1.0.2

When you run a script, a notification with installation instructions is shown if a required module is missing. If lanms fails to install, please update gcc and binutils. In a conda environment, the update commands are:

conda install -c omgarcia gcc-6
conda install -c conda-forge binutils

The original lanms code has a bug in normalize_poly: the ref vertices are not held fixed while looping over the orderings of p to find the minimum distance. We fixed this bug in LANMS so that anyone can compile the corrected lanms. However, this repo still uses the original lanms.
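
To make the intended behaviour concrete, here is a minimal Python sketch of the idea: try every cyclic ordering of p and keep the one closest to ref, while ref itself stays in its original order. This is an illustration only, not the lanms C++ code; the list-of-tuples representation and the choice to try both traversal directions are assumptions of this sketch.

```python
def normalize_poly(ref, p):
    """Reorder the vertices of p to best match the fixed vertex order of ref.

    ref and p are lists of (x, y) tuples with the same number of vertices.
    """
    n = len(ref)
    best, best_d = None, float("inf")
    for order in (p, p[::-1]):  # assumption: try both traversal directions
        for start in range(n):
            # Rotate the candidate ordering; ref is NOT rotated.
            cand = [order[(start + i) % n] for i in range(n)]
            d = sum((rx - cx) ** 2 + (ry - cy) ** 2
                    for (rx, ry), (cx, cy) in zip(ref, cand))
            if d < best_d:
                best, best_d = cand, d
    return best


if __name__ == "__main__":
    ref = [(0, 0), (10, 0), (10, 5), (0, 5)]
    p = [(10, 5), (0, 5), (0, 0), (10, 0)]  # same box, different start vertex
    print(normalize_poly(ref, p) == ref)  # True
```

If ref were rotated together with p inside the loop, every start offset would compare the same vertex pairs and give the same distance, so the search could not pick out the best ordering.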

Installation

1. Clone the repo

git clone https://github.com/SakuraRiven/EAST.git
cd EAST

2. Data & Pre-Trained Model

  • Download the training and test data: ICDAR 2015 Challenge 4. Split the data into four folders: train_img, train_gt, test_img, test_gt.

  • Download the pre-trained VGG16 weights from PyTorch (VGG16) and our trained EAST model (EAST). Create a new folder named pths and move the downloaded .pth files into it:

mkdir pths
mv east_vgg16.pth vgg16_bn-6c64b313.pth pths/

Here is an example of the resulting directory layout:

.
├── EAST
│   ├── evaluate
│   └── pths
└── ICDAR_2015
    ├── test_gt
    ├── test_img
    ├── train_gt
    └── train_img
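
Before training, it can help to confirm that the layout matches the tree above. The following is a small optional sketch, not part of the repo; the helper name missing_paths is hypothetical, and the folder and file names are taken from the example tree and the mv command above.

```python
from pathlib import Path

# Folder names from the example tree above.
EXPECTED_DIRS = [
    "EAST/evaluate",
    "EAST/pths",
    "ICDAR_2015/train_img",
    "ICDAR_2015/train_gt",
    "ICDAR_2015/test_img",
    "ICDAR_2015/test_gt",
]

# Weight files from the mv command above.
EXPECTED_FILES = [
    "EAST/pths/east_vgg16.pth",
    "EAST/pths/vgg16_bn-6c64b313.pth",
]


def missing_paths(root):
    """Return the expected folders and files that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Assumes the script is run from inside the EAST folder.
    for path in missing_paths(".."):
        print("missing:", path)
```

An empty result means every folder and weight file listed above was found.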

Train

Modify the parameters in train.py and run:

CUDA_VISIBLE_DEVICES=0,1 python train.py
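
The CUDA_VISIBLE_DEVICES prefix limits which GPUs the process can see; with 0,1 the process sees two GPUs, which CUDA renumbers as devices 0 and 1 inside the process. The sketch below only shows how the variable can be inspected from Python. The helper name visible_gpu_ids is hypothetical and train.py is not known to contain it.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # variable unset: no restriction on visible GPUs
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


if __name__ == "__main__":
    print(visible_gpu_ids())
```

To train on a single GPU, list only one id, for example CUDA_VISIBLE_DEVICES=0.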

Detect

Modify the parameters in detect.py and run:

CUDA_VISIBLE_DEVICES=0 python detect.py

Evaluate

  • The evaluation scripts are from ICDAR Offline evaluation and have been modified to run successfully with Python 3.7.1.
  • Replace evaluate/gt.zip if you test on other datasets.
  • Modify the parameters in eval.py and run:

CUDA_VISIBLE_DEVICES=0 python eval.py
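
If you test on another dataset, a replacement gt.zip can be built with the Python standard library. This is a sketch under one assumption: the evaluation script expects the ground-truth .txt files at the root of the archive, with no sub-folders. Inspect the existing evaluate/gt.zip to confirm the expected file names and layout before relying on it. The helper name build_gt_zip is hypothetical.

```python
import zipfile
from pathlib import Path


def build_gt_zip(gt_dir, zip_path):
    """Pack every .txt file in gt_dir into zip_path at the archive root."""
    gt_files = sorted(Path(gt_dir).glob("*.txt"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in gt_files:
            # arcname drops the folder part so files sit at the zip root
            # (assumption about what the evaluation script expects).
            zf.write(f, arcname=f.name)
    return [f.name for f in gt_files]


if __name__ == "__main__":
    # Paths follow the example directory layout; adjust as needed.
    names = build_gt_zip("../ICDAR_2015/test_gt", "evaluate/gt.zip")
    print(f"packed {len(names)} ground-truth files")
```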