PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Related tags

Computer VisionEAST
Overview

Description

This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector.

  • Only RBOX part is implemented.
  • Using dice loss instead of class-balanced cross-entropy loss. Some codes refer to argman/EAST and songdejia/EAST
  • The pre-trained model provided achieves 82.79 F-score on ICDAR 2015 Challenge 4 using only the 1000 images. see here for the detailed results.
Model Loss Recall Precision F-score
Original CE 72.75 80.46 76.41
Re-Implement Dice 81.27 84.36 82.79

Prerequisites

Only tested on

  • Anaconda3
  • Python 3.7.1
  • PyTorch 1.0.1
  • Shapely 1.6.4
  • opencv-python 4.0.0.21
  • lanms 1.0.2

When running the script, if some module is not installed you will see a notification and installation instructions. if you failed to install lanms, please update gcc and binutils. The update under conda environment is:

conda install -c omgarcia gcc-6
conda install -c conda-forge binutils

The original lanms code has a bug in normalize_poly that the ref vertices are not fixed when looping the p's ordering to calculate the minimum distance. We fixed this bug in LANMS so that anyone could compile the correct lanms. However, this repo still uses the original lanms.

Installation

1. Clone the repo

git clone https://github.com/SakuraRiven/EAST.git
cd EAST

2. Data & Pre-Trained Model

  • Download Train and Test Data: ICDAR 2015 Challenge 4. Cut the data into four parts: train_img, train_gt, test_img, test_gt.

  • Download pre-trained VGG16 from PyTorch: VGG16 and our trained EAST model: EAST. Make a new folder pths and put the download pths into pths

mkdir pths
mv east_vgg16.pth vgg16_bn-6c64b313.pth pths/

Here is an example:

.
├── EAST
│   ├── evaluate
│   └── pths
└── ICDAR_2015
    ├── test_gt
    ├── test_img
    ├── train_gt
    └── train_img

Train

Modify the parameters in train.py and run:

CUDA_VISIBLE_DEVICES=0,1 python train.py

Detect

Modify the parameters in detect.py and run:

CUDA_VISIBLE_DEVICES=0 python detect.py

Evaluate

  • The evaluation scripts are from ICDAR Offline evaluation and have been modified to run successfully with Python 3.7.1.
  • Change the evaluate/gt.zip if you test on other datasets.
  • Modify the parameters in eval.py and run:
CUDA_VISIBLE_DEVICES=0 python eval.py
Owner
I AM IRON MAN
Learn computer graphics by writing GPU shaders!

This repo contains a selection of projects designed to help you learn the basics of computer graphics. We'll be writing shaders to render interactive two-dimensional and three-dimensional scenes.

Eric Zhang 1.9k Jan 02, 2023
This project modify tensorflow object detection api code to predict oriented bounding boxes. It can be used for scene text detection.

This is an oriented object detector based on tensorflow object detection API. Most of the code is not changed except for those related to the need of

Dafang He 30 Oct 22, 2022
Memory tests solver with using OpenCV

Human Benchmark project This project is OpenCV based programs which are puzzle solvers for 7 different games for https://humanbenchmark.com/. made as

Bahadır Araz 24 Dec 27, 2022
Using Opencv ,based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching

Using Opencv ,this project is based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching ,it will just mask that image . This project ,if used in cctv

1 Feb 13, 2022
Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries

Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries

Sergio Díaz Fernández 1 Jan 13, 2022
A python program to block out your face

Readme This is a small program I threw together in about 6 hours to block out your face. It probably doesn't work very well, so be warned. By default,

1 Oct 17, 2021
Creating a virtual tv using opencv in python3.

Virtual-TV Creating a virtual tv using opencv in python3. In order to run the code follow the below given steps: Make sure the desired videos which ar

Vamsi 1 Jan 01, 2022
In this project we will be using the live feed coming from the webcam to create a virtual mouse with complete functionalities.

Virtual Mouse Using OpenCV In this project we will be using the live feed coming from the webcam to create a virtual mouse using hand tracking. Projec

Hassan Shahzad 8 Dec 20, 2022
An interactive interface for using OpenCV's GrabCut algorithm for image segmentation.

Interactive GrabCut An interactive interface for using OpenCV's GrabCut algorithm for image segmentation. Setup Install dependencies: pip install nump

Jason Y. Zhang 16 Oct 10, 2022
docstrum

Docstrum Algorithm Getting Started This repo is for developing a Docstrum algorithm presented by O’Gorman (1993). Disclaimer This source code is built

Chulwoo Mike Pack 54 Dec 13, 2022
Framework for the Complete Gaze Tracking Pipeline

Framework for the Complete Gaze Tracking Pipeline The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1].

Pascal 20 Jan 06, 2023
Document blur detection based on Laplacian operator and text detection.

Document Blur Detection For general blurred image, using the variance of Laplacian operator is a good solution. But as for the blur detection of docum

JoeyLr 5 Oct 20, 2022
How to detect objects in real time by using Jupyter Notebook and Neural Networks , by using Yolo3

Real Time Object Recognition From your Screen Desktop . In this post, I will explain how to build a simply program to detect objects from you desktop

Ruslan Magana Vsevolodovna 2 Sep 28, 2022
A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports

A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports "with"-syntax.

Patrice Matz 0 Oct 30, 2021
⛓ marc is a small, but flexible Markov chain generator

About marc (markov chain) is a small, but flexible Markov chain generator. Usage marc is easy to use. To build a MarkovChain pass the object a sequenc

Max Humber 65 Oct 27, 2022
Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.

Total-Text-Dataset (Official site) Updated on April 29, 2020 (Detection leaderboard is updated - highlighted E2E methods. Thank you shine-lcy.) Update

Chee Seng Chan 671 Dec 27, 2022
A webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV.

Qbr Qbr, pronounced as Cuber, is a webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV. 🌈 Accurate color detection 🔍 Accurate 3x3x

Kim 金可明 502 Dec 29, 2022
Convolutional Recurrent Neural Networks(CRNN) for Scene Text Recognition

CRNN_Tensorflow This is a TensorFlow implementation of a Deep Neural Network for scene text recognition. It is mainly based on the paper "An End-to-En

MaybeShewill-CV 1000 Dec 27, 2022
Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

This repository hosts the source code of our paper: [AAAI 2021]Sequential End-to-end Network for Efficient Person Search. SeqNet achieves the state-of

Zj Li 218 Dec 31, 2022
Handwritten Text Recognition (HTR) using TensorFlow 2.x

Handwritten Text Recognition (HTR) system implemented using TensorFlow 2.x and trained on the Bentham/IAM/Rimes/Saint Gall/Washington offline HTR data

Arthur Flôr 160 Dec 21, 2022