This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

Overview

EAST: An Efficient and Accurate Scene Text Detector

Description:

This version will be updated soon, please pay attention to this work. The motivation of this version is to build a easy-training model. This version can automatically update best_model by comparing current hmean and the former. At the same time, we can see evaluation info about every sample easily.

  • 1.train
  • 2.predict
  • 3.compress
  • 4.compute Hmean(if Hmean is higher than before, update best_weight.pkl)
  • 5.visualization(blue, green, red)
  • 6.multi-scale test (update soon) multi-scale vis. (vis with score, scales)

Thanks

The version is ported from argman/EAST, from Tensorflow to Pytorch

Check On Website

If you have no confidence of the result of our program, you could use submit.zip to submit on website,then you can see result of every image.

Performance

  • right -- green || wrong -- red || miss -- blue visualization visualization

  • recall/precision/hmean for every test image hmean

Introduction

This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector. The features are summarized blow:

  • Only RBOX part is implemented.
  • A fast Locality-Aware NMS in C++ provided by the paper's author.(g++/gcc version 6.0 + will be ok)
  • Evalution see here for the detailed results.
  • Differences from original paper
    • Use ResNet-50 rather than PVANET
    • Use dice loss (optimize IoU of segmentation) rather than balanced cross entropy
    • Use linear learning rate decay rather than staged learning rate decay

Thanks for the author's (@zxytim) help! Please cite his paper if you find this useful.

Contents

  1. Installation
  2. Download
  3. Prepare dataset/pretrain
  4. Test
  5. Train
  6. Examples

Installation

  1. Any version of pytorch version > 0.4.0 should be ok.

Download

  1. Pretrained model is not provided temporarily. Web site is updating now, please continue to pay attention

Prepare dataset/pretrain weight

[1]. dataset(you need to prepare for dataset for train and test) suggestions: you could do a soft-link to root_to_this_program/dataset/train/img/*.jpg

  • -- train ./dataset/train/img/img_###.jpg ./dataset/train/gt/img_###.txt (you need to change name)
  • -- test ./data/test/img_###.jpg (img only)
  • -- gt.zip ./result/gt.zip(ICDAR15 gt.zip is avaliable on website

** Note: you can download dataset here

[2]. pretrained

  • In config.py set resume True and set checkpoint path/to/weight/file
  • I will provide pretrianed weight soon

[3]. check GPUs and CPUs you can use following to check aviliable gpu, this is for train

watch -n 0.1 nvidia-smi

then, you will see 2,3 is avaliable, modify config.py gpu_ids = [0,1], gpu = 2, and modify run.sh - CUDA_VISIBLE_DEVICES=2,3

Train

If you want to train the model, you should provide the dataset path in config.py and run

sh run.py

** Note: you should modify run.sh to specify your gpu id

If you have more than one gpu, you can pass gpu ids to gpu_list(like gpu_list=0,1,2,3) in config.py

** Note: you should change the gt text file of icdar2015's filename to img_*.txt instead of gt_img_*.txt(or you can change the code in icdar.py), and some extra characters should be removed from the file. See the examples in training_samples/**

Test

By default, we set train-eval process into integer. If you want to use eval independently, you can do it by yourself. Any question can contact me.

Examples

Here are some test examples on icdar2015, enjoy the beautiful text boxes! image_1 image_2 image_3 image_4 image_5

Owner
Dejia Song
Computer Vision & Machine Learning
Dejia Song
Play the Namibian game of Owela against a terrible AI. Built using Django and htmx.

Owela Club A Django project for playing the Namibian game of Owela against a dumb AI. Built following the rules described on the Mancala World wiki pa

Adam Johnson 18 Jun 01, 2022
Handwritten Character Recognition using CNN

Handwritten Character Recognition using CNN Problem Definition The main objective of this project is to solve the problem of handwritten character rec

Mohit Kaushik 4 Mar 02, 2022
Application that instantly translates sign-language to letters.

Sign Language Translator Project Description The main purpose of project is translating sign-language to letters. In accordance with this purpose we d

3 Sep 29, 2022
Zoom , GoogleMeets에서 Vtuber 데뷔하기

EasyVtuber Facial landmark와 GAN을 이용한 Character Face Generation Google Meets, Zoom 등에서 자신만의 웹툰, 만화 캐릭터로 대화해보세요! 악세사리는 어느정도 추가해도 잘 작동해요! 안타깝게도 RTX 2070

Gunwoo Han 140 Dec 23, 2022
Script para controlar o movimento do mouse usando Python e openCV com câmera em tempo real que detecta pontos de referência da mão, rastreia padrões de gestos em vez de um mouse físico.

mouserController Script para controlar o movimento do mouse usando Python e openCV com câmera em tempo real que detecta pontos de referência da mão, r

Vinícius Azevedo 6 Jun 28, 2022
Deep LearningImage Captcha 2

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 117 Dec 28, 2022
docstrum

Docstrum Algorithm Getting Started This repo is for developing a Docstrum algorithm presented by O’Gorman (1993). Disclaimer This source code is built

Chulwoo Mike Pack 54 Dec 13, 2022
Msos searcher - A half-hearted attempt at finding a magic square of squares

MSOS searcher A half-hearted attempt at finding (or rather searching) a MSOS (Magic Square of Squares) in the spirit of the Parker Square. Running I r

Niels Mündler 1 Jan 02, 2022
This repository summarized computer vision theories.

This repository summarized computer vision theories.

3 Feb 04, 2022
Random maze generator and solver

Maze Generator and Solver I wrote a maze generator that works with two commonly known algorithms: Depth First Search and Randomized Prims. Both of the

Daniel Pérez 10 Sep 23, 2022
OpenCVを用いたカメラキャリブレーションのサンプルです。2021/06/21時点でPython実装のある3種類(通常カメラ向け、魚眼レンズ向け(fisheyeモジュール)、全方位カメラ向け(omnidirモジュール))について用意しています。

OpenCV-CameraCalibration-Example FishEyeCameraCalibration.mp4 OpenCVを用いたカメラキャリブレーションのサンプルです 2021/06/21時点でPython実装のある以下3種類について用意しています。 通常カメラ向け 魚眼レンズ向け(

KazuhitoTakahashi 34 Nov 17, 2022
virtual mouse which can copy files, close tabs and many other features !

AI Virtual Mouse Controller Developed an AI-based system to control the mouse cursor using Python and OpenCV with the real-time camera. Fingertip loca

Diwas Pandey 23 Oct 05, 2021
Image processing is one of the most common term in computer vision

Image processing is one of the most common term in computer vision. Computer vision is the process by which computers can understand images and videos, and how they are stored, manipulated, and retri

Happy N. Monday 3 Feb 15, 2022
ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones and fixes.

ScanTailor Advanced The ScanTailor version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones and f

952 Dec 31, 2022
CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

简介 基于Tensorflow和Keras实现端到端的不定长中文字符检测和识别 文本检测:CTPN 文本识别:DenseNet + CTC 环境部署 sh setup.sh 注:CPU环境执行前需注释掉for gpu部分,并解开for cpu部分的注释 Demo 将测试图片放入test_images

Yang Chenguang 2.6k Dec 29, 2022
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 47k Jan 07, 2023
Steve Tu 71 Dec 30, 2022
Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

This repository hosts the source code of our paper: [AAAI 2021]Sequential End-to-end Network for Efficient Person Search. SeqNet achieves the state-of

Zj Li 218 Dec 31, 2022
A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well

ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy

Hiromu OCHIAI 541 Dec 28, 2022
"Very simple but works well" Computer Vision based ID verification solution provided by LibraX.

ID Verification by LibraX.ai This is the first free Identity verification in the market. LibraX.ai is an identity verification platform for developers

LibraX.ai 46 Dec 06, 2022