CRAFT-Pytorch: Character Region Awareness for Text Detection Reimplementation for Pytorch

Overview

CRAFT-Reimplementation

Note: If you run into any problems, please leave a comment, or join our WeChat group. The QR code is kept up to date in issue #49.

Reimplementation: Character Region Awareness for Text Detection, reimplemented in PyTorch

Character Region Awareness for Text Detection

Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee (Submitted on 3 Apr 2019)

The full paper is available at: https://arxiv.org/pdf/1904.01941.pdf

Install Requirements:

1、PyTorch>=0.4.1
2、torchvision>=0.2.1
3、opencv-python>=3.4.2
4、check requirements.txt
5、4 NVIDIA GPUs (we use 4 NVIDIA Titan X)
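
The minimum versions listed above can be checked before training. The following is a minimal sketch, not part of this repo: the helper names `version_tuple` and `check_requirements` are hypothetical, and it assumes the packages were installed under the PyPI names `torch`, `torchvision`, and `opencv-python` (a variant such as `opencv-python-headless` would be reported as not installed).

```python
import re
from importlib import metadata  # standard library, Python 3.8+

# Minimum versions copied from the list above; keys are PyPI distribution names.
MINIMUMS = {"torch": "0.4.1", "torchvision": "0.2.1", "opencv-python": "3.4.2"}


def version_tuple(version):
    """Turn '1.13.1+cu117' into (1, 13, 1); any non-numeric suffix is ignored."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def check_requirements(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (found X)', or 'not installed'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = version_tuple(found) >= version_tuple(minimum)
        report[name] = "ok" if ok else f"too old (found {found})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}>={MINIMUMS[name]}: {status}")
```

The comparison only looks at the leading numeric part of each version string, which is enough for simple `>=` checks like the ones listed here.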

Pre-trained models:

NOTE: These are old pre-trained models. I will upload links to the models trained with the new results.
Syndata: Syndata for baidu drive || Syndata for google drive
Syndata+IC15: Syndata+IC15 for baidu drive || Syndata+IC15 for google drive
Syndata+IC13+IC17: Syndata+IC13+IC17 for baidu drive || Syndata+IC13+IC17 for google drive

Training

Note: When training on the IC15 data or the MLT data, please see the comments in data_loader.py at line 92 and lines 108-112.

Train for Syndata

  • Download the Syndata (I will give the link).
  • Change the path in the basenet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth). You can download the model here: baidu || google

  • Change the paths in the trainSyndata.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、/data/CRAFT-pytorch/synweights/synweights -> /your_path/real_weights)

  • Run python trainSyndata.py
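
Because the paths are hard-coded in the scripts, a typo only shows up once training starts. As a rough pre-flight check, a sketch like the following can confirm that the edited paths exist. The helper `missing_paths` and the labels in `REQUIRED` are hypothetical and not part of this repo; the `/your_path/...` values are the same placeholders used in the steps above and must be replaced with real locations.

```python
from pathlib import Path


def missing_paths(required):
    """Return the labels of all entries whose path does not exist on disk."""
    return [label for label, path in required.items() if not Path(path).exists()]


# Placeholder paths taken from the steps above; replace them with your own.
REQUIRED = {
    "VGG16-BN weights": "/your_path/vgg16_bn-6c64b313.pth",
    "SynthText data": "/your_path/SynthText",
    "weights output directory": "/your_path/real_weights",
}

if __name__ == "__main__":
    for label in missing_paths(REQUIRED):
        print(f"missing: {label} -> {REQUIRED[label]}")
```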

Train for IC15 data based on the Syndata pre-trained model

  • Download the IC15 data, and rename the image file and the gt file to ch4_training_images and ch4_training_localization_transcription_gt, respectively.
  • Change the path in the basenet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth). You can download the model here: baidu || google

  • Change the data and output paths in the trainic15data.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、/data/CRAFT-pytorch/real_weights -> /your_path/real_weights)

  • Change the pre-trained model and IC15 data paths in the trainic15data.py file:

(1、/data/CRAFT-pytorch/1-7.pth -> /your_path/your_pre-trained_model_name 2、/data/CRAFT-pytorch/icdar1317 -> /your_ic15data_path/)

  • Run python trainic15data.py
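
For orientation, the gt files in ch4_training_localization_transcription_gt follow the standard ICDAR 2015 layout: one text instance per line, eight comma-separated corner coordinates followed by the transcription, with `###` marking "do not care" regions. The sketch below parses one such line. It is an illustration under that assumption, the function name `parse_ic15_line` is hypothetical, and the loader in data_loader.py may handle these files differently.

```python
def parse_ic15_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,transcription' into (polygon, text, ignore)."""
    # The gt files often start with a UTF-8 byte-order mark, so drop it if present.
    fields = line.strip().lstrip("\ufeff").split(",", 8)
    if len(fields) < 9:
        raise ValueError(f"expected 8 coordinates and a transcription: {line!r}")
    coords = [int(value) for value in fields[:8]]
    polygon = list(zip(coords[0::2], coords[1::2]))  # four (x, y) corners
    text = fields[8]  # may itself contain commas, hence maxsplit=8 above
    return polygon, text, text == "###"


if __name__ == "__main__":
    print(parse_ic15_line("377,117,463,117,465,130,378,130,Genaxis Theatre"))
```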

Train for IC13+17 data based on the Syndata pre-trained model

  • Download the MLT data and rename the image file and the gt file.
  • Change the path in the basenet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth). You can download the model here: baidu || google

  • Change the data and output paths in the trainic-MLT_data.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、savemodel path -> your savemodel path)

  • Change the pre-trained model and dataset paths in the trainic-MLT_data.py file:

(1、/data/CRAFT-pytorch/1-7.pth -> /your_path/your_pre-trained_model_name 2、/data/CRAFT-pytorch/icdar1317 -> /your_ic15data_path/)

  • Run python trainic-MLT_data.py

To train with weak supervision using our Syndata pre-trained model:

1、First download the model pre-trained on Syndata: baidu || google.
2、Change the data path and the pre-trained model path.
3、Run python trainic15data.py
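
As background for the weakly supervised step: the CRAFT paper scores each pseudo-labelled word by how well the number of characters split out by the interim model matches the known word length, s_conf(w) = (l(w) - min(l(w), |l(w) - l_c(w)|)) / l(w). When the score falls below 0.5, the paper discards the predicted character boxes, divides the word box evenly instead, and sets the confidence to 0.5. The sketch below follows that description from the paper. The function names are hypothetical, and whether this repo implements the rule in exactly this form has not been checked here.

```python
def word_confidence(word_length, predicted_chars):
    """s_conf(w) from the CRAFT paper: 1.0 when the character count matches."""
    if word_length <= 0:
        raise ValueError("word_length must be positive")
    error = abs(word_length - predicted_chars)
    return (word_length - min(word_length, error)) / word_length


def pseudo_label_confidence(word_length, predicted_chars, threshold=0.5):
    """Return (confidence, use_predicted_boxes).

    Below the threshold the predicted character boxes are dropped, the word
    box is assumed to be divided evenly, and the confidence is set to 0.5.
    """
    confidence = word_confidence(word_length, predicted_chars)
    if confidence < threshold:
        return threshold, False
    return confidence, True


if __name__ == "__main__":
    # A 6-letter word split into 5 characters keeps the predicted boxes.
    print(pseudo_label_confidence(6, 5))
    # A 4-letter word split into 10 characters falls back to even division.
    print(pseudo_label_confidence(4, 10))
```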

This code supports Syndata and ICDAR 2015; we will release the training code for IC13 and IC17 as soon as possible.

| Methods | Dataset | Recall | Precision | H-mean |
| --- | --- | --- | --- | --- |
| Syndata | ICDAR13 | 71.93% | 81.31% | 76.33% |
| Syndata+IC15 | ICDAR15 | 76.12% | 84.55% | 80.11% |
| Syndata+MLT(deteval) | ICDAR13 | 86.81% | 95.28% | 90.85% |
| Syndata+MLT(deteval)(new gaussian map method) | ICDAR13 | 90.67% | 94.56% | 92.57% |
| Syndata+IC15(new gaussian map method) | ICDAR15 | 80.36% | 84.25% | 82.26% |
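
The H-mean column is the harmonic mean of recall and precision, H = 2PR / (P + R). The short sketch below recomputes it from the recall and precision values reported above; the function name `h_mean` is only illustrative.

```python
def h_mean(recall, precision):
    """Harmonic mean of recall and precision (both in the same unit, e.g. %)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (method, recall %, precision %) copied from the results above.
ROWS = [
    ("Syndata", 71.93, 81.31),
    ("Syndata+IC15", 76.12, 84.55),
    ("Syndata+MLT(deteval)", 86.81, 95.28),
    ("Syndata+MLT(deteval)(new gaussian map method)", 90.67, 94.56),
    ("Syndata+IC15(new gaussian map method)", 80.36, 84.25),
]

if __name__ == "__main__":
    for method, recall, precision in ROWS:
        print(f"{method}: {h_mean(recall, precision):.2f}%")
```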

We have released the latest code with the new Gaussian map and random crop algorithms.

Note: With the new Gaussian map method, the Gaussian region score map predicted at inference can be split.
Sample:
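
As general background, the region score map in CRAFT is built by placing a 2D Gaussian over each character box. The sketch below shows a simplified, axis-aligned version of that idea using only the standard library. It is not this repo's new Gaussian map method: the paper warps the Gaussian onto each character quadrilateral with a perspective transform, and the names `gaussian_map`, `paste_max`, and the `sigma_ratio` parameter are assumptions made for illustration.

```python
import math


def gaussian_map(height, width, sigma_ratio=0.25):
    """Return a height x width grid with a Gaussian peak of 1.0 at the centre."""
    cy, cx = (height - 1) / 2, (width - 1) / 2
    sy, sx = max(height * sigma_ratio, 1e-6), max(width * sigma_ratio, 1e-6)
    return [
        [math.exp(-0.5 * (((y - cy) / sy) ** 2 + ((x - cx) / sx) ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def paste_max(score_map, patch, top, left):
    """Merge a patch into score_map in place, keeping the larger value per pixel."""
    for dy, row in enumerate(patch):
        for dx, value in enumerate(row):
            y, x = top + dy, left + dx
            if 0 <= y < len(score_map) and 0 <= x < len(score_map[0]):
                score_map[y][x] = max(score_map[y][x], value)


# Two 5x5 "character boxes" on a 6x12 canvas, separated by one empty column.
score = [[0.0] * 12 for _ in range(6)]
paste_max(score, gaussian_map(5, 5), top=0, left=0)
paste_max(score, gaussian_map(5, 5), top=0, left=6)

if __name__ == "__main__":
    for row in score:
        print(" ".join(f"{value:.2f}" for value in row))
```

A smaller `sigma_ratio` makes each peak narrower, which leaves a deeper valley between neighbouring characters in the score map.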

Note: We have solved the problem of detecting large words and are now training the model. Any issues or advice are welcome.

Sample:

WeChat QR code

Contributing to the project

We will release the training code as soon as possible. We have not yet reached the results reported in the authors' paper. Pull requests and issues are welcome, and we would appreciate any advice on the project.

Acknowledgement

Thanks to Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, and Hwalsuk Lee for their excellent work and for the test code. This repo uses the basenet and test code from the authors' repo.

License

For commercial use, please contact us.
