Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector

Overview

CRAFT: Character-Region Awareness For Text detection

Downloads PyPI version Conda version CI

Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector | Paper |

Overview

PyTorch implementation for CRAFT text detector that effectively detect text area by exploring each character region and affinity between characters. The bounding box of texts are obtained by simply finding minimum bounding rectangles on binary map after thresholding character region and affinity scores.

teaser

Getting started

Installation

  • Install using conda for Linux, Mac and Windows (preferred):
conda install -c fcakyon craft-text-detector
  • Install using pip for Linux and Mac:
pip install craft-text-detector

Basic Usage

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()

Advanced Usage

# import craft functions
from craft_text_detector import (
    read_image,
    load_craftnet_model,
    load_refinenet_model,
    get_prediction,
    export_detected_regions,
    export_extra_results,
    empty_cuda_cache
)

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# read image
image = read_image(image_path)

# load models
refine_net = load_refinenet_model(cuda=True)
craft_net = load_craftnet_model(cuda=True)

# perform prediction
prediction_result = get_prediction(
    image=image,
    craft_net=craft_net,
    refine_net=refine_net,
    text_threshold=0.7,
    link_threshold=0.4,
    low_text=0.4,
    cuda=True,
    long_size=1280
)

# export detected text regions
exported_file_paths = export_detected_regions(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    output_dir=output_dir,
    rectify=True
)

# export heatmap, detection points, box visualization
export_extra_results(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    heatmaps=prediction_result["heatmaps"],
    output_dir=output_dir
)

# unload models from gpu
empty_cuda_cache()
You might also like...
TextBoxes++: A Single-Shot Oriented Scene Text Detector

TextBoxes++: A Single-Shot Oriented Scene Text Detector Introduction This is an application for scene text detection (TextBoxes++) and recognition (CR

TextBoxes: A Fast Text Detector with a Single Deep Neural Network https://github.com/MhLiao/TextBoxes 基于SSD改进的文本检测算法,textBoxes_note记录了之前整理的笔记。

TextBoxes: A Fast Text Detector with a Single Deep Neural Network Introduction This paper presents an end-to-end trainable fast scene text detector, n

Use Convolutional Recurrent Neural Network to recognize the Handwritten line text image without pre segmentation into words or characters. Use CTC loss Function to train.
Use Convolutional Recurrent Neural Network to recognize the Handwritten line text image without pre segmentation into words or characters. Use CTC loss Function to train.

Handwritten Line Text Recognition using Deep Learning with Tensorflow Description Use Convolutional Recurrent Neural Network to recognize the Handwrit

Code related to "Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity" paper

DataTuner You have just found the DataTuner. This repository provides tools for fine-tuning language models for a task. See LICENSE.txt for license de

This can be use to convert text in a file to handwritten text.

TextToHandwriting This can be used to convert text to handwriting. Clone this project or download the code. Run TextToImage.py give the filename of th

python ocr using tesseract/ with EAST opencv detector

pytextractor python ocr using tesseract/ with EAST opencv text detector Uses the EAST opencv detector defined here with pytesseract to extract text(de

Augmenting Anchors by the Detector Itself
Augmenting Anchors by the Detector Itself

Augmenting Anchors by the Detector Itself Introduction It is difficult to determine the scale and aspect ratio of anchors for anchor-based object dete

Motion detector, Full body detection, Upper body detection, Cat face detection, Smile detection, Face detection (haar cascade), Silverware detection, Face detection (lbp), and Sending email notifications
Motion detector, Full body detection, Upper body detection, Cat face detection, Smile detection, Face detection (haar cascade), Silverware detection, Face detection (lbp), and Sending email notifications

Security camera running OpenCV for object and motion detection. The camera will send email with image of any objects it detects. It also runs a server that provides web interface with live stream video.

Comments
  • Add more options for detect_text method

    Add more options for detect_text method

    Hi, sometime I don't want detect_text from file, I want detect_text directly from image in ndarray format, that will save more cost of I/O time. So I contribute this. Thanks for your work

    opened by ducviet00 2
  • Enable package to load model from local path

    Enable package to load model from local path

    When using the pypi package it should be allowed to use a model from a local path, because loading it from a remote location removes the control over what model is currently used. And might also result in pull limits being reached.

    enhancement 
    opened by TanjaBayer 1
  • Fix #8 - Fixing cuda issues in basic usage text detection

    Fix #8 - Fixing cuda issues in basic usage text detection

    Fixing issue #8

    In this quick-fix I referenced craft_net as a global variable. If this is not an acceptable workaround, then consider reorganizing the structure of the code.

    Have a nice day :)

    opened by gaborpelesz 1
  • accept customized weights path when loading models

    accept customized weights path when loading models

    path for the weight file can be specified by:

    load_craftnet_model(weight_path="path/to/weight")
    
    load_refinenet_model(weight_path="path/to/weight")
    
    opened by fcakyon 0
Releases(0.4.3)
  • 0.4.3(May 9, 2022)

    What's Changed

    • Enable package to load model from local path by @TanjaBayer in https://github.com/fcakyon/craft-text-detector/pull/53

    New Contributors

    • @TanjaBayer made their first contribution in https://github.com/fcakyon/craft-text-detector/pull/53

    Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.2...0.4.3

    Source code(tar.gz)
    Source code(zip)
  • 0.4.2(Jan 6, 2022)

    What's Changed

    • fix opencv version by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/48

    Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.1...0.4.2

    Source code(tar.gz)
    Source code(zip)
  • 0.4.1(Dec 20, 2021)

    What's Changed

    • fix crop export by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/45

    Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.0...0.4.1

    Source code(tar.gz)
    Source code(zip)
  • 0.4.0(Jul 30, 2021)

  • 0.3.5(May 12, 2021)

  • 0.3.4(Apr 7, 2021)

    • add support for PIL and numpy images in addition to filepath. https://github.com/fcakyon/craft-text-detector/pull/28
    from PIL import Image
    import numpy
    
    # can be filepath, PIL image or numpy array
    image = 'figures/idcard.png' 
    image = Image.open("figures/idcard.png")
    image = numpy.array(Image.open("figures/idcard.png"))
    
    # apply craft text detection
    prediction_result = craft.detect_text(image)
    Source code(tar.gz)
    Source code(zip)
  • 0.3.3(Mar 2, 2021)

  • 0.3.2(Mar 2, 2021)

    path for the weight file can be specified by:

    load_craftnet_model(weight_path="path/to/weight")
    
    load_refinenet_model(weight_path="path/to/weight")
    
    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(May 14, 2020)

    • updated basic usage for better device handling, now Craft instance should be created before calling detect_text:
    # import Craft class
    from craft_text_detector import Craft
    
    # set image path and export folder directory
    image_path = 'figures/idcard.png'
    output_dir = 'outputs/'
    
    # create a craft instance
    craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)
    
    # apply craft text detection and export detected regions to output directory
    prediction_result = craft.detect_text(image_path)
    
    # unload models from ram/gpu
    craft.unload_craftnet_model()
    craft.unload_refinenet_model()
    
    • some internal naming and styling changes
    Source code(tar.gz)
    Source code(zip)
  • v0.2.1(May 10, 2020)

  • v0.2.0a(Apr 22, 2020)

  • v0.2.0(Apr 22, 2020)

Owner
Senior Machine Learning Engineer, METU & Bilkent alum.
Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)

Optical Character Recognition OCR (Optical Character Recognition) is a technology that enables the conversion of document types such as scanned paper

21 Dec 25, 2022
Optical character recognition for Japanese text, with the main focus being Japanese manga

Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran

Maciej Budyś 327 Jan 01, 2023
A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).

OCR Resources This repository contains a collection of resources (including the papers and datasets) of OCR (Optical Character Recognition). Contents

Zuming Huang 363 Jan 03, 2023
Erosion and dialation using structure element in OpenCV python

Erosion and dialation using structure element in OpenCV python

Tamzid hasan 2 Nov 11, 2021
A real-time dolly zoom camera effect

Dolly-Zoom I've always been amazed by the gradual perspective change of dolly zoom, and I have some experience in python and OpenCV, so I decided to c

Dylan Kai Lau 52 Dec 08, 2022
Provides OCR (Optical Character Recognition) services through web applications

OCR4all As suggested by the name one of the main goals of OCR4all is to allow basically any given user to independently perform OCR on a wide variety

174 Dec 31, 2022
Drowsiness Detection and Alert System

A countless number of people drive on the highway day and night. Taxi drivers, bus drivers, truck drivers, and people traveling long-distance suffer from lack of sleep.

Astitva Veer Garg 4 Aug 01, 2022
Polaris is a Face recognition attendance system .

Support Me 🚀 About Polaris 📄 Polaris is a system based on facial recognition with a futuristic GUI design, Can easily find people informations store

XN3UR0N 215 Dec 26, 2022
MXNet OCR implementation. Including text recognition and detection.

insightocr Text Recognition Accuracy on Chinese dataset by caffe-ocr Network LSTM 4x1 Pooling Gray Test Acc SimpleNet N Y Y 99.37% SE-ResNet34 N Y Y 9

Deep Insight 99 Nov 01, 2022
[python3.6] 运用tf实现自然场景文字检测,keras/pytorch实现ctpn+crnn+ctc实现不定长场景文字OCR识别

本文基于tensorflow、keras/pytorch实现对自然场景的文字检测及端到端的OCR中文文字识别 update20190706 为解决本项目中对数学公式预测的准确性,做了其他的改进和尝试,效果还不错,https://github.com/xiaofengShi/Image2Katex 希

xiaofeng 2.7k Dec 25, 2022
SRA's seminar on Introduction to Computer Vision Fundamentals

Introduction to Computer Vision This repository includes basics to : Python Numpy: A python library Git Computer Vision. The aim of this repository is

Society of Robotics and Automation 147 Dec 04, 2022
learn how to use Gesture Control to change the volume of a computer

Volume-Control-using-gesture In this project we are going to learn how to use Gesture Control to change the volume of a computer. We first look into h

Diwas Pandey 49 Sep 22, 2022
Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

This is the official implementation of "Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation". For more details, please

Pengyuan Lyu 309 Dec 06, 2022
A facial recognition program that plays a alarm (mp3 file) when a person i seen in the room. A basic theif using Python and OpenCV

Home-Security-Demo A facial recognition program that plays a alarm (mp3 file) when a person is seen in the room. A basic theif using Python and OpenCV

SysKey 4 Nov 02, 2021
Fully-automated scripts for collecting AI-related papers

AI-Paper-Collector Web demo: https://ai-paper-collector.vercel.app/ (recommended) Colab notebook: here Motivation Fully-automated scripts for collecti

772 Dec 30, 2022
The papers published in top-tier AI conferences in recent years.

AI-conference-papers The papers published in top-tier AI conferences in recent years. Paper table AAAI ICLR CVPR ICML ICCV ECCV NIPS 2019 ✔️ ✔️ ✔️ ✔️

Jinbae Park 6 Dec 09, 2022
Code for paper "Role-based network embedding via structural features reconstruction with degree-regularized constraint"

Role-based network embedding via structural features reconstruction with degree-regularized constraint Train python main.py --dataset brazil-flights

wang zhang 1 Jun 28, 2022
document image degradation

ocrodeg The ocrodeg package is a small Python library implementing document image degradation for data augmentation for handwriting recognition and OC

NVIDIA Research Projects 134 Nov 18, 2022
Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. This Neural Network (NN) model recognizes the text contained in the images of segmented words.

Handwritten-Text-Recognition Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. T

27 Jan 08, 2023