Source code of RRPN ---- Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Related tags

Computer VisionRRPN
Overview

Paper source

Arbitrary-Oriented Scene Text Detection via Rotation Proposals

https://arxiv.org/abs/1703.01086

News

We update RRPN in pytorch 1.0! View https://github.com/mjq11302010044/RRPN_plusplus for more details. Text Spotter f-measure results are 89.5 % in IC15, 92.0% in IC13. The testing speed can reach 13.3 fps in IC13 with input shorter size of 640px !

License

RRPN is released under the MIT License (refer to the LICENSE file for details). This project is for research purpose only, further use for RRPN should contact authors.

Citing RRPN

If you find RRPN useful in your research, please consider citing:

@article{Jianqi17RRPN,
    Author = {Jianqi Ma and Weiyuan Shao and Hao Ye and Li Wang and Hong Wang and Yingbin Zheng and Xiangyang Xue},
    Title = {Arbitrary-Oriented Scene Text Detection via Rotation Proposals},
    journal = {IEEE Transactions on Multimedia},
    volume={20}, 
    number={11}, 
    pages={3111-3122}, 
    year={2018}
}

Contents

  1. Requirements: software
  2. Requirements: hardware
  3. Basic installation
  4. Demo
  5. Beyond the demo: training and testing

Requirements: software

  1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download my Makefile.config for reference. 2. Python packages you might not have: cython, python-opencv, easydict

Requirements: hardware

  1. For training the end-to-end version of RRPN with VGG16, 4~5G of GPU memory is sufficient (using CUDNN)

Installation (sufficient for the demo)

  1. Clone the RRPN repository
# git clone https://github.com/mjq11302010044/RRPN.git
  1. We'll call the directory that you cloned RRPN into RRPN_ROOT

  2. Build the Cython modules

    cd $RRPN_ROOT/lib
    make
  3. Build Caffe and pycaffe

    cd $RRPN_ROOT/caffe-fast-rcnn
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    
    # If you're experienced with Caffe and have all of the requirements installed
    # and your Makefile.config in place, then simply do:
    make -j4 && make pycaffe
  4. Download pre-computed RRPN detectors

    Trained VGG16 model download link: https://drive.google.com/open?id=0B5rKZkZodGIsV2RJUjVlMjNOZkE
    

    Then move the model into $RRPN_ROOT/data/faster_rcnn_models.

Demo

After successfully completing basic installation, you'll be ready to run the demo.

To run the demo

cd $RRPN_ROOT
python ./tools/rotation_demo.py

The txt results will be saved in $RRPN_ROOT/result

Beyond the demo: installation for training and testing models

You can use the function get_rroidb() in $RRPN_ROOT/lib/rotation/data_extractor.py to manage your training data:

Each training sample should be managed in a python dict like:

im_info = {
	'gt_classes': # Set to 1(Only text)
	'max_classes': # Set to 1(Only text)
	'image': # image path to access
	'boxes': # ground truth box
	'flipped' : # Flip an image or not (Not implemented)
	'gt_overlaps' : # overlap of a class(text)
	'seg_areas' : # area of an ground truth region
	'height': # height of an image data
	'width': # width of an image data
	'max_overlaps' : # max overlap with each gt-proposal
	'rotated': # Random angle to rotate an image
}

Then assign your database to the variable 'roidb' in main function of $RRPN_ROOT/tools/train_net.py

116: roidb = get_rroidb("train") # change to your data manage function

Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the networks described in the paper: VGG16.

cd $RRPN_ROOT
./data/scripts/fetch_imagenet_models.sh

VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.

Then you can train RRPN by typing:

./experiment/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] rrpn

[NET] usually takes VGG16

Trained RRPN networks are saved under:(We set the directory to './' by default.)

./

One can change the directory in variable output_dir in $RRPN_ROOT/tools/train_net.py

Any question about this project please send message to Jianqi Ma([email protected]), and enjoy it!

docstrum

Docstrum Algorithm Getting Started This repo is for developing a Docstrum algorithm presented by O’Gorman (1993). Disclaimer This source code is built

Chulwoo Mike Pack 54 Dec 13, 2022
Computer vision applications project (Flask and OpenCV)

Computer Vision Applications Project This project is at it's initial phase. This is all about the implementation of different computer vision techniqu

Suryam Thapa 1 Jan 26, 2022
Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrieval.

Dual Encoding for Video Retrieval by Text Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding

81 Dec 01, 2022
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

Matt Zucker 1.2k Dec 29, 2022
[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks This is an official PyTorch code repository of the paper "Cloud Transformers:

Visual Understanding Lab @ Samsung AI Center Moscow 27 Dec 15, 2022
STEFANN: Scene Text Editor using Font Adaptive Neural Network

STEFANN: Scene Text Editor using Font Adaptive Neural Network @ The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.

Prasun Roy 208 Dec 11, 2022
かの有名なあの東方二次創作ソング、「bad apple!」のMVをPythonでやってみたって話

bad apple!! 内容 このプログラムは、bad apple!(feat. nomico)のPVをPythonを用いて再現しよう!という内容です。 実はYoutube並びにGithub上に似たようなプログラムがあったしなんならそっちの方が結構良かったりするんですが、一応公開しますw 使い方 こ

赤紫 8 Jan 05, 2023
Recognizing the text contents from a scanned visiting card

Recognizing the text contents from a scanned visiting card. The application which is used to recognize the text from scanned images,printeddocuments,r

Faizan Habib 1 Jan 28, 2022
Python library to extract tabular data from images and scanned PDFs

Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d

Org. Account 165 Dec 31, 2022
Using python libraries to track hands

Python-HandTracking Using python libraries to track hands on a camera Uses cv2 and mediapipe libraries custom hand tracking module PyCharm IDE Final E

Martin Matsudaira 1 Dec 17, 2021
Vietnamese Language Detection and Recognition

Table of Content Introduction (Khôi viết) Dataset (đổi link thui thành 3k5 ảnh mình) Getting Started (An Viết) Requirements Usage Example Training & E

6 May 27, 2022
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Dec 31, 2022
Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Handwritten Text Recognition with TensorFlow Update 2021: more robust model, faster dataloader, word beam search decoder also available for Windows Up

Harald Scheidl 1.5k Jan 07, 2023
make a better chinese character recognition OCR than tesseract

deep ocr See README_en.md for English installation documentation. 只在ubuntu下面测试通过,需要virtualenv安装,安装路径可自行调整: git clone https://github.com/JinpengLI/deep

Jinpeng 1.5k Dec 28, 2022
🖺 OCR using tensorflow with attention

tensorflow-ocr 🖺 OCR using tensorflow with attention, batteries included Installation git clone --recursive http://github.com/pannous/tensorflow-ocr

646 Nov 11, 2022
Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract Toolset U^2-Net is used for background removal Textcleaner is used for image cleaning

3 Jul 13, 2022
TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

FOTS: Fast Oriented Text Spotting with a Unified Network I am still working on this repo. updates and detailed instructions are coming soon! Table of

Masao Taketani 52 Nov 11, 2022
Textboxes : Image Text Detection Model : python package (tensorflow)

shinTB Abstract A python package for use Textboxes : Image Text Detection Model implemented by tensorflow, cv2 Textboxes Paper Review in Korean (My Bl

Jayne Shin (신재인) 91 Dec 15, 2022
【Auto】原神⭐钓鱼辅助工具 | 自动收竿、校准游标 | ✨您只需要抛出鱼竿,我们会帮你完成一切✨

原神钓鱼辅助工具 ✨ 作者正在努力重构代码中……会尽快带给大家一个更完美的脚本 ✨ 「您只需抛出鱼竿,然后我们会帮您搞定一切」 如果你觉得这个脚本好用,请点一个 Star ⭐ ,你的 Star 就是作者更新最大的动力 点击这里 查看演示视频 ✨ 欢迎大家在 Issues 中分享自己的配置文件 ✨ ✨

261 Jan 02, 2023
deployment of a hybrid model for automatic weapon detection/ anomaly detection for surveillance applications

Automatic Weapon Detection Deployment of a hybrid model for automatic weapon detection/ anomaly detection for surveillance applications. Loved the pro

Janhavi 4 Mar 04, 2022