PyTorch implementation of PSEnet with Pyramid Attention Network as feature extractor

Last update: Oct 10, 2022

Overview

Scene Text-Spotting based on PSEnet+CRNN

PyTorch implementation of an end-to-end text spotter that combines a PSEnet text detector with a CRNN text recognizer. We plan to grow this repository into an open research platform for multi-lingual text detection and recognition in natural scene images, targeted at low-resource languages.

Requirements

Python 3.6.5
PyTorch 1.2
pyclipper
Polygon 3.0.8
OpenCV 3.4.1
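
Before running the demo, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the import names (`torch`, `cv2`, `pyclipper`, `Polygon`) are the usual ones for these packages and are assumed here, and the helper `check_requirements` is a hypothetical name. It only checks that each module can be found; it does not verify the exact versions.

```python
import importlib.util
import sys

# Assumed import names for the packages in the Requirements list.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "pyclipper": "pyclipper",
    "Polygon": "Polygon",
    "OpenCV": "cv2",
}


def check_requirements():
    """Return a dict mapping each package name to True if its module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in REQUIRED_MODULES.items()
    }


if __name__ == "__main__":
    # Print the interpreter version, then one line per required package.
    print("Python", sys.version.split()[0])
    for name, found in check_requirements().items():
        print(name, "found" if found else "MISSING")
```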

Demo

Download the trained CRNN and PSEnet models from the link provided below.
Copy the paths of the downloaded models into params.py.
Run end-end.py:

python end-end.py --img [path to image] --e2e_config_name [end to end config name]
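
To run the demo from another Python script, the command above can be wrapped with `subprocess`. This is a minimal sketch under stated assumptions: the helpers `build_demo_command` and `run_demo` are hypothetical and not part of the repository, and the script is assumed to be invoked from the repository root so that `end-end.py` resolves. The `--img` and `--e2e_config_name` flags are the ones shown in the command above.

```python
import subprocess
import sys
from pathlib import Path


def build_demo_command(image_path, e2e_config_name, script="end-end.py"):
    """Build the argument list for the demo command (hypothetical helper)."""
    return [
        sys.executable,  # use the same interpreter that runs this script
        script,
        "--img", str(image_path),
        "--e2e_config_name", e2e_config_name,
    ]


def run_demo(image_path, e2e_config_name):
    """Run end-end.py on one image; fail early if the image file is missing."""
    if not Path(image_path).is_file():
        raise FileNotFoundError(image_path)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(build_demo_command(image_path, e2e_config_name), check=True)
```

The image path and config name passed to `run_demo` are whatever you would otherwise type on the command line; the valid config names depend on what the repository defines.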

Pre-trained Models

Both PSEnet and CRNN pre-trained models can be found here: gdrive

The PSEnet model is a multi-lingual text detector trained on MLT 2019. Works quite well!
The CRNN model recognizes Hindi, Bangla, Malayalam, Kannada, Tamil, Telugu, Odia, Sanskrit, and Marathi.

Download the models into the models/ directory and update the paths in params.py if required.

Training instructions

To train your own detection model refer to this file.
To train your own recognition model refer to this file.

Samples

Contributors

Azhar Shaikh, PES University LinkedIn
Nishant Sinha, OffNote Labs

Work done as part of an internship with OffNote Labs.

References

If this repository helps you, please star it. Thank you!
