Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Last update: Aug 07, 2022


Overview

Convolutional Recurrent Neural Network + CTCLoss | STAR-Net

Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition"

Dependencies

Python 3.6.5
torch==1.2.0
torchvision==0.4.0
tensorboard==2.3.0

How to run the code?

Prepare data

Follow the instructions in meijieru/crnn.pytorch to create the LMDB datasets. Use the same steps to create both the training and the validation data.
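As a rough illustration of what the LMDB creation step produces, here is a minimal sketch. It is not the script from meijieru/crnn.pytorch: the helper names `build_lmdb_cache` and `write_lmdb` are hypothetical, and the key layout (`image-%09d`, `label-%09d`, `num-samples`) is an assumption about that repository's convention, so check it against the upstream code before relying on it.

```python
def build_lmdb_cache(samples):
    """Map (image_bytes, label) pairs to LMDB key/value pairs.

    Assumption: keys follow the image-%09d / label-%09d / num-samples
    layout that the crnn.pytorch dataset reader is believed to expect.
    """
    cache = {}
    for index, (image_bytes, label) in enumerate(samples, start=1):
        cache[("image-%09d" % index).encode()] = image_bytes
        # Labels are stored as UTF-8 so non-Latin scripts survive the round trip.
        cache[("label-%09d" % index).encode()] = label.encode("utf-8")
    cache[b"num-samples"] = str(len(samples)).encode()
    return cache


def write_lmdb(output_path, cache, map_size=1 << 30):
    """Write the cache to an LMDB environment (needs the third-party lmdb package)."""
    import lmdb  # imported lazily so build_lmdb_cache has no dependencies

    env = lmdb.open(output_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

Calling the two helpers once with the training samples and once with the validation samples, each with its own output directory, would give the two datasets passed as --trainRoot and --valRoot.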

Change parameters and alphabets

Update the parameters and alphabets to match your dataset and language.

Change parameters in the mytrain.py file
Change alphabets

Put every character that appears in your labels in a file and pass that file to mytrain.py as --charlist; otherwise the program will throw an error during training.
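One way to build the character list and check it before training is sketched below. The helper names are hypothetical, and the one-character-per-line file layout is an assumption; adjust `write_charlist` to whatever format mytrain.py actually reads.

```python
def collect_alphabet(labels):
    """Return the sorted list of unique characters used in the labels."""
    return sorted({ch for label in labels for ch in label})


def find_missing_characters(labels, charlist):
    """Return label characters that are absent from the character list."""
    known = set(charlist)
    return sorted({ch for label in labels for ch in label if ch not in known})


def write_charlist(path, alphabet):
    """Write one character per line (assumed format, not confirmed by the source)."""
    with open(path, "w", encoding="utf-8") as handle:
        for ch in alphabet:
            handle.write(ch + "\n")
```

Running `find_missing_characters` over both the training and the validation labels and confirming that it returns an empty list is a cheap check to make before starting a long training run.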

Train

Run mytrain.py:

python3 mytrain.py --trainRoot /ssd_scratch/cvit/sanjana/hindi-train-lmdb \
--valRoot /ssd_scratch/cvit/sanjana/hindi-test-lmdb \
--arch crnn --lan hindi --charlist /ssd_scratch/cvit/sanjana/crnn_new/lexicon.txt \
--batchSize 32 --nepoch 15 --cuda --expr_dir /ssd_scratch/cvit/sanjana \
--displayInterval 10 --valInterval 100 --adadelta \
--manualSeed 1234 --random_sample --deal_with_lossnan

Reference

meijieru/crnn.pytorch
Sierkinhane/crnn_chinese_characters_rec

If you use the dataset or code from this work, please add the following citation:

@inproceedings{gunnaNonLatin2021,
  title={Towards {B}oosting the {A}ccuracy of {N}on-{L}atin {S}cene {T}ext {R}ecognition},
  author={Sanjana Gunna and Rohit Saluja and C V Jawahar},
  booktitle={2021 International Conference on Document Analysis and Recognition Workshops (ICDARW)},
  year={2021},
  organization={IEEE}
}

Owner

Sanjana Gunna
