PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

Last update: Jul 20, 2022

Overview

This is an unofficial implementation of CDistNet.

All modules described in the paper are implemented except TPS in the visual branch. For a TPS implementation, refer to ASTER.

Requirements

Python 3.6.8
lmdb==0.98
torch==1.5.1
torchvision==0.6.1
Pillow==6.1.0
opencv-python==4.2.0.32
numpy==1.17.1

Data preparation

A tool for converting a raw dataset to an LMDB dataset is provided; see tools/create_lmdb_dataset.py for details.

You can also download LMDB datasets from OCR_Dataset.
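
For orientation, here is a minimal sketch of how image/label pairs are commonly packed into LMDB for scene-text recognition. It is not taken from tools/create_lmdb_dataset.py: the key layout (image-000000001, label-000000001, num-samples) is an assumption borrowed from other text recognition repositories, and the function names build_lmdb_cache and write_lmdb are hypothetical. Check the actual tool before relying on this layout.

```python
def build_lmdb_cache(samples):
    """Turn (image_bytes, label) pairs into LMDB key/value byte pairs.

    Assumed key layout (not confirmed for this repository):
    image-000000001, label-000000001, ..., plus a num-samples counter.
    """
    cache = {}
    count = 0
    for image_bytes, label in samples:
        count += 1
        cache[f"image-{count:09d}".encode()] = image_bytes
        cache[f"label-{count:09d}".encode()] = label.encode("utf-8")
    cache[b"num-samples"] = str(count).encode()
    return cache


def write_lmdb(output_dir, cache, map_size=1 << 30):
    """Write the key/value pairs into an LMDB environment at output_dir."""
    import lmdb  # imported lazily so build_lmdb_cache works without lmdb

    env = lmdb.open(output_dir, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

A typical call would read each image file as raw bytes, pair it with its transcription, and pass the list to build_lmdb_cache before calling write_lmdb with the target directory.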

Train

First, modify the following sections in configs/cdistnet.yml:

TrainReader: path of the training LMDB dataset.
EvalReader: path of the evaluation LMDB dataset.
Global: general arguments such as image_shape and dict_file.
VisualModule: arguments of the visual branch described in the original paper.
PositionalEmbedding: arguments of the positional branch.
SemanticEmbedding: arguments of the semantic branch.
MDCDP: arguments of the MDCDP module.

python train.py -c configs/cdistnet.yml
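
Before launching a long training run, it can help to confirm that the config contains every section named above. The sketch below assumes those sections are top-level keys of the parsed YAML (for example, the dict returned by PyYAML's yaml.safe_load). The helper missing_sections, the nested key lmdb_path, and all values shown are hypothetical and only illustrate the check.

```python
REQUIRED_SECTIONS = (
    "TrainReader",
    "EvalReader",
    "Global",
    "VisualModule",
    "PositionalEmbedding",
    "SemanticEmbedding",
    "MDCDP",
)


def missing_sections(config):
    """Return the required top-level sections absent from a parsed config dict."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Hypothetical parsed config; nested keys and values are illustrative only.
config = {
    "TrainReader": {"lmdb_path": "data/train_lmdb"},
    "EvalReader": {"lmdb_path": "data/eval_lmdb"},
    "Global": {"image_shape": [3, 32, 128], "dict_file": "dict.txt"},
}

# Lists the sections that still need to be filled in.
print(missing_sections(config))
```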

Demo

Set the following arguments in configs/cdistnet.yml:

pretrain_weights: path of the model file.
infer_img: path of the input image.
is_train: set to False.

python predict.py -c configs/cdistnet.yml

TODO

Pretrained models
Test code
Comparison with the original paper on benchmarks (CUTE, IC13, IC15, IIIT5K, SVT, SVTP)
