A Number Recognition algorithm

Last update: Nov 12, 2021

Paddle-VisualAttention

Results_Compared

| Methods | Steps | GPU | Batch Size | Learning Rate | Patience | Decay Step | Decay Rate | Training Speed (FPS) | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PaddlePaddle_SVHNClassifier | 54000 | GTX 1080 Ti | 1024 | 0.01 | 100 | 625 | 0.9 | ~1700 | 95.65% |
| Pytorch_SVHNClassifier | 54000 | GTX 1080 Ti | 512 | 0.16 | 100 | 625 | 0.9 | ~1700 | 95.65% |
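
To make the Learning Rate, Decay Step and Decay Rate columns concrete, here is a minimal sketch of a staircase exponential decay. It assumes the rate is multiplied by the decay rate once every decay step; the README does not state the exact schedule, so the function name `staircase_lr` and the staircase behaviour are illustrative assumptions, with defaults taken from the PaddlePaddle row.

```python
def staircase_lr(step, base_lr=0.01, decay_steps=625, decay_rate=0.9):
    """Learning rate after `step` updates under an assumed staircase decay.

    Defaults mirror the PaddlePaddle row of the table; the schedule shape
    itself is an assumption, not something the README confirms.
    """
    # Integer division keeps the rate constant inside each 625-step window.
    return base_lr * decay_rate ** (step // decay_steps)


if __name__ == "__main__":
    for step in (0, 625, 6250, 54000):
        print(f"step {step:>6}: lr = {staircase_lr(step):.6f}")
```

Under this assumption, 54000 steps cover 86 complete decay windows, so the final rate is the base rate times 0.9 to the power 86.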

Introduction

The main idea of this exercise is to study the evolution of the state of the art and the main work on visual attention models. Two datasets are studied: augmented MNIST and SVHN. The former focuses on a canonical problem, handwritten digit recognition, but with clutter and translation; the latter focuses on a real-world problem, street view house number (SVHN) transcription. In this exercise, the papers on this topic are studied in order to develop a good intuition for choosing a proper model to tackle each of these challenges.

For more details, please refer to this blog.

Recommended environment

Python 3.6+
paddlepaddle-gpu 2.0.2
nccl 2.0+
editdistance
visdom
h5py
protobuf
lmdb

Install

Install env

Install paddle following the official tutorial.

pip install editdistance
pip install visdom
pip install h5py
pip install protobuf
pip install lmdb

Dataset

Download SVHN Dataset format 1

Extract the archives into the data folder. Your folder structure should then look like this:

SVHNClassifier
    - data
        - extra
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - test
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - train
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
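
Before running the conversion, it may help to confirm that the extracted files match the layout above. The following is a small sketch using only the standard library; the helper name `missing_svhn_entries` is hypothetical and not part of this repository.

```python
from pathlib import Path


def missing_svhn_entries(data_dir):
    """Return the expected entries that are absent under `data_dir`."""
    root = Path(data_dir)
    missing = []
    for split in ("train", "test", "extra"):
        split_dir = root / split
        if not split_dir.is_dir():
            missing.append(str(split_dir))
            continue
        # Each split needs its annotation file and at least one image.
        if not (split_dir / "digitStruct.mat").is_file():
            missing.append(str(split_dir / "digitStruct.mat"))
        if not any(split_dir.glob("*.png")):
            missing.append(str(split_dir / "*.png"))
    return missing


if __name__ == "__main__":
    problems = missing_svhn_entries("./data")
    print("layout OK" if not problems else "missing: " + ", ".join(problems))
```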

Usage

(Optional) Take a glance at original images with bounding boxes
```
Open `draw_bbox.ipynb` in Jupyter
```

Convert to LMDB format

$ python convert_to_lmdb.py --data_dir ./data
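
The README does not describe how labels are stored during conversion. As a rough illustration only, the sketch below shows one common encoding for multi-digit SVHN labels: a sequence length plus a fixed number of digit slots padded with a "no digit" class. The five-slot cap, the padding value 10, and the name `encode_label` are assumptions and may differ from what `convert_to_lmdb.py` does. The one fixed fact is that `digitStruct.mat` in SVHN format 1 stores the digit 0 as label 10.

```python
MAX_DIGITS = 5  # assumed cap on the number of digits per image
PAD = 10        # assumed class index meaning "no digit in this slot"


def encode_label(raw_labels):
    """Turn digitStruct labels into (length, padded digit list).

    Returns None for sequences longer than MAX_DIGITS, which this sketch
    simply skips.
    """
    # SVHN format 1 writes the digit 0 as label 10; map it back to 0.
    digits = [0 if int(label) == 10 else int(label) for label in raw_labels]
    if len(digits) > MAX_DIGITS:
        return None
    padded = digits + [PAD] * (MAX_DIGITS - len(digits))
    return len(digits), padded


if __name__ == "__main__":
    print(encode_label([1, 10, 5]))  # house number "105"
```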

(Optional) Test for reading LMDBs

Open `read_lmdb_sample.ipynb` in Jupyter

Train

$ python train.py --data_dir ./data --logdir ./logs

Retrain if needed

$ python train.py --data_dir ./data --logdir ./logs_retrain --restore_checkpoint ./logs/model-100.pth

Evaluate

$ python eval.py --data_dir ./data ./logs/model-100.pth
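
The Accuracy column in the results table is not defined in this README. A plausible reading for house number transcription is whole-sequence accuracy, where a prediction counts only if every digit matches. The sketch below illustrates that metric under this assumption; `sequence_accuracy` is a hypothetical helper, not the code in `eval.py`.

```python
def sequence_accuracy(predictions, targets):
    """Fraction of samples whose predicted number equals the target exactly."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    # A sample is correct only when the whole digit string matches.
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


if __name__ == "__main__":
    print(sequence_accuracy(["123", "45", "7"], ["123", "46", "7"]))
```

A per-digit metric would score the second sample as half right, whereas this one scores it as wrong.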

Visualize

$ python -m visdom.server
$ python visualize.py --logdir ./logs

Infer

$ python infer.py --checkpoint=./logs/model-100.pth ./images/test1.png
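
How `infer.py` reports its result is not shown here. If the model predicts a sequence length together with a fixed number of digit slots (an assumption, as is the helper name `decode_prediction`), turning that output into a readable house number could look like this sketch:

```python
def decode_prediction(length, digits):
    """Join the first `length` predicted digits into a number string.

    Assumes `digits` holds one class index per slot, and that slots beyond
    `length` are padding to be ignored.
    """
    if length < 0 or length > len(digits):
        raise ValueError("length is outside the range of available slots")
    return "".join(str(d) for d in digits[:length])


if __name__ == "__main__":
    print(decode_prediction(3, [1, 0, 5, 10, 10]))
```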

Clean

$ rm -rf ./logs
or
$ rm -rf ./logs_retrain
