Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

This repository contains the setup for all experiments performed in our paper ... It is meant to be used together with the RL environment text-localization-environment, which is linked as a submodule. After cloning, run git submodule init and git submodule update, then follow the installation instructions of that repository.

The project is configured with Hydra; the configuration files are in the cfg folder.

Training

We use RLLib as the RL framework. To train the model, run rllib_train.py.

Every value in the cfg folder can be overridden from the command line by passing it as an argument that uses its dotted path in the configuration hierarchy (e.g. data.path=/data). The folder data contains templates for different dataset configurations.
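
To illustrate how a dotted override such as data.path=/data addresses a nested configuration value, here is a minimal sketch in plain Python. It is not Hydra's implementation: the function apply_override and the sample cfg dictionary are hypothetical and only mirror parameters listed in this README.

```python
from typing import Any, Dict


def apply_override(cfg: Dict[str, Any], override: str) -> Dict[str, Any]:
    """Set a nested value from a 'dotted.key=value' string (illustration only)."""
    dotted_key, _, raw_value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        # Walk (or create) one level of the hierarchy per dotted segment.
        node = node.setdefault(name, {})
    # This sketch keeps the value as a string and does no type parsing.
    node[leaf] = raw_value
    return cfg


# Hypothetical config mirroring two defaults from this README.
cfg = {"data": {"dataset": "icdar2013", "path": "/data/ICDAR2013"}}
apply_override(cfg, "data.path=/data")
print(cfg["data"]["path"])
```

On the command line, the same override would be appended to the training call, for example python rllib_train.py data.path=/data.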

Here are explanations for a few example parameters.

Parameter	Description	Default
neptune.offline	disables logging to neptune.ai	true
training.iterations	number of training iterations	5000
training.epsilon.decay_steps	number of steps over which exploration (epsilon) decays	300000
data.dataset	dataset type	icdar2013
data.path	path to dataset	/data/ICDAR2013
data.json_path	path to json file of data (for SynthText)	null
data.eval_path	path to evaluation dataset	/data/ICDAR2013
data.eval_gt_file	gt zip file for IC13/IC15/TIoU eval scripts	icdar13_gt.zip
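
The parameter training.epsilon.decay_steps controls how long exploration lasts. The sketch below assumes a linear epsilon-greedy schedule, which is a common choice but is not confirmed by this README. The start and end values (1.0 and 0.02) and the function name linear_epsilon are hypothetical.

```python
def linear_epsilon(step: int, decay_steps: int = 300_000,
                   initial: float = 1.0, final: float = 0.02) -> float:
    """Linearly anneal epsilon from `initial` to `final` over `decay_steps`.

    Assumed schedule for illustration; `initial` and `final` are made-up values.
    """
    # Clamp progress to [0, 1] so epsilon stays at `final` after the decay ends.
    fraction = min(max(step, 0) / decay_steps, 1.0)
    return initial + fraction * (final - initial)


for step in (0, 150_000, 300_000, 500_000):
    print(step, round(linear_epsilon(step), 3))
```

Under these assumptions, a larger decay_steps keeps the agent exploring for more environment steps before it acts mostly greedily.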

Training weakly supervised:

Parameter	Description
assessor.data_path	path to assessor training data for on-the-fly training of the assessor
assessor.checkpoint	path to assessor PyTorch (.pt) file. A pretrained model can be downloaded here.

Loading a checkpoint:

Checkpoints need to be RLLib checkpoint folders. Our best three models (supervised, weakly supervised and semi-supervised) can be downloaded here.

Set the parameter restore to the checkpoint directory, and training will resume from that checkpoint. The number of training iterations (training.iterations) has to be increased, because the checkpoints were saved at iteration 15k.
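
The need for a higher iteration count follows from simple arithmetic. This sketch assumes that training.iterations is the total iteration count at which training stops, so a run restored at iteration 15000 only has the difference left to run. The function name remaining_iterations is hypothetical.

```python
def remaining_iterations(target_iterations: int,
                         checkpoint_iteration: int = 15_000) -> int:
    """Iterations left after restoring a checkpoint (assumed semantics)."""
    return max(target_iterations - checkpoint_iteration, 0)


print(remaining_iterations(5_000))   # default training.iterations
print(remaining_iterations(20_000))  # a raised value
```

Under that assumption, the default of 5000 leaves nothing to train, while a resumed run started with arguments such as restore=<checkpoint directory> training.iterations=20000 would continue for another 5000 iterations.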

Testing

Execute evaluate.py.

python evaluate.py 
    
     
     
       --dataset icdar2013 [--framestacking grayscale]

Tips

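For debugging in an IDE, change ray.init() in rllib_train.py to ray.init(local_mode=True). In local mode, Ray executes the code serially, which makes it possible to step through it with a debugger.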

Owner

Emanuel Metzenthin
