Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Last update: Jul 13, 2022

Toolset

U^2-Net is used for background removal
Textcleaner is used for image cleaning and line deskew (max 5 degrees)
Tesseract is used to detect and correct the text rotation angle
Deskew is used for line deskew (between 5 and 45 degrees)
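
As a rough illustration of how the angle ranges above divide the work, here is a minimal Python sketch. The function name `pick_rotation_tool` is hypothetical and not part of the repo. Treating everything beyond 45 degrees as a case for Tesseract is an assumption drawn from the list above, not confirmed project behaviour, and so is the handling of the exact 5-degree boundary.

```python
def pick_rotation_tool(angle_degrees: float) -> str:
    """Return the tool that would handle a given skew angle.

    Hypothetical helper: the ranges come from the Toolset list,
    the routing itself is an assumption for illustration only.
    """
    magnitude = abs(angle_degrees)
    if magnitude <= 5:
        # Small line skew: Textcleaner deskews up to 5 degrees.
        return "textcleaner"
    if magnitude <= 45:
        # Larger line skew: Deskew covers 5 to 45 degrees.
        return "deskew"
    # Assumed: anything larger is a text rotation handled by Tesseract.
    return "tesseract"


for angle in (2, -12, 90):
    print(angle, "->", pick_rotation_tool(angle))
```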

Examples

Tested on one document photographed with a smartphone camera at different angles

To build & deploy

Clone the repo
Download the model: check app/saved_models/README.md
Build Docker image : docker build -t / : .
Test locally : Run the Docker image and check that the API is working by opening http://localhost:10000
- CPU : docker run -it -v $PWD:/LOCAL/ -p 10000:80 / :
- GPU : docker run -it --gpus all -v $PWD:/LOCAL/ -p 10000:80 / :
Push docker image to Dockerhub (optional):
- Check: https://docs.docker.com/docker-hub/repos/ for account setup
- Create a Dockerhub repo with the same name as your Image ID :
- Run docker push / :
Deploy to Cloud Run (optional):
- Create your Google Cloud account
- Push the Docker image to Google Container Registry
  - Create a new project called [PROJECT-ID]
  - Open Cloud Shell in your Google account and run:
    docker pull / :
    docker tag [IMAGE] gcr.io/[PROJECT-ID]/[IMAGE]
    docker push gcr.io/[PROJECT-ID]/[IMAGE]
  - More detail in this link
- Create a Cloud Run service and select the container image that was pushed
  - Screenshot of the config (for demo purposes, it will be free of charge)
- Click Deploy, then test the API URL that is displayed
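
The local test step above can also be scripted. This is a minimal sketch using only the Python standard library. It assumes the container was started with `-p 10000:80` as shown, and it only checks that something answers at the root URL, since the API's routes are not listed here. The helper name `is_api_up` is hypothetical.

```python
import urllib.error
import urllib.request


def is_api_up(url: str = "http://localhost:10000", timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False
```

Call `is_api_up()` after `docker run` has started the container. The same function can be pointed at the Cloud Run URL once the service is deployed.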

Limits and Areas for improvements

Speed: Processing one image takes 7 to 10 seconds on serverless Cloud Run. With a GPU we can save 2 to 3 seconds (U^2-Net is 3 times faster)
Textcleaner is slow but works better for image cleaning; it also needs some manual fine-tuning
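
The speed figures above can be turned into a rough per-step estimate. The sketch below assumes that the whole 2 to 3 second saving comes from the U^2-Net step alone and that the 3x speedup applies to that step as a whole. Neither assumption is measured here, and the function name is hypothetical.

```python
def cpu_seconds_from_saving(saved_seconds: float, speedup: float) -> float:
    """CPU time t of a step such that t - t / speedup == saved_seconds."""
    return saved_seconds * speedup / (speedup - 1)


for saved in (2, 3):
    t_cpu = cpu_seconds_from_saving(saved, speedup=3)
    t_gpu = t_cpu / 3
    print(f"saving {saved}s: U^2-Net about {t_cpu:.1f}s on CPU, {t_gpu:.1f}s on GPU")
```

Under these assumptions U^2-Net would account for roughly 3 to 4.5 seconds of the 7 to 10 second total on CPU, which leaves the remaining time to the other steps.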

References

U^2-Net https://github.com/xuebinqin/U-2-Net.git
Textcleaner http://www.fmwconcepts.com/imagemagick/textcleaner/
Tesseract https://github.com/tesseract-ocr/tesseract
Deskew https://github.com/sbrunner/deskew.git

Owner

AI

GitHub Repository https://amtam0.github.io/u2netscan/webapp/app_u2net.html
