An interactive document scanner built in Python using OpenCV

Last update: Feb 12, 2022

Document Scanner

The scanner takes a poorly scanned image, finds the corners of the document, applies a perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive color threshold to clean up the result.

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project uses the transform and imutils modules from pyimagesearch. The UI code for the interactive mode is adapted from poly_editor.py.

In interactive mode, you can click and drag the corners of the document before the perspective transform is applied.

The scanner can also process an entire directory of images automatically and save the output in an output directory.
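A batch mode like this could be structured along the following lines. This is only a sketch: scan_directory, the scan_file callback, and the accepted extension list are hypothetical names introduced for illustration, not part of the project.

```python
from pathlib import Path

# Assumed set of image extensions; the real scanner may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def scan_directory(input_dir, output_dir, scan_file):
    """Run scan_file(src, dst) on every image in input_dir and return the output paths."""
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(in_dir.iterdir()):
        # Compare suffixes case-insensitively so that files such as desk.JPG are picked up.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            target = out_dir / path.name
            scan_file(path, target)
            written.append(target)
    return written
```

Passing the per-image step in as a callback keeps the directory walk separate from the image processing, so the same loop works with any scanning function.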

Here are some examples of images before and after scan:

Usage

python scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]

The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:

python scan.py --image sample_images/desk.JPG -i

Alternatively, to scan all images in a directory without any input:

python scan.py --images sample_images
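The usage pattern above (exactly one of --images or --image, plus an optional -i flag) maps naturally onto an argparse mutually exclusive group. The sketch below assumes argparse is used; the build_parser name and the help strings are illustrative and may not match scan.py itself.

```python
import argparse


def build_parser():
    """Build a parser for: scan.py (--images DIR | --image PATH) [-i]."""
    parser = argparse.ArgumentParser(prog="scan.py")
    # required=True means exactly one of the two options must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--images", help="directory of images to scan")
    source.add_argument("--image", help="path to a single image to scan")
    parser.add_argument("-i", action="store_true", help="enable interactive mode")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, passing both --images and --image, or neither, makes argparse exit with a usage error before any image is processed.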

Owner

Kushal Shingote
