Form Segmentation

Let's explore how we can extract text from any form or scanned page.

Objectives

The goal is to find an algorithm that extracts as much information as possible from a given page (JPG format), so that we can feed it to another system (business logic, neural network, classifier, etc.). The overall process may not be perfect, but it should find enough information to identify the type of document and the identities involved.

Parse any form or scanned page and extract any text data (printed and handwritten), with no prior knowledge of the layout or structure of the document.
Fully automatic extraction (no human interaction, so it can scale out).
Reasonably fast (or able to speed up the task with more machines or CPUs).

Challenges

There are many challenges to overcome, but the main problem is identifying which parts of the form contain text.
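One simple classical way to approach this problem is a horizontal projection profile: count the ink pixels in each row of a binarized page and treat consecutive non-empty rows as a candidate text band. The sketch below is only an illustration under stated assumptions, not the method used by this project: the page is assumed to be already binarized into rows of 0/1 values (1 = ink), and the name text_row_bands and the min_ink parameter are hypothetical.

```python
def text_row_bands(binary, min_ink=1):
    """Return (start_row, end_row) pairs, inclusive, for runs of rows with ink.

    binary  -- list of rows, each a list of 0/1 values (1 = ink); assumed input format
    min_ink -- minimum number of ink pixels for a row to count as non-empty
    """
    bands = []
    start = None  # first row of the band currently being tracked, if any
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom of the page
        bands.append((start, len(binary) - 1))
    return bands


# Tiny synthetic page: two "text lines" separated by an empty row.
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
bands = text_row_bands(page)
```

This only works on deskewed pages with clean backgrounds; form rulings and boxes also produce ink and would have to be removed or filtered first.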

Some other challenges:

Black border removal
ICR (Intelligent Character Recognition): recognize hand-drawn characters and convert them into text
Scanned pages (detect edges and apply a perspective transform to obtain a top-down view of the document)
Noise removal (blur, Otsu thresholding, adaptive thresholding with OpenCV)
Shape detection and extraction
OCR (not a real issue, since Tesseract 4 works well for printed text)
Handwriting recognition
Minimizing errors
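
To make the Otsu step in the list above concrete: Otsu's method picks the global gray-level threshold that maximizes the between-class variance of the histogram, which separates dark ink from a light background. The following is a minimal, dependency-free sketch of that idea. The function name otsu_threshold and the sample pixel values are made up for illustration. In practice OpenCV exposes the same computation through cv2.threshold with the cv2.THRESH_OTSU flag.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    pixels -- flat iterable of integer gray levels in 0..255.
    Pixels <= t form one class, pixels > t the other.
    """
    pixels = list(pixels)
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Synthetic "page": dark ink values around 20-30, light paper around 200-220.
sample = [20] * 50 + [30] * 50 + [200] * 40 + [220] * 60
t = otsu_threshold(sample)
# Binarize: paper becomes 255, ink becomes 0.
binary = [255 if p > t else 0 for p in sample]
```

A global threshold like this assumes fairly even lighting. For scans with shadows or gradients, a locally adaptive threshold (the adaptive thresholding mentioned in the list) is usually the better fit.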

Owner

Philip Doxakis
