Recognizing the text contents from a scanned visiting card

Overview

Recognizing the text contents from a scanned visiting card.

The application which is used to recognize the text from scanned images,printeddocuments,receipt,visitingand business cards. The project involves the creation of real-time text recognition from a scanned document or image and then text is displayed on the screen using tesseract OCR engine.

Introduction -

OCR = Optical Character Recognition. In other words, OCR systems transform a two-dimensional image of text that could contain machine printed or handwritten text from its image representation into machine-readable text. OCR as a process generally consists of several sub-processes to perform as accurately as possible.

Features

  • PreprocessingoftheImage
  • TextLocalization
  • CharacterSegmentation
  • CharacterRecognition
  • PostProcessing

In OCR software, it’s main aim is to identify and capture all the unique words using different languages from written text characters.

Tesseract OCR -

Tesseract is an open source text recognition (OCR) Engine, available under the Apache 2.0 license. It can be used directly, or (for programmers) using an API to extract printed text from images. It supports a wide variety of languages.. It can be used with the existing layout analysis to recognize text within a large document, or it can be used in conjunction with an external text detector to recognize text from an image of a single text line.

Packages And Modules Used

Pytesseract -

Python-tesseract is an optical character recognition (OCR) tool for python. That is, it will recognize and “read” the text from an image. It is also useful as a stand-alone invocation script to tesseract, as it can read all image types supported by the Pillow and Leptonica imaging libraries, including jpeg, png, gif, bmp, tiff, and others.

INSTALLATION (Python-tesseract requires Python 3.6+)

Installation

Install my-project with npm

  pip install pytesseract

Method used -

pytesseract.image_to_string() - Returns the result of a Tesseract OCR run on the provided image to string

pytesseract.image_to_data() - Returns string containing box boundaries, confidences,and other information .

Opencv -

OpenCV is an open-source library for computer vision, machine learning, and image processing. It can process images and videos to identify objects, faces, or even the handwriting of a human. When it is integrated with various libraries, such as Numpy which is a highly optimized library for numerical python, then the number of weapons increases in your Arsenal i.e whatever operations one can do in Numpy can be combined with OpenCV.

Installation

Install my-project with npm

  pip install opencv-python

Methods used -

cv2.imread() The function imread loads an image from the specified file and returns it. If the image cannot be read (because of missing file, improper permissions, unsupported or invalid format), the function returns an empty matrix ( Mat::data==NULL )

cv2.resize() The function resizes the image src down to or up to the specified size. note that the initial dst type or size are not taken into account. Instead, the size and type are derived from the src,dsize,fx, and fy.

cv2.cvtcolor() The function converts an input image from one color space to another. In case of a transformation to-from RGB color space, the order of the channels should be specified explicitly (RGB or BGR). Note that the default color format in OpenCV is often referred to as RGB but it is actually BGR (thebytes are reversed).

cv2.thresh() - The function applies fixed-level thresholding to a multiple-channel array. The function is typically used to get a bi-level (binary) image out of a grayscale image ( compare could be also used for this purpose) or for removing a noise, that is, filtering out pixels with too small or too large values.

cv2.imshow() - The function imshow displays an image in the specified window. If the window was created with the cv::WINDOW_AUTOSIZE flag, the image is shown with its original size, however it is still limited by the screen resolution.Otherwise,theimageisscaledtofitthewindow.

cv2.waitkey()- The function waitKey waits for a key event infinitely when it is 0 or for delay milliseconds, when it is positive. Since the OS has a minimum time between switching threads, the function will not wait exactly to delay ms, it will wait at least delay ms, depending on what else is running on your computer at that time.

cv2.rectangle() - The function cv::rectangle draws a rectangle outline or a filled rectangle whose two opposite corners are pt1 and pt2.

Conclusion -

It was a simple approach to process the image or scanned document and extract the contents in the that, using pytesseract and openCV libraries. We successfully performed the activity according to the given problem statement.

Owner
Faizan Habib
Faizan Habib
Open Source Differentiable Computer Vision Library for PyTorch

Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer

kornia 7.6k Jan 04, 2023
A little but useful tool to explore OCR data extracted with `pytesseract` and `opencv`

Screenshot OCR Tool Extracting data from screen time screenshots in iOS and Android. We are exploring 3 options: Simple OCR with no text position usin

Gabriele Marini 1 Dec 07, 2021
EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

DeepCam Shenzhen 141 Dec 31, 2022
Handwritten_Text_Recognition

Deep Learning framework for Line-level Handwritten Text Recognition Short presentation of our project Introduction Installation 2.a Install conda envi

24 Jul 15, 2022
Convert scans of handwritten notes to beautiful, compact PDFs

Convert scans of handwritten notes to beautiful, compact PDFs

Matt Zucker 4.8k Jan 01, 2023
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. This is the official Roboflow python package that interfaces with the Roboflow API.

Roboflow 52 Dec 23, 2022
Camelot: PDF Table Extraction for Humans

Camelot: PDF Table Extraction for Humans Camelot is a Python library that makes it easy for anyone to extract tables from PDF files! Note: You can als

Atlan Technologies Pvt Ltd 3.3k Dec 31, 2022
Pixie - A full-featured 2D graphics library for Python

Pixie - A full-featured 2D graphics library for Python Pixie is a 2D graphics library similar to Cairo and Skia. pip install pixie-python Features: Ty

treeform 65 Dec 30, 2022
Page to PAGE Layout Analysis Tool

P2PaLA Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks. 💥 Try our new DEMO for online baseli

Lorenzo Quirós Díaz 180 Nov 24, 2022
Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract Toolset U^2-Net is used for background removal Textcleaner is used for image cleaning

3 Jul 13, 2022
Natural language detection

Detect the language of text. What’s so cool about franc? franc can support more languages(†) than any other library franc is packaged with support for

Titus 3.8k Jan 02, 2023
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Code based on our WACV 2022 Accepted Paper: https://arxiv.org/pdf/

Andres 13 Dec 17, 2022
Python Computer Vision from Scratch

This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both f

Milaan Parmar / Милан пармар / _米兰 帕尔马 221 Dec 26, 2022
Demo processor to illustrate OCR-D Python API

ocrd_vandalize/ Demo processor to illustrate the OCR-D/core Python API Description :TODO: write docs :) Installation From PyPI pip3 install ocrd_vanda

Konstantin Baierer 5 May 05, 2022
This pyhton script converts a pdf to Image then using tesseract as OCR engine converts Image to Text

Script_Convertir_PDF_IMG_TXT Este script de pyhton convierte un pdf en Imagen luego utilizando tesseract como motor OCR convierte la Imagen a Texto. p

alebogado 1 Jan 27, 2022
Framework for the Complete Gaze Tracking Pipeline

Framework for the Complete Gaze Tracking Pipeline The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1].

Pascal 20 Jan 06, 2023
Handwritten Number Recognition using CNN and Character Segmentation

Handwritten-Number-Recognition-With-Image-Segmentation Info About this repository This Repository is aimed at reading handwritten images of numbers an

Sparsha Saha 17 Aug 25, 2022
A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well

ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy

Hiromu OCHIAI 541 Dec 28, 2022
Give a solution to recognize MaoYan font.

猫眼字体识别 该 github repo 在于帮助xjtlu的同学们识别猫眼的扭曲字体。已经打包上传至 pypi ,可以使用 pip 直接安装。 猫眼字体的识别不出来的原理与解决思路在采茶上 使用方法: import MaoYanFontRecognize

Aruix 4 Jun 30, 2022