An image OCR tool with a GUI, built on the Tesseract-OCR engine through the pytesseract package.

Last update: Jul 11, 2022

Overview

OCR-Tool

OCR-Tool is an image OCR tool with a GUI, written in Python on top of the Tesseract-OCR engine and the pytesseract package. This is my second-ever Python project, so feel free to make suggestions.

Release

To install it, extract the zip that you downloaded and put the extracted folder somewhere convenient, such as Program Files. The folder contains a file called OCR-Tool.exe; you can create a shortcut to it on your desktop for easy access.
Windows might warn that the program is dangerous to run and block it. Click "More info", and from there you can run it.
If you download the version without Tesseract included, see the Dependencies section for instructions on how to add it.

Version 1.1

With tesseract

https://drive.google.com/file/d/1EMS8cKsasorLRXpqVjLxo41nsAEk4SiF/view?usp=sharing

Without tesseract

https://drive.google.com/file/d/1O4EYF9EmawT0VRSM6U1XkDBndGQBxDRe/view?usp=sharing

Features

Modern GUI
Snipping tool (Credit to harupy's python snipping tool)
Open image from folder
Paste image from clipboard
Save text to .txt
Copy text to clipboard
Cancel snip

Dependencies

Tesseract OCR Engine (UB Mannheim). Install either version 4 or 5; version 5 is recommended as it performs better. Find the installation folder (the default location is Program Files), copy it into the same folder as main.py, and rename the copy to "tesseract".
Pytesseract
PyQt5
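
The folder layout described above can be checked from Python before running any OCR. The following is a minimal sketch, not the tool's actual code: the helper names `bundled_tesseract_cmd` and `configure_pytesseract` are hypothetical, and the executable name `tesseract.exe` is an assumption based on the Windows installer. The only pytesseract feature used is the `pytesseract.pytesseract.tesseract_cmd` setting.

```python
from pathlib import Path


def bundled_tesseract_cmd(app_dir: Path, exe_name: str = "tesseract.exe") -> Path:
    """Return the expected path of the Tesseract executable in app_dir/tesseract.

    exe_name is an assumption (the Windows executable name).
    """
    return Path(app_dir) / "tesseract" / exe_name


def configure_pytesseract(app_dir: Path) -> bool:
    """Point pytesseract at the bundled executable.

    Returns False if the executable is missing or pytesseract is not installed.
    """
    cmd = bundled_tesseract_cmd(app_dir)
    if not cmd.is_file():
        return False
    try:
        import pytesseract  # imported lazily so the path check works without it
    except ImportError:
        return False
    pytesseract.pytesseract.tesseract_cmd = str(cmd)
    return True


if __name__ == "__main__":
    # Assumes this script sits next to main.py, as in the instructions above.
    here = Path(__file__).resolve().parent
    if configure_pytesseract(here):
        print("Using bundled Tesseract at", bundled_tesseract_cmd(here))
    else:
        print("Bundled Tesseract not found; check the 'tesseract' folder.")
```

If the check fails, the most likely causes are that the copied folder was not renamed to "tesseract" or that it was placed in a different folder than main.py.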

Known bugs

Copy text button crashes the app
White text doesn't work well
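
One possible workaround for the white-text issue, sketched here as an assumption rather than something the tool does today, is to invert mostly dark images before OCR, since recent Tesseract versions tend to work best with dark text on a light background. The helper names and the threshold of 128 are hypothetical, and the functions operate on a flat sequence of 8-bit grayscale values so that the sketch needs no extra packages.

```python
from typing import List, Sequence


def mean_brightness(pixels: Sequence[int]) -> float:
    """Average of 8-bit grayscale values (0 = black, 255 = white)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    return sum(pixels) / len(pixels)


def invert_if_dark(pixels: Sequence[int], threshold: int = 128) -> List[int]:
    """Invert the image when its background is mostly dark.

    A low mean brightness suggests light text on a dark background,
    so each value p is replaced with 255 - p. Otherwise the values
    are returned unchanged (as a new list).
    """
    if mean_brightness(pixels) < threshold:
        return [255 - p for p in pixels]
    return list(pixels)


if __name__ == "__main__":
    white_on_black = [0, 0, 0, 255]   # mostly dark: gets inverted
    black_on_white = [255, 255, 255, 0]  # mostly light: left as-is
    print(invert_if_dark(white_on_black))
    print(invert_if_dark(black_on_white))
```

In a real image pipeline the same inversion could be done on a Pillow grayscale image with `ImageOps.invert`. A mean-brightness test is only a rough heuristic and may misjudge images with mixed backgrounds.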

Future features

Better preprocessing to help with weird backgrounds
Document OCR
Menu

Releases

Release (Oct 15, 2021)

Owner

Khant Htet Aung
