OCR system for Arabic language that converts images of typed text to machine-encoded text.

Last update: Jan 05, 2023

Overview

Arabic OCR

OCR system for Arabic language that converts images of typed text to machine-encoded text.
The system currently supports letters only (29 letters): ا-ى and لا.
It targets a simplified OCR problem: images that contain only Arabic characters (see the dataset link below for sample images).

Setup

Install Python, then run this command:

pip install -r requirements.txt

Run

Put the images in the src/test directory.
Go to the src directory and run the following command:
```
python OCR.py
```
An output folder will be created containing:
- a text folder, which holds the text files corresponding to the images.
- a running_time file, which records the time taken to process each image.
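After a batch run it can be useful to confirm that every input image produced a text file. The sketch below is one way to check this. It assumes, without confirmation from the project, that each text file shares its image's base name with a .txt extension, and that the inputs use one of the listed image extensions. The function name `missing_outputs` and the extension set are hypothetical.

```python
from pathlib import Path

# Assumption: the input images use one of these extensions.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def missing_outputs(image_dir, text_dir):
    """Return the names of images that have no matching .txt file in text_dir.

    Assumes (not confirmed by the README) that the text file for "x.png"
    is named "x.txt".
    """
    image_dir, text_dir = Path(image_dir), Path(text_dir)
    # Base names of all text files that were produced.
    produced = {p.stem for p in text_dir.glob("*.txt")}
    return sorted(
        p.name
        for p in image_dir.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.stem not in produced
    )
```

Called from the src directory, this might look like `missing_outputs("test", "output/text")`, where the output folder name is itself a guess. An empty list means every image has a corresponding text file.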

Pipeline

Dataset

Link to dataset of images and the corresponding text: here.
We used 1000 images to generate the character dataset used for training.

Examples

Line Segmentation
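This README does not describe how lines are separated. A common approach for clean typed text, shown here only as an illustrative assumption and not as this project's method, is a horizontal projection profile: count the ink pixels in each row and treat each run of non-empty rows as one text line. The function name `segment_lines` and the list-of-lists image format (1 = ink, 0 = background) are hypothetical.

```python
def segment_lines(binary_image):
    """Split a binary image into text lines using a horizontal projection.

    binary_image: list of rows, each a list of 0/1 values (1 = ink).
    Returns a list of (start_row, end_row) pairs with end_row exclusive.
    """
    lines = []
    start = None  # first row of the line currently being tracked
    for y, row in enumerate(binary_image):
        has_ink = sum(row) > 0  # projection value for this row
        if has_ink and start is None:
            start = y  # a new line begins
        elif not has_ink and start is not None:
            lines.append((start, y))  # a blank row closes the line
            start = None
    if start is not None:
        # The last line runs to the bottom edge of the image.
        lines.append((start, len(binary_image)))
    return lines
```

Real scans usually need thresholding and some noise tolerance (for example, ignoring rows with only a pixel or two of ink) before a profile like this is reliable.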

Word Segmentation

Character Segmentation

Performance

Average accuracy: 95%.
Average time per image: 16 seconds.

NOTE

We achieved these results using only the flattened image as the feature.
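To illustrate what a flattened-image feature means, the sketch below turns a 2D grid of pixel values into a single vector by concatenating its rows. It is a minimal, dependency-free example. The function name `flatten_image` and the toy 3x4 image are hypothetical, and the project's actual image size and preprocessing are not stated here.

```python
def flatten_image(pixels):
    """Flatten a 2D grid of pixel values into one feature vector, row by row."""
    if not pixels:
        raise ValueError("image must have at least one row")
    width = len(pixels[0])
    if any(len(row) != width for row in pixels):
        raise ValueError("all rows must have the same length")
    return [value for row in pixels for value in row]


# A toy 3x4 binary "character" image (1 = ink, 0 = background).
char_image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
]
features = flatten_image(char_image)
print(len(features))  # 12 values, one per pixel
```

Because a classifier expects vectors of equal length, every character image would need to be resized to the same dimensions before flattening.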

Owner

Hussein Youssef
