The CIS OCR Post Correction Tool PoCoTo

This repository contains the source code for the Java-based PoCoTo client, which enables fast, interactive batch correction of complete OCR error series in OCR'ed historical documents. For a detailed description, see the PoCoTo Manual.

The latest compiled binary can be downloaded here.

References

PoCoTo was originally written by Thorsten Vobl as part of his master's thesis in computational linguistics at CIS during the IMPACT project.

It was further developed by Florian Fink and Uwe Springmann at CIS as a CLARIN-D curation project (Kurationsprojekt).

Its underlying technology is described in the following publication:

Vobl, Thorsten, Annette Gotscharek, Uli Reffle, Christoph Ringlstetter, and Klaus U. Schulz. 2014. “PoCoTo - an Open Source System for Efficient Interactive Postcorrection of OCRed Historical Texts.” In Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage, 57–61. DATeCH ’14. New York, NY, USA: ACM. http://doi.org/10.1145/2595188.2595197.
