docstrum

Last update: Dec 13, 2022

Docstrum Algorithm

Getting Started

This repo develops an implementation of the Docstrum algorithm presented by O'Gorman (1993).

Disclaimer

This source code is built on top of the work by Chadoliver. The original code is available at https://github.com/chadoliver/cosc428-structor.

Objective

This project aims to segment a document image into meaningful components. The target domain is historical machine-printed and handwritten document images.

Dependencies

Python 2.7
Packages:
- numpy
- cv2 (the OpenCV Python module)

Process

- Pre-processing (optional, for vertical-line removal)
  - Blurring (bilateral filtering)
  - Otsu's thresholding
  - Morphological erosion & dilation
  - Smoothing (averaging)
  - Static thresholding
- Nearest-neighbor clustering and docstrum plot
- Spacing and orientation estimation
- Determination of text-lines
- Structural block determination
- Post-processing
  - TBD
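
The nearest-neighbor and estimation steps above can be illustrated with a minimal sketch. It is not the code in this repo: it assumes the connected components have already been reduced to (x, y) centroids, it uses only the standard library instead of numpy and cv2, and every function name and default value (`k`, bin widths, angle tolerance) is a hypothetical choice made for illustration. Each centroid is paired with its k nearest neighbors, each pair is recorded as a (distance, angle) point as in a docstrum plot, and the histogram peaks give rough estimates of the text orientation and of the within-line and between-line spacing.

```python
import math
from collections import Counter


def fold_angle(dx, dy):
    """Angle of the vector (dx, dy) in degrees, folded into [-90, 90)."""
    angle = math.degrees(math.atan2(dy, dx))
    if angle >= 90.0:
        angle -= 180.0
    elif angle < -90.0:
        angle += 180.0
    return angle


def nearest_neighbor_pairs(centroids, k=5):
    """Return (distance, angle) for each centroid's k nearest neighbors."""
    pairs = []
    for i, (xi, yi) in enumerate(centroids):
        # Brute-force search: fine for a sketch, quadratic in the point count.
        candidates = sorted(
            (math.hypot(xj - xi, yj - yi), xj - xi, yj - yi)
            for j, (xj, yj) in enumerate(centroids) if j != i
        )
        for dist, dx, dy in candidates[:k]:
            pairs.append((dist, fold_angle(dx, dy)))
    return pairs


def histogram_peak(values, bin_width):
    """Center of the most populated bin, or None when there are no values."""
    values = list(values)
    if not values:
        return None
    width = float(bin_width)
    bins = Counter(int(math.floor(v / width + 0.5)) for v in values)
    # Ties are broken in favor of the smaller bin so the result is stable.
    peak = max(bins, key=lambda b: (bins[b], -b))
    return peak * width


def angle_difference(a, b):
    """Smallest difference between two undirected angles, in [0, 90]."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def estimate_layout(centroids, k=5, angle_bin=5.0, dist_bin=2.0, tolerance=10.0):
    """Estimate (orientation, within-line spacing, between-line spacing)."""
    pairs = nearest_neighbor_pairs(centroids, k)
    orientation = histogram_peak((a for _, a in pairs), angle_bin)
    # Pairs roughly parallel to the orientation lie on the same text line.
    within = histogram_peak(
        (d for d, a in pairs if angle_difference(a, orientation) <= tolerance),
        dist_bin)
    # Pairs roughly perpendicular to it connect neighboring text lines.
    between = histogram_peak(
        (d for d, a in pairs
         if abs(angle_difference(a, orientation) - 90.0) <= tolerance),
        dist_bin)
    return orientation, within, between
```

For example, calling `estimate_layout` on a synthetic grid of centroids spaced 10 pixels apart along rows that are 30 pixels apart yields an orientation of 0 degrees, a within-line spacing of 10, and a between-line spacing of 30. Real pages are noisier, so the bin widths and tolerance would need tuning.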

Evaluation

Citing Docstrum

O'Gorman, L., 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), pp.1162-1173.

@article{o1993document,
  title={The document spectrum for page layout analysis},
  author={O'Gorman, Lawrence},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={15},
  number={11},
  pages={1162--1173},
  year={1993},
  publisher={IEEE}
}

Notes

How to remove .DS_Store files

find . -name '.DS_Store' -type f -delete

Owner

Chulwoo Mike Pack