Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

Last update: Dec 08, 2022

Overview

LayoutAnalysisEvaluator

Layout Analysis Evaluator for:

ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records
ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

Minimal usage: java -jar LayoutAnalysisEvaluator.jar -gt gt_image.png -p prediction_image.png

Parameters list: utility-name

 -gt,--groundTruth <arg>      Ground Truth image 
 -p,--prediction <arg>        Prediction image 
 -o,--original <arg>          (Optional) Original image, to be overlapped with the results visualization
 -j,--json <arg>              (Optional) Json Path, for the DIVAServices JSON output
 -out,--outputPath <arg>      (Optional) Output path (relative to prediction input path)                            
 -dv,--disableVisualization   (Optional)(Flag) Vsualizing the evaluation as image is NOT desired

Note: this also outputs a human-friendly visualization of the results next to the prediction_image.png which can be overlapped to the original image if provided with the parameter -overlap to enable deeper analysis.

Visualization of the results

Along with the numerical results (such as the Intersection over Union (IU), precision, recall,F1) the tool provides a human friendly visualization of the results. Additionally, when desired one can provide the original image and it will be overlapped with the visualization of the results. This is particularly helpful to understand why certain artifacts are created. The three images below represent the three steps: the original image, the visualization of the result and the two overlapped.

Interpreting the colors

Pixel colors are assigned as follows:

GREEN: Foreground predicted correctly
YELLOW: Foreground predicted - but the wrong class (e.g. Text instead of Comment)
BLACK: Background predicted correctly
RED: Background mis-predicted as Foreground
BLUE: Foreground mis-predicted as Background

Example of problem hunting

Below there is an example supporting the usefulness of overlapping the prediction quality visualization with the original image. Focus on the red pixels pointed at by the white arrow: they are background pixels mis-classified as foreground. In the normal visualization (left) its not possible to know why would an algorithm decide that in that spot there is something belonging to foreground, as it is clearly far from regular text. However, when overlapped with the original image (right) one can clearly see that in this area there is an ink stain which could explain why the classification algorithm is deceived into thinking these pixel were foreground. This kind of interpretation is obviously not possible without the information provided by the original image like in (right).

Ground Truth Format

The ground truth information needs to be a pixel-label image where the class information is encoded in the blue channel. Red and green channels should be set to 0 with the exception of the boundaries pixels used in the two competitions mentioned above.

For example, in the DIVA-HisDB dataset there are four different annotated classes which might overlap: main text body, decorations, comments and background.

In the pixel-label images the classes are encoded by RGB values as follows:

Red = 0 everywhere (except boundaries)
Green = 0 everywhere

Blue = 0b00...1000 = 0x000008: main text body
Blue = 0b00...0100 = 0x000004: decoration
Blue = 0b00...0010 = 0x000002: comment
Blue = 0b00...0001 = 0x000001: background (out of page)

Note that the GT might contain multi-class labeled pixels, for all classes except for the background. For example:

Blue = 0b...1000 | 0b...0010 = 0b...1010 = 0x00000A : main text body + comment  
Blue = 0b...1000 | 0b...0100 = 0b...1100 = 0x00000C : main text body + decoration
Blue = 0b...0010 | 0b...0100 = 0b...0110 = 0x000006 : comment + decoration

Citing us

If you use our software, please cite our paper as:

@inproceedings{alberti2017evaluation,
    address = {Kyoto, Japan},
    archivePrefix = {arXiv},
    arxivId = {1712.01656},
    author = {Alberti, Michele and Bouillon, Manuel and Ingold, Rolf and Liwicki, Marcus},
    booktitle = {2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
    doi = {10.1109/ICDAR.2017.311},
    eprint = {1712.01656},
    isbn = {978-1-5386-3586-5},
    month = {nov},
    pages = {43--47},
    title = {{Open Evaluation Tool for Layout Analysis of Document Images}},
    year = {2017}
}

Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

3.2k Dec 31, 2022

CellProfiler is a open-source application for biological image analysis

CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.

732 Dec 23, 2022

Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

3.2k Dec 31, 2022

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

22 Dec 8, 2022

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

TableMASTER-mmocr Contents About The Project Method Description Dependency Getting Started Prerequisites Installation Usage Data preprocess Train Infe

298 Dec 21, 2022

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

This project releases our 1st place solution on ICDAR 2021 Competition on Mathematical Formula Detection. We implement our solution based on MMDetection, which is an open source object detection toolbox based on PyTorch.

94 Dec 25, 2022

Code for the DH project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World"

Damast This repository contains code developed for the digital humanities project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval

University of Stuttgart Visualization Research Center

2 Jul 1, 2022

Layout Parser is a deep learning based tool for document image layout analysis tasks.

A Python Library for Document Layout Understanding

3.4k Dec 30, 2022

G-Research-Crypto-Competition - Project for passing the ML exam. Dataset took from the competition on the kaggle

G-Research-Crypto-Competition Project for passing the ML exam. Dataset took from

5 Jan 9, 2022

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

198 Dec 27, 2022

Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021

🐯 SynthTIGER: Synthetic Text Image GEneratoR Official implementation of SynthTIGER | Paper | Datasets Moonbin Yim1, Yoonsik Kim1, Han-cheol Cho1, Sun

256 Jan 5, 2023

Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer Description Convert offline handwritten mathematical expressi

87 Dec 27, 2022

A mathematica expression evaluator with PokemonTypes

A simple mathematical expression evaluator that uses Pokemon types to replace symbols.

2 Nov 14, 2021

The evaluator covering all of the metrics required by tasks within the DUE Benchmark.

DUE Evaluator The repository contains the evaluator covering all of the metrics required by tasks within the DUE Benchmark, i.e., set-based F1 (for KI

4 Jan 21, 2022

Excel-report-evaluator - A simple Python GUI application to aid with bulk evaluation of Microsoft Excel reports.

Excel Report Evaluator Simple Python GUI with Tkinter for evaluating Microsoft Excel reports (.xlsx-Files). Usage Start main.py and choose one of the

1 Dec 29, 2021

Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submits the contract address to TokenSniffer to evaluate contract legitimacy

14 Dec 9, 2022

Binance Smart Chain Contract Scraper + Contract Evaluator

14 Dec 9, 2022

Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan

70 Dec 18, 2022

IMGUR5K handwriting set. It is a handwritten in-the-wild dataset, which contains challenging real world handwritten samples from different writers.The dataset is shared as a set of image urls with annotations. This code downloads the images and verifies the hash to the image to avoid data contamination.

IMGUR5K Handwriting Dataset To run the code for downloading the urls and generate corresponding annotations : Usage: python download_imgur5k.py --data

213 Dec 26, 2022

Releases(v1.0.0)

v1.0.0(Aug 4, 2017)

Version of the evaluation tool that was used for the ICDAR2017 competition on Layout Analysis for Challenging Medieval Manuscripts
Source code(tar.gz)
Source code(zip)
LayoutAnalysisEvaluator.jar(4.75 MB)

Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

Related tags

Overview

LayoutAnalysisEvaluator

Visualization of the results

Interpreting the colors

Example of problem hunting

Ground Truth Format

Citing us

You might also like...

Python-based tools for document analysis and OCR

CellProfiler is a open-source application for biological image analysis

Python-based tools for document analysis and OCR

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

Code for the DH project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World"

Layout Parser is a deep learning based tool for document image layout analysis tasks.

G-Research-Crypto-Competition - Project for passing the ML exam. Dataset took from the competition on the kaggle

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021

Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

A mathematica expression evaluator with PokemonTypes

The evaluator covering all of the metrics required by tasks within the DUE Benchmark.

Excel-report-evaluator - A simple Python GUI application to aid with bulk evaluation of Microsoft Excel reports.

Binance Smart Chain Contract Scraper + Contract Evaluator

Binance Smart Chain Contract Scraper + Contract Evaluator

Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Releases(v1.0.0)

v1.0.0(Aug 4, 2017)

Owner

Distort a video using Seam Carving (video) and Vibrato effect (sound)

A simple component to display annotated text in Streamlit apps.

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Official code for ROCA: Robust CAD Model Retrieval and Alignment from a Single Image (CVPR 2022)

Using python libraries to track hands

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

This is the open source implementation of the ICLR2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis"

Color Picker and Color Detection tool for METR4202

document image degradation

基于Paddle框架的PSENet复现

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.

Deep Learning Chinese Word Segment

Make OpenCV camera loops less of a chore by skipping the boilerplate and getting right to the interesting stuff

This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

kaldi-asr/kaldi is the official location of the Kaldi project.

scantailor - Scan Tailor is an interactive post-processing tool for scanned pages.

[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

PyNeuro is designed to connect NeuroSky's MindWave EEG device to Python and provide Callback functionality to provide data to your application in real time.

FastOCR is a desktop application for OCR API.