DUE: End-to-End Document Understanding Benchmark

Overview

This is the repository that provide tools to download data, reproduce the baseline results and evaluation.

What can you achieve with this guide

Based on this repository, you may be able to:

  1. download data for benchmark in a unified format.
  2. run all the baselines.
  3. evaluate already trained baseline models.

Install benchmark-related repositories

Start the container:

sudo userdocker run nvcr.io/nvidia/pytorch:20.12-py3

Clone the repo with:

git clone [email protected]:due-benchmark/baselines.git

Install the requirements:

pip install -e .

1. Download datasets and the base model

The datasets are re-hosted on the https://duebenchmark.com/data and can be downloaded from there. Moreover, since the baselines are finetuned based on the T5 model, you need to download the original model. Again it is re-hosted at https://duebenchmark.com/data. Please place it into the due_benchmark_data directory after downloading.

TODO: dopisać resztę

2. Run baseline trainings

2.1 Process datasets into memmaps (binarization)

In order to process datasets into memmaps, set the directory downloaded_data_path to downloaded data, set memmap_directory to a new directory that will store binarized datas, and use the following script:

./create_memmaps.sh

2.2 Run training script

Single training can be started with the following command, assuming out_dir is set as an output for the trained model's checkpoints and generated outputs. Additionally, set datas to any of the previously generated datasets (e.g., to DeepForm).

python benchmarker/cli/l5/train.py \
    --model_name_or_path ${downloaded_data_path}/t5-base \
    --relative_bias_args="[{\"type\":\"1d\"}]" \
    --dropout_rate 0.15 \
    --model_type=t5 \
    --output_dir ${out_dir} \
    --data_dir ${memmap_directory}/${datas}_memmap/train \
    --val_data_dir ${memmap_directory}/${datas}_memmap/dev \
    --test_data_dir ${memmap_directory}/${datas}_memmap/test \
    --gpus 1 \
    --max_epochs 30 \
    --train_batch_size 1 \
    --eval_batch_size 2 \
    --overwrite_output_dir \
    --accumulate_grad_batches 64 \
    --max_source_length 1024 \
    --max_target_length 256 \
    --eval_max_gen_length 16 \
    --learning_rate 2e-4 \
    --lr_scheduler constant \
    --warmup_steps 100 \
    --trim_batches \ 
    --do_train \
    --do_predict \ 
    --additional_data_fields doc_id label_name \
    --early_stopping_patience 20 \
    --segment_levels tokens pages \
    --optimizer adamw \
    --weight_decay 1e-5 \
    --adam_epsilon 1e-8 \
    --num_workers 4 \
    --val_check_interval 1

The models presented in the paper differs only in two places. The first is the choice of --relative_bias_args. T5 uses [{'type': '1d'}] whereas both +2D and +DALL-E use [{'type': '1d'}, {'type': 'horizontal'}, {'type': 'vertical'}]

Moreover +DALL-E had --context_embeddings set to [{'dimension': 1024, 'use_position_bias': False, 'embedding_type': 'discrete_vae', 'pretrained_path': '', 'image_width': 256, 'image_height': 256}]

3. Evaluate

3.1 Convert output to the submission file

In order to compare two files (generated by the model with the provided library and the gold-truth answers), one has to convert the generated output into a format that can be directly compared with documents.jsonl. Please use:

python to_submission_file.py ${downloaded_data_path} ${out_dir}

3.2 Evaluate reproduced models

Finally outputs can be evaluated using the provided evaluator. First, get back into main directory, where this README.md is placed and install it by cd due_evaluator-master && pip install -r requirement And run:

python due_evaluator --out-files baselines/test_generations.jsonl --reference ${downloaded_data_path}/DeepForm

3.3 Evaluate baseline outputs

We provide an examples of outputs generated by our baseline (DeepForm). They should be processed with:

python benchmarker-code/to_submission_file.py ${downloaded_data_path}/model_outputs_example ${downloaded_data_path}
python due_evaluator --out-files ./benchmarker/cli/l5/baselines/test_generations.txt.jsonl --reference ${downloaded_data_path}/DeepForm/test/document.jsonl

The expected output should be:

       Label       F1  Precision   Recall
  advertiser 0.512909   0.513793 0.512027
contract_num 0.778761   0.780142 0.777385
 flight_from 0.794376   0.795775 0.792982
   flight_to 0.804921   0.806338 0.803509
gross_amount 0.355476   0.356115 0.354839
         ALL 0.649771   0.650917 0.648630
Fusion-in-Decoder Distilling Knowledge from Reader to Retriever for Question Answering

This repository contains code for: Fusion-in-Decoder models Distilling Knowledge from Reader to Retriever Dependencies Python 3 PyTorch (currently tes

Meta Research 323 Dec 19, 2022
Linear algebra python - Number of operations and problems in Linear Algebra and Numerical Linear Algebra

Linear algebra in python Number of operations and problems in Linear Algebra and

Alireza 5 Oct 09, 2022
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu

77 Dec 27, 2022
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021
Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch

Segformer - Pytorch Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch. Install $ pip install segformer-pytorch

Phil Wang 208 Dec 25, 2022
Unofficial keras(tensorflow) implementation of MAE model from Masked Autoencoders Are Scalable Vision Learners

MAE-keras Unofficial keras(tensorflow) implementation of MAE model described in 'Masked Autoencoders Are Scalable Vision Learners'. This work has been

Yewon 11 Jun 12, 2022
Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.

Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms This repository contains implementations of various off-policy multi-agent reinforceme

183 Dec 28, 2022
This repository contains a Ruby API for utilizing TensorFlow.

tensorflow.rb Description This repository contains a Ruby API for utilizing TensorFlow. Linux CPU Linux GPU PIP Mac OS CPU Not Configured Not Configur

somatic labs 825 Dec 26, 2022
Personal implementation of paper "Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval"

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval This repo provides personal implementation of paper Approximate Ne

John 8 Oct 07, 2022
Generative vs Discriminative: Rethinking The Meta-Continual Learning (NeurIPS 2021)

Generative vs Discriminative: Rethinking The Meta-Continual Learning (NeurIPS 2021) In this repository we provide PyTorch implementations for GeMCL; a

4 Apr 15, 2022
Citation Intent Classification in scientific papers using the Scicite dataset an Pytorch

Citation Intent Classification Table of Contents About the Project Built With Installation Usage Acknowledgments About The Project Citation Intent Cla

Federico Nocentini 4 Mar 04, 2022
Framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample resolution

Sample-specific Bayesian Networks A framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample or per-patient re

Caleb Ellington 1 Sep 23, 2022
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
Learning with Subset Stacking

Learning with Subset Stacking (LESS) LESS is a new supervised learning algorithm that is based on training many local estimators on subsets of a given

S. Ilker Birbil 19 Oct 04, 2022
Management Dashboard for Torchserve

Torchserve Dashboard Torchserve Dashboard using Streamlit Related blog post Usage Additional Requirement: torchserve (recommended:v0.5.2) Simply run:

Ceyda Cinarel 103 Dec 10, 2022
A Keras implementation of YOLOv3 (Tensorflow backend)

keras-yolo3 Introduction A Keras implementation of YOLOv3 (Tensorflow backend) inspired by allanzelener/YAD2K. Quick Start Download YOLOv3 weights fro

7.1k Jan 03, 2023
My take on a practical implementation of Linformer for Pytorch.

Linformer Pytorch Implementation A practical implementation of the Linformer paper. This is attention with only linear complexity in n, allowing for v

Peter 349 Dec 25, 2022
A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.

BraVe This is a JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short. The model provided in this package wa

DeepMind 44 Nov 20, 2022
Deep-Learning-Book-Chapter-Summaries - Attempting to make the Deep Learning Book easier to understand.

Deep-Learning-Book-Chapter-Summaries This repository provides a summary for each chapter of the Deep Learning book by Ian Goodfellow, Yoshua Bengio an

Aman Dalmia 1k Dec 27, 2022
Cervix ROI Segmentation Using U-NET

Cervix ROI Segmentation Using U-NET Overview This code illustrate how to segment the ROI in cervical images using U-NET. The ROI here meant to include

Scotty Kwok 35 Sep 14, 2022