CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)

Last update: Dec 20, 2022

Overview

CUTIE

TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor" by Xiaohui Zhao (Paper Link).

CUTIE can be regarded as a 2-D key information extraction, 2-D named entity recognition (NER), or 2-D slot filling algorithm for receipt-like documents. Before training or running inference with CUTIE, run any OCR algorithm to detect and recognize the text in your scanned document images, then feed the resulting structured text into the CUTIE network. Refer to the CUTIE paper for details of the procedure.

Results

Results were evaluated on 4,484 receipt documents (taxi receipts, meals and entertainment receipts, and hotel receipts) with 9 key information classes. Scores are reported as AP / softAP.
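The difference between the two scores can be sketched as follows. This is a minimal illustration, not the repository's evaluation code. It assumes that strict AP counts a class as correct only when the predicted tokens match the ground truth exactly, and that softAP also accepts predictions that contain every ground-truth token plus some false positives. The function names and the per-document dictionary layout are hypothetical.

```python
def class_correct(pred, truth, soft=False):
    """Judge one key information class in one document.

    pred and truth are sets of token indices assigned to the class.
    """
    if soft:
        # Assumed soft rule: every positive is found, extra tokens are tolerated.
        return truth <= pred
    # Assumed strict rule: the prediction must match the ground truth exactly.
    return pred == truth


def average_precision(docs, soft=False):
    """docs is a list of (pred_by_class, truth_by_class) dictionary pairs."""
    hits = 0
    total = 0
    for pred_by_class, truth_by_class in docs:
        for cls, truth in truth_by_class.items():
            pred = pred_by_class.get(cls, set())
            hits += int(class_correct(pred, truth, soft))
            total += 1
    return hits / total if total else 0.0


# One document: "date" is exact, "total" has one extra (false positive) token.
docs = [({"total": {1, 2}, "date": {5}}, {"total": {1}, "date": {5}})]
strict_score = average_precision(docs)
soft_score = average_precision(docs, soft=True)
```

Under these assumptions the soft score can never be lower than the strict score, which matches the ordering in the table below.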

Method	#Params	Taxi	Hotel
CloudScan	-	82.0 / -	60.0 / -
BERT	110M	88.1 / -	71.7 / -
CUTIE	14M	94.0 / 97.3	74.6 / 87.0

Installation & Usage

pip install -r requirements.txt

Generate your own dictionary with main_build_dict.py / main_data_tokenizer.py
Train your model with main_train_json.py
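As a rough idea of what a dictionary means here, the sketch below builds a word-to-id vocabulary from OCR words. It does not reproduce main_build_dict.py or main_data_tokenizer.py. The function names, the [PAD] and [UNK] entries, and the lowercasing are assumptions made for illustration.

```python
from collections import Counter


def build_dictionary(documents, min_count=1):
    """Map each word seen in the OCR output to an integer id.

    documents is a list of word lists, one list per scanned document.
    """
    counts = Counter(word.lower() for doc in documents for word in doc)
    # Hypothetical reserved ids for padding and out-of-vocabulary words.
    vocab = {"[PAD]": 0, "[UNK]": 1}
    # Most frequent words first, ties broken alphabetically for stable ids.
    for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[word] = len(vocab)
    return vocab


def encode(words, vocab):
    """Replace each word with its id, falling back to the [UNK] id."""
    return [vocab.get(w.lower(), vocab["[UNK]"]) for w in words]


docs = [["Total", "12.50"], ["total", "Taxi"]]
vocab = build_dictionary(docs)
ids = encode(["TOTAL", "tip"], vocab)
```

Raising min_count drops rare words such as one-off amounts into the unknown bucket, which keeps the dictionary small.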

CUTIE achieves its best performance when rows/cols are configured well for your documents. For more insights, refer to the statistics in others/TrainingStatistic.xlsx.
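To see why rows/cols matter, the sketch below places OCR words into a rows x cols grid by the center of each bounding box and counts how many words land in an already occupied cell. It is an illustration under assumptions, not the repository's mapping code: the word dictionaries with "text" and "bbox" keys, the function name, and the drop-on-collision policy are all hypothetical.

```python
def map_to_grid(words, img_w, img_h, rows, cols):
    """Place OCR words into a rows x cols grid by bounding-box center.

    Each word is a dict with "text" and "bbox" = (x0, y0, x1, y1) in pixels.
    Returns the grid and the number of words lost to cell collisions.
    """
    grid = [[None] * cols for _ in range(rows)]
    collisions = 0
    for word in words:
        x0, y0, x1, y1 = word["bbox"]
        cx = (x0 + x1) / 2
        cy = (y0 + y1) / 2
        # Scale the center to grid coordinates and clamp to the last cell.
        r = min(int(cy / img_h * rows), rows - 1)
        c = min(int(cx / img_w * cols), cols - 1)
        if grid[r][c] is not None:
            # Assumed policy: keep the first word and count the clash.
            collisions += 1
            continue
        grid[r][c] = word["text"]
    return grid, collisions


words = [
    {"text": "Total", "bbox": (10, 10, 50, 20)},
    {"text": "12.50", "bbox": (60, 10, 90, 20)},
]
wide_grid, wide_collisions = map_to_grid(words, 100, 100, rows=4, cols=2)
narrow_grid, narrow_collisions = map_to_grid(words, 100, 100, rows=4, cols=1)
```

With two columns both words keep their own cell, while a single column forces them into the same cell. A grid that is too coarse loses text, and one that is too fine spreads the words thinly over mostly empty cells.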

Others

For an example of the input, refer to the issue discussion.

Apply any OCR tool that helps you detect and recognize words in the scanned document image.
Label the OCR results with key information classes, in the same format as the .json file in the invoice_data folder. (thanks to @4kssoft)

Owner

Zhao,Xiaohui
