PERIN is Permutation-Invariant Semantic Parser developed for MRP 2020

Overview

PERIN: Permutation-invariant Semantic Parsing

David Samuel & Milan Straka

Charles University
Faculty of Mathematics and Physics
Institute of Formal and Applied Linguistics


Paper
Pretrained models
Interactive demo on Google Colab

Overall architecture



PERIN is a universal sentence-to-graph neural network architecture modeling semantic representation from input sequences.

The main characteristics of our approach are:

  • Permutation-invariant model: PERIN is, to our best knowledge, the first graph-based semantic parser that predicts all nodes at once in parallel and trains them with a permutation-invariant loss function.
  • Relative encoding: We present a substantial improvement of relative encoding of node labels, which allows the use of a richer set of encoding rules.
  • Universal architecture: Our work presents a general sentence-to-graph pipeline adaptable for specific frameworks only by adjusting pre-processing and post-processing steps.

Our model was ranked among the two winning systems in both the cross-framework and the cross-lingual tracks of MRP 2020 and significantly advanced the accuracy of semantic parsing from the last year's MRP 2019.



This repository provides the official PyTorch implementation of our paper "ÚFAL at MRP 2020: Permutation-invariant Semantic Parsing in PERIN" together with pretrained base models for all five frameworks from MRP 2020: AMR, DRG, EDS, PTG and UCCA.



How to run

🐾   Clone repository and install the Python requirements

git clone https://github.com/ufal/perin.git
cd perin

pip3 install -r requirements.txt 
pip3 install git+https://github.com/cfmrp/mtool.git#egg=mtool

🐾   Download and pre-process the dataset

Download the treebanks into ${data_dir} and split the cross-lingual datasets into training and validation parts by running:

./scripts/split_dataset.sh "path_to_a_dataset.mrp"

Preprocess and cache the dataset (computing the relative encodings can take up to several hours):

python3 preprocess.py --config config/base_amr.yaml --data_directory ${data_dir}

You should also download CzEngVallex if you are going to parse PTG:

curl -O https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-1512/czengvallex.zip
unzip czengvallex.zip
rm frames_pairs.xml czengvallex.zip

🐾   Train

To train a shared model for the English and Chinese AMR, run the following script. Other configurations are located in the config folder.

python3 train.py --config config/base_amr.yaml --data_directory ${data_dir} --save_checkpoints --log_wandb

Note that the companion file in needed only to provide the lemmatized forms, so it's also possible to train without it (but that will most likely negatively influence the accuracy of label prediction) -- just set the companion paths to None.

🐾   Inference

You can run the inference on the validation and test datasets by running:

python3 inference.py --checkpoint "path_to_pretrained_model.h5" --data_directory ${data_dir}

Citation

@inproceedings{Sam:Str:20,
  author = {Samuel, David and Straka, Milan},
  title = {{{\'U}FAL} at {MRP}~2020:
           {P}ermutation-Invariant Semantic Parsing in {PERIN}},
  booktitle = CONLL:20:U,
  address = L:CONLL:20,
  pages = {\pages{--}{53}{64}},
  year = 2020
}
Owner
ÚFAL
Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
ÚFAL
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language (NeurIPS 2021)

VRDP (NeurIPS 2021) Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language Mingyu Ding, Zhenfang Chen, Tao Du, Pin

Mingyu Ding 36 Sep 20, 2022
This repository contains code and data for "On the Multimodal Person Verification Using Audio-Visual-Thermal Data"

trimodal_person_verification This repository contains the code, and preprocessed dataset featured in "A Study of Multimodal Person Verification Using

ISSAI 7 Aug 31, 2022
Benchmark library for high-dimensional HPO of black-box models based on Weighted Lasso regression

LassoBench LassoBench is a library for high-dimensional hyperparameter optimization benchmarks based on Weighted Lasso regression. Note: LassoBench is

Kenan Šehić 5 Mar 15, 2022
Tensor-Based Quantum Machine Learning

TensorLy_Quantum TensorLy-Quantum is a Python library for Tensor-Based Quantum Machine Learning that builds on top of TensorLy and PyTorch. Website: h

TensorLy 85 Dec 03, 2022
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.

DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI

Yizhi Wang 17 Dec 22, 2022
Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised manner.

WSDEC This is the official repo for our NeurIPS paper Weakly Supervised Dense Event Captioning in Videos. Description Repo directories ./: global conf

Melon(Xuguang Duan) 96 Nov 01, 2022
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network.

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

111 Dec 27, 2022
Acoustic mosquito detection code with Bayesian Neural Networks

HumBugDB Acoustic mosquito detection with Bayesian Neural Networks. Extract audio or features from our large-scale dataset on Zenodo. This repository

31 Nov 28, 2022
HugsVision is a easy to use huggingface wrapper for state-of-the-art computer vision

HugsVision is an open-source and easy to use all-in-one huggingface wrapper for computer vision. The goal is to create a fast, flexible and user-frien

Labrak Yanis 166 Nov 27, 2022
Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers.

Less is More: Pay Less Attention in Vision Transformers Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers. By

73 Jan 01, 2023
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Dec 30, 2022
RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition (PyTorch) Paper: https://arxiv.org/abs/2105.01883 Citation: @

260 Jan 03, 2023
Code for ICLR2018 paper: Improving GAN Training via Binarized Representation Entropy (BRE) Regularization - Y. Cao · W Ding · Y.C. Lui · R. Huang

code for "Improving GAN Training via Binarized Representation Entropy (BRE) Regularization" (ICLR2018 paper) paper: https://arxiv.org/abs/1805.03644 G

21 Oct 12, 2020
[WACV 2022] Contextual Gradient Scaling for Few-Shot Learning

CxGrad - Official PyTorch Implementation Contextual Gradient Scaling for Few-Shot Learning Sanghyuk Lee, Seunghyun Lee, and Byung Cheol Song In WACV 2

Sanghyuk Lee 4 Dec 05, 2022
An easier way to build neural search on the cloud

An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g

Jina AI 17k Jan 02, 2023
Rule Based Classification Project

Kural Tabanlı Sınıflandırma ile Potansiyel Müşteri Getirisi Hesaplama İş Problemi: Bir oyun şirketi müşterilerinin bazı özelliklerini kullanaraknseviy

Şafak 1 Jan 12, 2022
Exadel CompreFace is a free and open-source face recognition GitHub project

Exadel CompreFace is a leading free and open-source face recognition system Exadel CompreFace is a free and open-source face recognition service that

Exadel 2.6k Jan 04, 2023
Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

111 Dec 29, 2022
Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021)

MSAD Multi-Scale Aligned Distillation for Low-Resolution Detection Lu Qi*, Jason Kuen*, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya J

Jia Research Lab 115 Dec 23, 2022
[NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

Efficient Graph Similarity Computation - (EGSC) This repo contains the source code and dataset for our paper: Slow Learning and Fast Inference: Effici

24 Dec 31, 2022