SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).

Last update: Dec 21, 2022

Overview

SPRING

This is the repo for SPRING (Symmetric ParsIng aNd Generation), a novel approach to semantic parsing and generation, presented at AAAI 2021.

With SPRING you can perform both state-of-the-art Text-to-AMR parsing and AMR-to-Text generation without many cumbersome external components. If you use the code, please reference this work in your paper:

@inproceedings{bevilacqua-etal-2021-one,
    title = {One {SPRING} to Rule Them Both: {S}ymmetric {AMR} Semantic Parsing and Generation without a Complex Pipeline},
    author = {Bevilacqua, Michele and Blloshmi, Rexhina and Navigli, Roberto},
    booktitle = {Proceedings of AAAI},
    year = {2021}
}

Pretrained Checkpoints

Here we release our best SPRING models which are based on the DFS linearization.

Text-to-AMR Parsing

Model trained in the AMR 2.0 training set: AMR2.parsing-1.0.tar.bz2
Model trained in the AMR 3.0 training set: AMR3.parsing-1.0.tar.bz2

AMR-to-Text Generation

Model trained in the AMR 2.0 training set: AMR2.generation-1.0.tar.bz2
Model trained in the AMR 3.0 training set: AMR3.generation-1.0.tar.bz2

If you need the checkpoints of other experiments in the paper, please send us an email.

Installation

cd spring
pip install -r requirements.txt
pip install -e .

The code only works with transformers < 3.0 because of a disrupting change in positional embeddings. The code works fine with torch 1.5. We recommend the usage of a new conda env.

Train

Modify config.yaml in configs. Instructions in comments within the file. Also see the appendix.

Text-to-AMR

python bin/train.py --config configs/config.yaml --direction amr

Results in runs/

AMR-to-Text

python bin/train.py --config configs/config.yaml --direction text

Results in runs/

Evaluate

Text-to-AMR

python bin/predict_amrs.py \
    --datasets <AMR-ROOT>/data/amrs/split/test/*.txt \
    --gold-path data/tmp/amr2.0/gold.amr.txt \
    --pred-path data/tmp/amr2.0/pred.amr.txt \
    --checkpoint runs/<checkpoint>.pt \
    --beam-size 5 \
    --batch-size 500 \
    --device cuda \
    --penman-linearization --use-pointer-tokens

gold.amr.txt and pred.amr.txt will contain, respectively, the concatenated gold and the predictions.

To reproduce our paper's results, you will also need need to run the BLINK entity linking system on the prediction file (data/tmp/amr2.0/pred.amr.txt in the previous code snippet). To do so, you will need to install BLINK, and download their models:

git clone https://github.com/facebookresearch/BLINK.git
cd BLINK
pip install -r requirements.txt
sh download_blink_models.sh
cd models
wget http://dl.fbaipublicfiles.com/BLINK//faiss_flat_index.pkl
cd ../..

Then, you will be able to launch the blinkify.py script:

python bin/blinkify.py \
    --datasets data/tmp/amr2.0/pred.amr.txt \
    --out data/tmp/amr2.0/pred.amr.blinkified.txt \
    --device cuda \
    --blink-models-dir BLINK/models

To have comparable Smatch scores you will also need to use the scripts available at https://github.com/mdtux89/amr-evaluation, which provide results that are around ~0.3 Smatch points lower than those returned by bin/predict_amrs.py.

AMR-to-Text

python bin/predict_sentences.py \
    --datasets <AMR-ROOT>/data/amrs/split/test/*.txt \
    --gold-path data/tmp/amr2.0/gold.text.txt \
    --pred-path data/tmp/amr2.0/pred.text.txt \
    --checkpoint runs/<checkpoint>.pt \
    --beam-size 5 \
    --batch-size 500 \
    --device cuda \
    --penman-linearization --use-pointer-tokens

gold.text.txt and pred.text.txt will contain, respectively, the concatenated gold and the predictions. For BLEU, chrF++, and Meteor in order to be comparable you will need to tokenize both gold and predictions using JAMR tokenizer. To compute BLEU and chrF++, please use bin/eval_bleu.py. For METEOR, use https://www.cs.cmu.edu/~alavie/METEOR/ . For BLEURT don't use tokenization and run the eval with https://github.com/google-research/bleurt. Also see the appendix.

Linearizations

The previously shown commands assume the use of the DFS-based linearization. To use BFS or PENMAN decomment the relevant lines in configs/config.yaml (for training). As for the evaluation scripts, substitute the --penman-linearization --use-pointer-tokens line with --use-pointer-tokens for BFS or with --penman-linearization for PENMAN.

License

This project is released under the CC-BY-NC-SA 4.0 license (see LICENSE). If you use SPRING, please put a link to this repo.

Acknowledgements

The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 and the ELEXIS project No. 731015 under the European Union’s Horizon 2020 research and innovation programme.

This work was supported in part by the MIUR under the grant "Dipartimenti di eccellenza 2018-2022" of the Department of Computer Science of the Sapienza University of Rome.

SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).

Related tags

Overview

SPRING

Pretrained Checkpoints

Text-to-AMR Parsing

AMR-to-Text Generation

Installation

Train

Text-to-AMR

AMR-to-Text

Evaluate

Text-to-AMR

AMR-to-Text

Linearizations

License

Acknowledgements

Owner

Sapienza NLP group

Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

Jax/Flax implementation of Variational-DiffWave.

Semantic similarity computation with different state-of-the-art metrics

Styled Handwritten Text Generation with Transformers (ICCV 21)

Improving Machine Translation Systems via Isotopic Replacement

Differentiable Surface Triangulation

Implementation of several Bayesian multi-target tracking algorithms, including Poisson multi-Bernoulli mixture filters for sets of targets and sets of trajectories. The repository also includes the GOSPA metric and a metric for sets of trajectories to evaluate performance.

Awesome AI Learning with +100 AI Cheat-Sheets, Free online Books, Top Courses, Best Videos and Lectures, Papers, Tutorials, +99 Researchers, Premium Websites, +121 Datasets, Conferences, Frameworks, Tools

A selection of State Of The Art research papers (and code) on human locomotion (pose + trajectory) prediction (forecasting)

PyTorch implementation for 3D human pose estimation

PyTorch implementation of Tacotron speech synthesis model.

Official code for UnICORNN (ICML 2021)

Bottom-up Human Pose Estimation

Official Pytorch implementation for video neural representation (NeRV)

[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations

Source code for paper "ATP: AMRize Than Parse! Enhancing AMR Parsing with PseudoAMRs" @NAACL-2022

[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

SimBERT升级版（SimBERTv2）！

Code for "Diffusion is All You Need for Learning on Surfaces"

Analysis of Antarctica sequencing samples contaminated with SARS-CoV-2