Improving Non-autoregressive Generation with Mixup Training

Related tags

Deep LearningMIST
Overview

MIST

Training MIST

TRAIN_FILE=/your/path/to/train.json
VALID_FILE=/your/path/to/valid.json
OUTPUT_DIR=/your/path/to/save_checkpoints
CACHE_DIR=/your/path/to/transformer_package_cache

MODEL_PATH=bert-base-uncased or models/unilm1.2-base-uncased

# squadqg 30005 steps
# squadqg 50005 steps
# xsum 600005 steps
STEPS=30005

python -m torch.distributed.launch --nproc_per_node=4 train.py\
  --train_file $TRAIN_FILE\
  --valid_file $VALID_FILE\
  --output_dir $OUTPUT_PATH\
  --model_type nat --model_name_or_path $MODEL_PATH\
  --do_lower_case --max_source_seq_length 464 --max_target_seq_length 48\
  --per_gpu_train_batch_size 16 --gradient_accumulation_steps 1\
  --learning_rate 3e-5 --num_warmup_steps 500 --num_training_steps $STEPS\
  --cache_dir $CACHE_DIR\
  --log_dir ${OUTPUT_PATH}/log\
  --keep_prob 0.0\
  --random_prob 0.0\
  --use_glat\
  --tqdm_miniters 100\
  --cotrain_put_target_in_source\ 
  --cotrain_put_target_in_source_same_bert\ 
  --wandb\ # logging with wandb
  --fp16\
  --fp16_opt_level O2

Removing the cotrain_put_target_in_source and cotrain_put_target_in_source_same_bert flags to reproduce the results without MIST.

Download Unilm

mkdir -p models/unilm1.2-base-uncased
cd models/unilm1.2-base-uncased
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased.bin -O pytorch_model.bin
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-vocab.txt -O vocab.txt
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-config.json -O config.json

Download datasets

Json dataset links: squadqg, xsum and quora

Training NAT MASS

To reproduce the results of NAT MASS, please refer to the ./MASS-NAT/mass-nat.sh

Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset

Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset

Clay Mullis 82 Oct 13, 2022
[Preprint] "Bag of Tricks for Training Deeper Graph Neural Networks A Comprehensive Benchmark Study" by Tianlong Chen*, Kaixiong Zhou*, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study Codes for [Preprint] Bag of Tricks for Training Deeper Graph

VITA 101 Dec 29, 2022
Generative Adversarial Networks(GANs)

Generative Adversarial Networks(GANs) Vanilla GAN ClusterGAN Vanilla GAN Model Structure Final Generator Structure A MLP with 2 hidden layers of hidde

Zhenbang Feng 2 Nov 05, 2021
Off-policy continuous control in PyTorch, with RDPG, RTD3 & RSAC

arXiv technical report soon available. we are updating the readme to be as comprehensive as possible Please ask any questions in Issues, thanks. Intro

Zhihan 31 Dec 30, 2022
Learning to Prompt for Continual Learning

Learning to Prompt for Continual Learning (L2P) Official Jax Implementation L2P is a novel continual learning technique which learns to dynamically pr

Google Research 207 Jan 06, 2023
Losslandscapetaxonomy - Taxonomizing local versus global structure in neural network loss landscapes

Taxonomizing local versus global structure in neural network loss landscapes Int

Yaoqing Yang 8 Dec 30, 2022
SlideGraph+: Whole Slide Image Level Graphs to Predict HER2 Status in Breast Cancer

SlideGraph+: Whole Slide Image Level Graphs to Predict HER2 Status in Breast Cancer A novel graph neural network (GNN) based model (termed SlideGraph+

28 Dec 24, 2022
CVPR '21: In the light of feature distributions: Moment matching for Neural Style Transfer

In the light of feature distributions: Moment matching for Neural Style Transfer (CVPR 2021) This repository provides code to recreate results present

Nikolai Kalischek 49 Oct 13, 2022
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

Temporal Segment Networks (TSN) We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation fo

1.4k Jan 01, 2023
This is an implementation for the CVPR2020 paper "Learning Invariant Representation for Unsupervised Image Restoration"

Learning Invariant Representation for Unsupervised Image Restoration (CVPR 2020) Introduction This is an implementation for the paper "Learning Invari

GarField 88 Nov 07, 2022
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

TopClus The source code used for Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations, published in WWW 2022. Requ

Yu Meng 63 Dec 18, 2022
[AI6122] Text Data Management & Processing

[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instruc

HT. Li 1 Jan 17, 2022
A model that attempts to learn and benefit from data collected on card counting.

A model that attempts to learn and benefit from data collected on card counting. A decision tree like model is built to win more often than loose and increase the bet of the player appropriately to c

1 Dec 17, 2021
Project to create an open-source 6 DoF input device

6DInputs A Project to create open-source 3D printed 6 DoF input devices Note the plural ('6DInputs' and 'devices') in the headings. We would like seve

RepRap Ltd 47 Jul 28, 2022
Code repository for the paper "Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation" with instructions to reproduce the results.

Doubly Trained Neural Machine Translation System for Adversarial Attack and Data Augmentation Languages Experimented: Data Overview: Source Target Tra

Steven Tan 1 Aug 18, 2022
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
A small fun project using python OpenCV, mediapipe, and pydirectinput

Here I tried a small fun project using python OpenCV, mediapipe, and pydirectinput. Here we can control moves car game when yellow color come to right box (press key 'd') left box (press key 'a') lef

Sameh Elisha 3 Nov 17, 2022
Not Suitable for Work (NSFW) classification using deep neural network Caffe models.

Open nsfw model This repo contains code for running Not Suitable for Work (NSFW) classification deep neural network Caffe models. Please refer our blo

Yahoo 5.6k Jan 05, 2023
LaBERT - A length-controllable and non-autoregressive image captioning model.

Length-Controllable Image Captioning (ECCV2020) This repo provides the implemetation of the paper Length-Controllable Image Captioning. Install conda

bearcatt 53 Nov 13, 2022
NeurIPS 2021, "Fine Samples for Learning with Noisy Labels"

[Official] FINE Samples for Learning with Noisy Labels This repository is the official implementation of "FINE Samples for Learning with Noisy Labels"

mythbuster 27 Dec 23, 2022