Revisiting Self-Training for Few-Shot Learning of Language Model.

Last update: Nov 19, 2022

SFLM

This repository contains the implementation of the paper Revisiting Self-Training for Few-Shot Learning of Language Model. SFLM stands for self-training for few-shot learning of language models.

Requirements

To run the code, install the dependencies with the following command:

pip install -r requirements.txt

Preprocess

The original data can be obtained from LM-BFF. To generate the data for the few-shot experiments, run the following command:

python tools/generate_data.py

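The original data should be placed in ./data/original; the sampled data is written to ./data/few-shot/$K-$MU-$SEED. The training example in the Train section reads from data/few-shot/SST-2/16-4-100, which suggests that the output is also grouped into a per-task subdirectory. See ./tools/generate_data.py for more options.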
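
As a rough illustration of what few-shot sampling involves, the sketch below draws K examples per label with a fixed seed so that the same seed always yields the same subset. It is not the logic of tools/generate_data.py: the function name sample_k_per_class and the (text, label) tuple layout are assumptions made for this example only.

```python
import random
from collections import defaultdict


def sample_k_per_class(examples, k, seed):
    """Return k examples per label, chosen reproducibly for a given seed.

    `examples` is a list of (text, label) pairs (a hypothetical layout).
    """
    # Group the examples by their label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    sampled = []
    for label in sorted(by_label):  # sorted for a deterministic label order
        pool = by_label[label]
        if len(pool) < k:
            raise ValueError(f"label {label!r} has only {len(pool)} examples, need {k}")
        sampled.extend(rng.sample(pool, k))
    return sampled


# Toy data: 40 sentences alternating between labels "0" and "1".
toy = [(f"sentence {i}", str(i % 2)) for i in range(40)]
subset = sample_k_per_class(toy, k=16, seed=100)
print(len(subset))
```

With K = 16 and two labels, the subset holds 32 examples, and rerunning with seed 100 returns the same 32.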

Train

The code can be run as in the following example:

python3 run.py \
  --task_name SST-2 \
  --data_dir data/few-shot/SST-2/16-4-100 \
  --do_train \
  --do_eval \
  --do_predict \
  --evaluate_during_training \
  --model_name_or_path roberta-base \
  --few_shot_type prompt-demo \
  --num_k 16 \
  --max_seq_length 256 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 16 \
  --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --max_steps 1000 \
  --logging_steps 100 \
  --eval_steps 100 \
  --num_train_epochs 0 \
  --output_dir result/SST-2-16-4-100 \
  --save_logit_dir result/SST-2-16-4-100 \
  --seed 100 \
  --template "*cls**sent_0*_It_was*mask*.*sep+*" \
  --mapping "{'0':'terrible','1':'great'}" \
  --num_sample 16 \
  --threshold 0.95 \
  --lam1 0.5 \
  --lam2 0.1
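
Two of the arguments above can be read off with a little parsing and arithmetic. The sketch below turns the --mapping string into a label-to-word dictionary and computes the effective per-device batch size implied by the batch size and gradient accumulation settings. Using ast.literal_eval is an assumption made for illustration; run.py may parse the argument differently.

```python
import ast

# Values copied from the example command above.
mapping_arg = "{'0':'terrible','1':'great'}"
per_device_train_batch_size = 2
gradient_accumulation_steps = 4

# Parse the mapping string into a dict of label -> label word.
# literal_eval only accepts Python literals, so it cannot run arbitrary code.
label_to_word = ast.literal_eval(mapping_arg)

# Gradients are accumulated over several small batches before each update,
# so one optimizer step on one device covers this many examples.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

print(label_to_word)         # {'0': 'terrible', '1': 'great'}
print(effective_batch_size)  # 8
```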

Most arguments are the same as in LM-BFF, and our experiments use the same manual prompts. The additional arguments used in SFLM are:

threshold: The confidence threshold used to filter out low-confidence samples for the self-training loss
lam1: The weight of the self-training loss
lam2: The weight of the self-supervised loss
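
To make the roles of these three arguments concrete, here is a minimal sketch of how a confidence threshold and two loss weights could combine into a single training objective. It is an illustration in plain Python, not the code in this repository: the function names, the use of the maximum predicted probability as the confidence score, and the additive form of the total loss are all assumptions.

```python
def confidence_mask(probabilities, threshold):
    """Return 1.0 for samples whose top class probability reaches the threshold, else 0.0."""
    return [1.0 if max(p) >= threshold else 0.0 for p in probabilities]


def masked_mean(losses, mask):
    """Average per-sample losses over all samples, counting masked-out samples as zero."""
    if not losses:
        return 0.0
    return sum(loss * m for loss, m in zip(losses, mask)) / len(losses)


def total_loss(supervised, self_training, self_supervised, lam1, lam2):
    """Weighted sum: supervised + lam1 * self-training + lam2 * self-supervised."""
    return supervised + lam1 * self_training + lam2 * self_supervised


# Hypothetical predictions on three unlabeled samples (two classes each)
# and hypothetical per-sample self-training losses.
probs = [[0.98, 0.02], [0.60, 0.40], [0.03, 0.97]]
per_sample_st_loss = [0.2, 0.9, 0.4]

# With threshold 0.95, the second sample (confidence 0.60) is filtered out.
mask = confidence_mask(probs, threshold=0.95)
st_loss = masked_mean(per_sample_st_loss, mask)

# Placeholder values stand in for the supervised and self-supervised losses.
loss = total_loss(supervised=1.0, self_training=st_loss,
                  self_supervised=2.0, lam1=0.5, lam2=0.1)
print(mask, loss)
```

In this sketch, raising threshold lets fewer pseudo-labeled samples contribute, while lam1 and lam2 scale how strongly the two auxiliary terms pull on the update relative to the supervised term.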

Citation

Please cite our paper if you use SFLM in your work:

@inproceedings{chen2021revisit,        
    title={Revisiting Self-Training for Few-Shot Learning of Language Model},         
    author={Chen, Yiming and Zhang, Yan and Zhang, Chen and Lee, Grandee and Cheng, Ran and Li, Haizhou},         
    booktitle={EMNLP},        
    year={2021},
}

Acknowledgements

The code is based on LM-BFF. We thank the authors of LM-BFF for making their code public.
