MEND: Model Editing Networks using Gradient Decomposition

Setup

Environment

This codebase uses Python 3.7.9. Other versions may work as well.

Create a virtualenv (pyenv can help with this) and install the dependencies:

$ python -m venv env
$ source env/bin/activate
(env) $ pip install -r requirements.txt

Data

You can download the data needed for this project from this Google Drive link. Unzip each sub-directory into mend/data and you should be good to go.
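If you would rather script the unzip step, here is a minimal sketch. It assumes the Drive download leaves you with one .zip archive per sub-directory in a local folder. The helper name unzip_all and the folder name used in the usage note are hypothetical and not part of this codebase.

```python
import zipfile
from pathlib import Path


def unzip_all(download_dir, data_dir="mend/data"):
    """Extract every .zip archive found in download_dir into data_dir.

    Assumption: each downloaded sub-directory is a separate .zip file.
    Returns the names of the archives that were extracted.
    """
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(archive.name)
    return extracted
```

For example, unzip_all("downloads") would extract everything in a (hypothetical) downloads folder into mend/data.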

Running the code

Run MEND training/evaluation for distilGPT-2 on the wikitext editing problem with:

(env) $ python -m run +alg=mend +experiment=gen +model=distilgpt2

Other valid algs include efk (KnowledgeEditor) and enn (Editable Neural Networks). Valid experiments include fc (FEVER fact checking) and qa (zsRE question-answering). Splits and rephrases for both come from De Cao et al. Check config/model for the available editable models. Note that not every model works with every experiment: GPT-style models only work with gen, seq2seq models only work with qa, and BERT only works with fc.
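The compatibility note above can be turned into a quick sanity check before launching a run. This is only a sketch: the family labels ("gpt", "seq2seq", "bert") and the function check_combo are hypothetical names chosen for illustration, and the alg list covers only the three algs named in this README.

```python
# Model family -> the one experiment it works with, per the note above.
# The family labels are illustrative, not identifiers from the codebase.
SUPPORTED_EXPERIMENT = {"gpt": "gen", "seq2seq": "qa", "bert": "fc"}

# Only the algs mentioned in this README; the codebase may define others.
KNOWN_ALGS = ("mend", "efk", "enn")


def check_combo(alg, experiment, family):
    """Return True if the combination matches the README's note, else raise ValueError."""
    if alg not in KNOWN_ALGS:
        raise ValueError(f"unknown alg: {alg!r}")
    expected = SUPPORTED_EXPERIMENT.get(family)
    if expected is None:
        raise ValueError(f"unknown model family: {family!r}")
    if experiment != expected:
        raise ValueError(
            f"{family} models only work with {expected!r}, got {experiment!r}"
        )
    return True
```

check_combo("mend", "gen", "gpt") passes, while check_combo("mend", "qa", "gpt") raises a ValueError.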

Also note that in the paper, we sample locality data from different datasets depending on the model. By default, training uses Natural Questions data (not zsRE data) to compute drawdown in the qa experiment, and OpenWebText data in the gen experiment. For models such as the distilgpt2 model we use (which was fine-tuned on wikitext) or the BART-base model, disable this behavior with data.wiki_webtext=False or data.zsre_nq=False, respectively.
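As an illustration of how such an override is appended to the run command, here is a small sketch that assembles the command line for distilgpt2 with data.wiki_webtext=False. The helper with_overrides is a hypothetical convenience, not part of this codebase; the command and the override key are the ones given above.

```python
# Base command from "Running the code".
BASE_COMMAND = ["python", "-m", "run", "+alg=mend", "+experiment=gen", "+model=distilgpt2"]


def with_overrides(command, overrides):
    """Return a new command list with key=value overrides appended."""
    return list(command) + [f"{key}={value}" for key, value in overrides.items()]


# distilgpt2 was fine-tuned on wikitext, so turn off OpenWebText locality data.
command = with_overrides(BASE_COMMAND, {"data.wiki_webtext": False})
print(" ".join(command))
```

The resulting list can be passed to subprocess.run from inside the activated virtualenv, or the printed line can be pasted into the shell.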

Citing the paper

If this code or paper was useful, please consider using the following citation:

@article{mitchell2021fast,
    title={Fast Model Editing at Scale},
    author={Mitchell, Eric and Lin, Charles and Bosselut, Antoine and Finn, Chelsea and Manning, Chris},
    year={2021}
}


Owner

Eric Mitchell
