Extracting knowledge graphs from language models as a diagnostic benchmark of model performance.

Last update: Oct 25, 2022

Overview

Interpreting Language Models Through Knowledge Graph Extraction

Idea: How can we interpret what a language model learns at different stages of training? Language models have recently been described as open knowledge bases. By extracting relation triples from masked language models at successive epochs, or from different architecture variants, we can generate knowledge graphs that trace the knowledge acquisition process.
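
As a rough illustration of the idea, the sketch below turns cloze-style statements into (subject, relation, object) triples by asking a fill-mask model for the missing word. It is not the repository's pipeline: the `extract_triples` helper, the cloze templates, and the fixed-answer `toy_unmasker` are assumptions made for illustration. With Hugging Face transformers, a `pipeline("fill-mask", model=...)` object could be passed in place of the toy unmasker, since for a single-mask input it returns a list of dicts with `token_str` and `score` keys. Note that BERT-style models use `[MASK]` as the mask token while RoBERTa uses `<mask>`.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
# An unmasker takes a sentence with one mask token and returns candidate
# fillers as dicts with "token_str" and "score" keys.
Unmasker = Callable[[str], List[Dict[str, object]]]


def extract_triples(
    unmasker: Unmasker,
    facts: List[Tuple[str, str, str]],
    mask_token: str = "[MASK]",
) -> List[Triple]:
    """Build (subject, relation, object) triples from cloze templates.

    Each fact is (subject, relation, template); the template contains
    "{subj}" and "{mask}" placeholders.
    """
    triples = []
    for subj, relation, template in facts:
        sentence = template.format(subj=subj, mask=mask_token)
        predictions = unmasker(sentence)
        if not predictions:
            continue  # no candidate for this statement, so no triple
        best = max(predictions, key=lambda p: p["score"])
        triples.append((subj, relation, str(best["token_str"]).strip()))
    return triples


def toy_unmasker(sentence: str) -> List[Dict[str, object]]:
    """Hypothetical stand-in for a real fill-mask model (fixed answers)."""
    if "Dante" in sentence:
        return [
            {"token_str": "Rome", "score": 0.1},
            {"token_str": "Florence", "score": 0.9},
        ]
    return []


facts = [
    ("Dante", "place_of_birth", "{subj} was born in {mask}."),
    ("Unknown Person", "place_of_birth", "{subj} was born in {mask}."),
]
triples = extract_triples(toy_unmasker, facts)
print(triples)
```

Passing the unmasker in as a callable keeps the sketch independent of any particular model, so the same function could be applied to each checkpoint or architecture being compared.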

Datasets: SQuAD, Google-RE (3 flavors: place of birth, date of birth, place of death)

Models: BERT, RoBERTa, DistilBERT, and RoBERTa trained from scratch

Authors: Vinitra Swamy, Angelika Romanou, Martin Jaggi

This repository is the official implementation of the NeurIPS 2021 XAI4Debugging paper titled "Interpreting Language Models Through Knowledge Graph Extraction". Found this work useful? Please cite our paper.

Quick Start Guide

Pretrained Model (BERT, DistilBERT, RoBERTa) -> Knowledge Graph

Install requirements and clone repository

git clone https://github.com/epfml/interpret-lm-knowledge.git
pip install git+https://github.com/huggingface/transformers   
pip install textacy
cd interpret-lm-knowledge/scripts

Generate knowledge graphs and dataframes:

python run_knowledge_graph_experiments.py <dataset> <model> <use_spacy>

e.g. python run_knowledge_graph_experiments.py squad Bert spacy
e.g. python run_knowledge_graph_experiments.py re-place-birth Roberta

options:

dataset=squad - "squad", "re-place-birth", "re-date-birth", "re-place-death"  
model=Roberta - "Bert", "Roberta", "DistilBert"  
extractor=spacy (passed as the <use_spacy> argument) - "spacy", "textacy", "custom"
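
To run several configurations in one go, the option values above can be combined programmatically. The following is a minimal sketch, not part of the repository: the `build_commands` helper is hypothetical, and it assumes the script accepts every listed extractor name as its third positional argument. It only prints the commands (a dry run).

```python
import itertools
import shlex

# Option values copied from the list above.
DATASETS = ["squad", "re-place-birth", "re-date-birth", "re-place-death"]
MODELS = ["Bert", "Roberta", "DistilBert"]
EXTRACTORS = ["spacy", "textacy", "custom"]
SCRIPT = "run_knowledge_graph_experiments.py"


def build_commands(datasets=DATASETS, models=MODELS, extractors=EXTRACTORS):
    """Return one argument list per (dataset, model, extractor) combination."""
    return [
        ["python", SCRIPT, dataset, model, extractor]
        for dataset, model, extractor in itertools.product(
            datasets, models, extractors
        )
    ]


commands = build_commands()
for cmd in commands:
    # Dry run: print only. To execute a command instead, it could be
    # passed to subprocess.run(cmd, check=True) from the scripts directory.
    print(shlex.join(cmd))
```

Building each command as an argument list rather than a single string avoids shell-quoting issues if it is later handed to `subprocess.run`.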

See the run_lm_experiments notebook for examples.

Train LM from scratch -> Knowledge Graph

Install requirements and clone repository

!pip install git+https://github.com/huggingface/transformers
!pip list | grep -E 'transformers|tokenizers'
!pip install textacy

Run the wikipedia_train_from_scratch_lm.ipynb notebook.
As shown in its last cell, you can then run the KG generation experiments with:

# tokenizer, model, and unmasker are expected to be defined by earlier notebook cells
from run_training_kg_experiments import *
run_experiments(tokenizer, model, unmasker, "Roberta3e")
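
Once graphs exist for more than one checkpoint, a simple way to look at knowledge acquisition is to compare their triple sets. The sketch below is an assumption-laden illustration rather than the repository's analysis: the `diff_graphs` helper is hypothetical, the triples are made-up placeholder data, and it assumes each graph can be reduced to a set of (subject, relation, object) tuples.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def diff_graphs(earlier: Set[Triple], later: Set[Triple]) -> Dict[str, Set[Triple]]:
    """Split triples into those gained, lost, and retained between checkpoints."""
    return {
        "gained": later - earlier,
        "lost": earlier - later,
        "retained": earlier & later,
    }


# Hypothetical triples for two checkpoints (made-up data, illustration only).
epoch_1 = {("Dante", "born_in", "Rome"), ("Paris", "capital_of", "France")}
epoch_3 = {("Dante", "born_in", "Florence"), ("Paris", "capital_of", "France")}

changes = diff_graphs(epoch_1, epoch_3)
for kind, triples in changes.items():
    print(kind, sorted(triples))
```

Exact set comparison treats any change in the object as one triple lost and another gained; a softer comparison (for example, matching on subject and relation only) may suit noisier extractions better.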

Citations

@inproceedings{swamy2021interpreting,
 author = {Swamy, Vinitra and Romanou, Angelika and Jaggi, Martin},
 booktitle = {Advances in Neural Information Processing Systems, Workshop on eXplainable AI Approaches for Debugging and Diagnosis},
 title = {Interpreting Language Models Through Knowledge Graph Extraction},
 volume = {35},
 year = {2021}
}


Owner

EPFL Machine Learning and Optimization Laboratory
