CausaLM: Causal Model Explanation Through Counterfactual Language Models

Last update: Jul 10, 2022

Overview

Authors:

Amir Feder, Nadav Oved, Uri Shalit, Roi Reichart

Abstract:

Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. Like all ML-based methods, they are only as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high-level language concepts. A key problem of estimating the causal effect of a concept of interest on a given model is that this estimation requires the generation of counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning of deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest, and be used to estimate its true causal effect on model performance. A byproduct of our method is a representation that is unaffected by the tested concept, which can be useful in mitigating unwanted bias ingrained in the data.
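To make the idea of estimating a concept's effect from a counterfactual representation more concrete, here is a minimal sketch in plain Python. It assumes you already have two sets of class-probability vectors for the same examples: one from a classifier built on the original representation and one from a classifier built on the counterfactual representation. The function name `estimated_concept_effect`, the mean L1 distance used as the estimator, and the toy numbers are all assumptions made for illustration; this page does not specify the exact estimator, so treat this as one plausible reading rather than the authors' implementation.

```python
from typing import Sequence


def estimated_concept_effect(
    original_probs: Sequence[Sequence[float]],
    counterfactual_probs: Sequence[Sequence[float]],
) -> float:
    """Average L1 distance between paired class-probability vectors.

    Hypothetical estimator: a larger value means the classifier's outputs
    change more when the concept is removed from the representation.
    """
    if len(original_probs) != len(counterfactual_probs):
        raise ValueError("both inputs must cover the same examples")
    if not original_probs:
        raise ValueError("at least one example is required")

    total = 0.0
    for p, q in zip(original_probs, counterfactual_probs):
        if len(p) != len(q):
            raise ValueError("probability vectors must have equal length")
        # L1 distance between the two predictions for this example
        total += sum(abs(a - b) for a, b in zip(p, q))
    return total / len(original_probs)


# Toy, made-up predictions for two examples and two classes.
original = [[0.9, 0.1], [0.2, 0.8]]
counterfactual = [[0.6, 0.4], [0.2, 0.8]]
effect = estimated_concept_effect(original, counterfactual)
print(f"estimated effect: {effect:.3f}")
```

A value near zero would suggest the classifier's predictions barely depend on the concept, while larger values suggest a stronger dependence, under the assumptions stated above.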

Links:

Paper

Code

Data

Owner

Amir Feder
