[CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias

Overview

Counterfactual VQA (CF-VQA)

This repository is the PyTorch implementation of our paper "Counterfactual VQA: A Cause-Effect Look at Language Bias" (CVPR 2021). The code is implemented as a fork of RUBi.

CF-VQA is proposed to capture and mitigate language bias in VQA from the view of causality. CF-VQA (1) captures the language bias as the direct causal effect of questions on answers, and (2) reduces the language bias by subtracting the direct language effect from the total causal effect.
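The subtraction described above can be illustrated with a minimal sketch. It is not the repository's code: the function names, the toy scores, the constant `c` that stands in for a blocked branch, and the SUM-style fusion are all assumptions made for illustration.

```python
import math

def log_sigmoid(s):
    """Numerically stable log(sigmoid(s))."""
    return min(s, 0.0) - math.log1p(math.exp(-abs(s)))

def fuse_sum(z_q, z_v, z_k):
    """Hypothetical SUM-style fusion of the three branch scores."""
    return log_sigmoid(z_q + z_v + z_k)

def total_indirect_effect(z_q, z_v, z_k, c=0.0):
    """Per-answer score after removing the direct language effect.

    z_q, z_v, z_k: per-answer scores of the question-only, vision-only
    and fused (knowledge) branches. c is an assumed constant that
    replaces a branch whose input is blocked.
    """
    n = len(z_q)
    factual = [fuse_sum(z_q[i], z_v[i], z_k[i]) for i in range(n)]
    q_only = [fuse_sum(z_q[i], c, c) for i in range(n)]   # only the question is seen
    baseline = [fuse_sum(c, c, c) for _ in range(n)]      # nothing is seen
    total_effect = [f - b for f, b in zip(factual, baseline)]
    direct_language_effect = [q - b for q, b in zip(q_only, baseline)]
    return [t - d for t, d in zip(total_effect, direct_language_effect)]

# Toy example with two candidate answers; the scores are made up.
z_q = [3.0, 0.0]   # the question alone strongly favours answer 0
z_v = [0.0, 1.0]
z_k = [0.0, 1.0]
debiased = total_indirect_effect(z_q, z_v, z_k)
```

In this toy case the fused score alone would favour answer 0 because of the question branch, while the debiased score favours answer 1, since answer 0 receives no support beyond the question itself.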

If you find this paper helpful for your research, please consider citing it in your publications.

@inproceedings{niu2020counterfactual,
  title={Counterfactual VQA: A Cause-Effect Look at Language Bias},
  author={Niu, Yulei and Tang, Kaihua and Zhang, Hanwang and Lu, Zhiwu and Hua, Xian-Sheng and Wen, Ji-Rong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Installation

1. Setup and dependencies

Install the Anaconda or Miniconda distribution (based on Python 3+) from its download site.

conda create --name cfvqa python=3.7
source activate cfvqa
pip install -r requirements.txt

2. Download datasets

Download annotations, images and features for VQA experiments:

bash cfvqa/datasets/scripts/download_vqa2.sh
bash cfvqa/datasets/scripts/download_vqacp2.sh

Quick start

Train a model

The bootstrap/run.py file loads the options contained in a yaml file, creates the corresponding experiment directory, and starts the training procedure. For instance, you can train our best model on VQA-CP v2 (CFVQA+SUM+SMRL) by running:

python -m bootstrap.run -o cfvqa/options/vqacp2/smrl_cfvqa_sum.yaml

Several files are then created in logs/vqacp2/smrl_cfvqa_sum/:

  • [options.yaml] (copy of options)
  • [logs.txt] (history of printed output)
  • [logs.json] (batch and epoch statistics)
  • [_vq_val_oe.json] (statistics for the language-prior based strategy, e.g., RUBi)
  • [_cfvqa_val_oe.json] (statistics for CF-VQA)
  • [_q_val_oe.json] (statistics for language-only branch)
  • [_v_val_oe.json] (statistics for vision-only branch)
  • [_all_val_oe.json] (statistics for the ensembled branch)
  • ckpt_last_engine.pth.tar (checkpoints of last epoch)
  • ckpt_last_model.pth.tar
  • ckpt_last_optimizer.pth.tar
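To confirm that a run produced its core outputs, a small check like the following can help. It is only a sketch: the helper name `missing_outputs` is hypothetical, and it covers just the plainly named files from the list above, not the per-branch statistics files.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "options.yaml",
    "logs.txt",
    "logs.json",
    "ckpt_last_engine.pth.tar",
    "ckpt_last_model.pth.tar",
    "ckpt_last_optimizer.pth.tar",
]

def missing_outputs(exp_dir):
    """Return the expected files that are absent from exp_dir."""
    exp_dir = Path(exp_dir)
    return [name for name in EXPECTED_FILES if not (exp_dir / name).is_file()]

if __name__ == "__main__":
    missing = missing_outputs("logs/vqacp2/smrl_cfvqa_sum")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```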

Many options are available in the options directory. CFVQA represents the complete causal graph while cfvqas represents the simplified causal graph.

Evaluate a model

There is no test set for VQA-CP v2, our main dataset, so evaluation is done on the validation set. For a model trained on VQA v2, you can evaluate on the test set instead. In this example, bootstrap/run.py loads the options from your experiment directory, resumes the last checkpoint, and runs an evaluation on the validation set while skipping training (train_split is empty). Thanks to --misc.logs_name, the logs are written to the new logs_test.txt and logs_test.json files instead of being appended to logs.txt and logs.json.

python -m bootstrap.run \
-o ./logs/vqacp2/smrl_cfvqa_sum/options.yaml \
--exp.resume last \
--dataset.train_split '' \
--dataset.eval_split val \
--misc.logs_name test

Useful commands

Use a specific GPU

For a specific experiment:

CUDA_VISIBLE_DEVICES=0 python -m bootstrap.run -o cfvqa/options/vqacp2/smrl_cfvqa_sum.yaml

For the current terminal session:

export CUDA_VISIBLE_DEVICES=0

Overwrite an option

The bootstrap.pytorch framework makes it easy to overwrite a hyperparameter. In this example, we run an experiment with a non-default learning rate, so we also overwrite the experiment directory path to keep the results separate:

python -m bootstrap.run -o cfvqa/options/vqacp2/smrl_cfvqa_sum.yaml \
--optimizer.lr 0.0003 \
--exp.dir logs/vqacp2/smrl_cfvqa_sum_lr,0.0003
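Conceptually, each dotted flag addresses a nested key in the yaml options. The sketch below illustrates that mapping with a plain dictionary. It is not bootstrap.pytorch's implementation: `apply_override` is a hypothetical helper, and the initial values are placeholders rather than the repository's defaults.

```python
def apply_override(options, dotted_key, value):
    """Set options[a][b]... = value for a dotted key such as 'a.b'."""
    keys = dotted_key.split(".")
    node = options
    for key in keys[:-1]:
        # Walk down the nesting, creating intermediate levels if needed.
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return options

# Placeholder options, standing in for the parsed yaml file.
options = {
    "exp": {"dir": "logs/vqacp2/smrl_cfvqa_sum"},
    "optimizer": {"lr": None},
}

apply_override(options, "optimizer.lr", 0.0003)
apply_override(options, "exp.dir", "logs/vqacp2/smrl_cfvqa_sum_lr,0.0003")
```

After both calls, `options["optimizer"]["lr"]` holds the new learning rate and `options["exp"]["dir"]` points to the new experiment directory, mirroring the two flags in the command above.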

Resume training

If a problem occurs, you can resume from the last epoch by specifying the options file from the experiment directory and overwriting the exp.resume option (default is None):

python -m bootstrap.run -o logs/vqacp2/smrl_cfvqa_sum/options.yaml \
--exp.resume last

Acknowledgment

Special thanks to the authors of RUBi, BLOCK, and bootstrap.pytorch, and to the creators of the datasets used in this research project.
