Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral)

Greedy Gradient Ensemble for De-biased VQA

Code release for "Greedy Gradient Ensemble for Robust Visual Question Answering" (ICCV 2021, Oral). GGE can be extended to other tasks that suffer from dataset biases.

@inproceedings{han2015greedy,
	title={Greedy Gradient Ensemble for Robust Visual Question Answering},
	author={Han, Xinzhe and Wang, Shuhui and Su, Chi and Huang, Qingming and Tian, Qi},
	booktitle={Proceedings of the IEEE international conference on computer vision},
	year={2021}
}

Prerequisites

We use Anaconda to manage our dependencies. Execute the following steps to install them:

  • Edit the value of the prefix variable in requirements.yml so that it points to the path of your conda environment

  • Install all dependencies using: conda env create -f requirements.yml

  • Activate the new environment, which is named bias (for example, with conda activate bias)
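The first step above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name set_conda_prefix and the example YAML contents are hypothetical, and it assumes that requirements.yml holds a single top-level prefix: line, as conda environment files usually do.

```python
import re


def set_conda_prefix(yaml_text: str, env_path: str) -> str:
    """Return yaml_text with its top-level `prefix:` line pointing at env_path."""
    new_line = f"prefix: {env_path}"
    # Replace an existing top-level `prefix:` line, if there is one.
    # A function is used as the replacement so backslashes in env_path stay literal.
    updated, count = re.subn(r"(?m)^prefix:.*$", lambda _m: new_line, yaml_text)
    if count == 0:
        # No prefix line was found: append one at the end of the file.
        updated = yaml_text.rstrip("\n") + "\n" + new_line + "\n"
    return updated


# Hypothetical file contents, for illustration only.
example = "name: bias\ndependencies:\n  - python\nprefix: /old/path/envs/bias\n"
print(set_conda_prefix(example, "/home/user/anaconda3/envs/bias"))
```

To apply it to the real file, read requirements.yml, pass its text through the helper, and write the result back before running conda env create.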

Data Setup

  • Download UpDn features from Google Drive into the /data/detection_features folder
  • Download questions/answers for VQAv2 and VQA-CPv2 by executing bash tools/download.sh
  • Download the visual cues/hints provided in "A negative case analysis of visual grounding methods for VQA" into data/hints. Note that we use caption-based hints for reproducing grounding-based methods and for computing CGR and CGW.
  • Preprocess the data with bash tools/process.sh
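Before running the preprocessing script, it can help to confirm that the downloads landed where the steps above expect them. This is a minimal sketch under stated assumptions: the helper missing_data_dirs is hypothetical, and it treats both folders as paths relative to the repository root, which the steps above do not state explicitly for the feature folder.

```python
from pathlib import Path

# Directories named in the steps above; treating them as relative
# to the repository root is an assumption.
EXPECTED_DIRS = ("data/detection_features", "data/hints")


def missing_data_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that are absent or empty under root."""
    missing = []
    for rel in expected:
        path = Path(root) / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_data_dirs("."):
        print(f"missing or empty: {rel}")
```

An empty result means both folders exist and contain at least one entry. It does not verify that the files themselves are complete.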

Training GGE

Run

CUDA_VISIBLE_DEVICES=0 python main.py --dataset cpv2 --mode MODE --debias gradient --topq 1 --topv -1 --qvp 5 --output [] 

to train a model. In main.py, use import base_model for the UpDn baseline, import base_model_ban as base_model for the BAN baseline, or import base_model_block as base_model for the S-MRL baseline.
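If editing the import line by hand becomes tedious, the same choice can be made at run time. The sketch below is an illustration rather than the repository's code: the module names come from the paragraph above, while the short keys (updn, ban, smrl) and both helper functions are hypothetical.

```python
import importlib

# Module names are taken from the text above; the short keys are
# hypothetical labels chosen for this sketch.
BASELINE_MODULES = {
    "updn": "base_model",
    "ban": "base_model_ban",
    "smrl": "base_model_block",
}


def baseline_module_name(baseline: str) -> str:
    """Map a baseline label to the module that main.py should import."""
    key = baseline.lower()
    if key not in BASELINE_MODULES:
        raise ValueError(f"unknown baseline {baseline!r}; expected one of {sorted(BASELINE_MODULES)}")
    return BASELINE_MODULES[key]


def load_base_model(baseline: str):
    """Import the chosen module, like `import <module> as base_model`.

    This only works when run from the repository root, where the modules live.
    """
    return importlib.import_module(baseline_module_name(baseline))


print(baseline_module_name("ban"))
```

Inside main.py, base_model = load_base_model("ban") would then stand in for the hand-edited import.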

Set MODE to gge_iter or gge_tog for our best-performing models, gge_d_bias or gge_q_bias for the single-bias ablations, and base for the baseline model.
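To launch several modes without retyping the command, the invocation can be assembled programmatically. This is a minimal sketch: build_train_command and the output name gge_iter_run are hypothetical, and the flag values are copied from the command above. It keeps --debias gradient for every mode, which is an assumption; the text does not say whether the base mode expects a different value.

```python
import shlex

# MODE values listed in the text above.
GGE_MODES = ("gge_iter", "gge_tog", "gge_d_bias", "gge_q_bias", "base")


def build_train_command(mode, output, dataset="cpv2", gpu="0"):
    """Build the shell command for one training run, rejecting unknown modes."""
    if mode not in GGE_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {GGE_MODES}")
    args = [
        "python", "main.py",
        "--dataset", dataset,
        "--mode", mode,
        "--debias", "gradient",  # assumption: kept as-is for every mode
        "--topq", "1",
        "--topv", "-1",
        "--qvp", "5",
        "--output", output,
    ]
    # shlex.join quotes any argument that contains shell-special characters.
    return f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(args)


print(build_train_command("gge_iter", "gge_iter_run"))
```

The returned string can be pasted into a shell or written into a job script, one line per mode.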

Training ablations in Sec. 3 and Sec. 5

For the models in Sec. 3, use from train_ab import train and import base_model_ab as base_model in main.py, then run

CUDA_VISIBLE_DEVICES=0 python main.py --dataset cpv2 --mode MODE --debias METHODS --topq 1 --topv -1 --qvp 5 --output [] 

Set METHODS to learned_mixin for LMH; set MODE to inv_sup for the inv_sup strategy and v_inverse for the inverse hint. Note that the results for HINT$_{inv}$ were obtained by running the code from "A negative case analysis of visual grounding methods for VQA".

To test the v_only model, import base_model_v_only as base_model in main.py.

To test RUBi and LMH+RUBi, run

CUDA_VISIBLE_DEVICES=0 python rubi_main.py --dataset cpv2 --mode MODE --output [] 

Set MODE to updn for RUBi and to lmh_rubi for LMH+RUBi.

Testing

At test time, we report the overall accuracy (Acc), CGR, CGW, and CGD at a threshold of 0.2. Change base_model to the corresponding model in sensitivity.py and run

CUDA_VISIBLE_DEVICES=0 python sensitivity.py --dataset cpv2 --debias METHOD --load_checkpoint_path logs/your_path --output your_path

Visualization

We provide visualizations in visualization.ipynb. To produce other visualizations yourself, download MS-COCO 2014 to data/images.

Acknowledgements

This repo uses features from "A negative case analysis of visual grounding methods for VQA". Some code is adapted from CSS and UpDn.
