[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Overview

PWC PWC

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021)

[arXiv][Project page >> coming soon]

Sanath Narayan*, Akshita Gupta*, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Mubarak Shah

( 🌟 denotes equal contribution)

Installation

The codebase is built on PyTorch 1.1.0 and tested on Ubuntu 16.04 environment (Python3.6, CUDA9.0, cuDNN7.5).

For installing, follow these intructions

conda create -n mlzsl python=3.6
conda activate mlzsl
conda install pytorch=1.1 torchvision=0.3 cudatoolkit=9.0 -c pytorch
pip install matplotlib scikit-image scikit-learn opencv-python yacs joblib natsort h5py tqdm pandas

Install warmup scheduler

cd pytorch-gradual-warmup-lr; python setup.py install; cd ..

Attention Visualization

Results

Our approach on NUS-WIDE Dataset.

Our approach on OpenImages Dataset.

Training and Evaluation

NUS-WIDE

Step 1: Data preparation

  1. Download pre-computed features from here and store them at features folder inside BiAM/datasets/NUS-WIDE directory.
  2. [Optional] You can extract the features on your own by using the original NUS-WIDE dataset from here and run the below script:
python feature_extraction/extract_nus_wide.py

Step 2: Training from scratch

To train and evaluate multi-label zero-shot learning model on full NUS-WIDE dataset, please run:

sh scripts/train_nus.sh

Step 3: Evaluation using pretrained weights

To evaluate the multi-label zero-shot model on NUS-WIDE. You can download the pretrained weights from here and store them at NUS-WIDE folder inside pretrained_weights directory.

sh scripts/evaluate_nus.sh

OPEN-IMAGES

Step 1: Data preparation

  1. Please download the annotations for training, validation, and testing into this folder.

  2. Store the annotations inside BiAM/datasets/OpenImages.

  3. To extract the features for OpenImages-v4 dataset run the below scripts for crawling the images and extracting features of them:

## Crawl the images from web
python ./datasets/OpenImages/download_imgs.py  #`data_set` == `train`: download images into `./image_data/train/`
python ./datasets/OpenImages/download_imgs.py  #`data_set` == `validation`: download images into `./image_data/validation/`
python ./datasets/OpenImages/download_imgs.py  #`data_set` == `test`: download images into `./image_data/test/`

## Run feature extraction codes for all the 3 splits
python feature_extraction/extract_openimages_train.py
python feature_extraction/extract_openimages_test.py
python feature_extraction/extract_openimages_val.py

Step 2: Training from scratch

To train and evaluate multi-label zero-shot learning model on full OpenImages-v4 dataset, please run:

sh scripts/train_openimages.sh
sh scripts/evaluate_openimages.sh

Step 3: Evaluation using pretrained weights

To evaluate the multi-label zero-shot model on OpenImages. You can download the pretrained weights from here and store them at OPENIMAGES folder inside pretrained_weights directory.

sh scripts/evaluate_openimages.sh

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Citation

If you find this repository useful, please consider giving a star and citation 🎊 :

@article{narayan2021discriminative,
title={Discriminative Region-based Multi-Label Zero-Shot Learning},
author={Narayan, Sanath and Gupta, Akshita and Khan, Salman and  Khan, Fahad Shahbaz and Shao, Ling and Shah, Mubarak},
journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
publisher = {IEEE},
year={2021}
}

Contact

Should you have any question, please contact 📧 [email protected]

Owner
Akshita Gupta
Sem @IITR | Outreachy @mozilla | Research Engineer @IIAI
Akshita Gupta
Code repo for "Towards Interpretable Deep Networks for Monocular Depth Estimation" paper.

InterpretableMDE A PyTorch implementation for "Towards Interpretable Deep Networks for Monocular Depth Estimation" paper. arXiv link: https://arxiv.or

Zunzhi You 16 Aug 12, 2022
Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

ChongjianGE 89 Dec 02, 2022
Repository for RNNs using TensorFlow and Keras - LSTM and GRU Implementation from Scratch - Simple Classification and Regression Problem using RNNs

RNN 01- RNN_Classification Simple RNN training for classification task of 3 signal: Sine, Square, Triangle. 02- RNN_Regression Simple RNN training for

Nahid Ebrahimian 13 Dec 13, 2022
The implementation for the SportsCap (IJCV 2021)

SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos ProjectPage | Paper | Video | Dataset (Part01

Chen Xin 79 Dec 16, 2022
Code for "Multi-Time Attention Networks for Irregularly Sampled Time Series", ICLR 2021.

Multi-Time Attention Networks (mTANs) This repository contains the PyTorch implementation for the paper Multi-Time Attention Networks for Irregularly

The Laboratory for Robust and Efficient Machine Learning 68 Dec 17, 2022
This repo is official PyTorch implementation of MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices(CVPRW 2021).

Github Code of "MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices" Introduction This repo is official PyTorch implementatio

Choi Sang Bum 203 Jan 05, 2023
Source code of AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Towards End-to-End Image Compression and Analysis with Transformers Source code of our AAAI 2022 paper "Towards End-to-End Image Compression and Analy

37 Dec 21, 2022
Using modified BiSeNet for face parsing in PyTorch

face-parsing.PyTorch Contents Training Demo References Training Prepare training data: -- download CelebAMask-HQ dataset -- change file path in the pr

zll 1.6k Jan 08, 2023
Uni-Fold: Training your own deep protein-folding models

Uni-Fold: Training your own deep protein-folding models. This package provides an implementation of a trainable, Transformer-based deep protein foldin

DP Technology 187 Jan 04, 2023
Transfer Learning library for Deep Neural Networks.

Transfer and meta-learning in Python Each folder in this repository corresponds to a method or tool for transfer/meta-learning. xfer-ml is a standalon

Amazon 245 Dec 08, 2022
A simplistic and efficient pure-python neural network library from Phys Whiz with CPU and GPU support.

A simplistic and efficient pure-python neural network library from Phys Whiz with CPU and GPU support.

Manas Sharma 19 Feb 28, 2022
Marvis is Mastouri's Jarvis version of the AI-powered Python personal assistant.

Marvis v1.0 Marvis is Mastouri's Jarvis version of the AI-powered Python personal assistant. About M.A.R.V.I.S. J.A.R.V.I.S. is a fictional character

Reda Mastouri 1 Dec 29, 2021
Deep Face Recognition in PyTorch

Face Recognition in PyTorch By Alexey Gruzdev and Vladislav Sovrasov Introduction A repository for different experimental Face Recognition models such

Alexey Gruzdev 141 Sep 11, 2022
DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene.

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene. We achieve NeRF-comparable novel-view synthesis quality with super-fast convergence.

sunset 709 Dec 31, 2022
这是一个facenet-pytorch的库,可以用于训练自己的人脸识别模型。

Facenet:人脸识别模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download 预测步骤 How2predict 训练步骤 How2train 参考资料 Reference 性能情况 训练数据

Bubbliiiing 210 Jan 06, 2023
A project to build an AI voice assistant using Python . The Voice assistant interacts with the humans to perform basic tasks.

AI_Personal_Voice_Assistant_Using_Python A project to build an AI voice assistant using Python . The Voice assistant interacts with the humans to perf

Chumui Tripura 1 Oct 30, 2021
Data reduction pipeline for KOALA on the AAT.

KOALA KOALA, the Kilofibre Optical AAT Lenslet Array, is a wide-field, high efficiency, integral field unit used by the AAOmega spectrograph on the 3.

4 Sep 26, 2022
This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021.

inverse_attention This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021. Le

Firas Laakom 5 Jul 08, 2022
Prometheus Exporter for data scraped from datenplattform.darmstadt.de

darmstadt-opendata-exporter Scrapes data from https://datenplattform.darmstadt.de and presents it in the Prometheus Exposition format. Pull requests w

Martin Weinelt 2 Apr 12, 2022
RefineMask (CVPR 2021)

RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features (CVPR 2021) This repo is the official implementation of RefineMask:

Gang Zhang 191 Jan 07, 2023