Code for our CVPR 2022 Paper "GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection"

GEN-VLKT

Contributed by Yue Liao*, Aixi Zhang*, Miao Lu, Yongliang Wang, Xiaobo Li and Si Liu.

Installation

Install the dependencies.

pip install -r requirements.txt

Clone and build CLIP.

git clone https://github.com/openai/CLIP.git && cd CLIP && python setup.py develop && cd ..

Data preparation

HICO-DET

The HICO-DET dataset can be downloaded here. After the download finishes, unpack the tarball (hico_20160224_det.tar.gz) into the data directory.

Instead of the original annotation files, we use the annotation files provided by the PPDM authors, which can be downloaded from here. Place the downloaded annotation files as follows.

data
 └─ hico_20160224_det
     |─ annotations
     |   |─ trainval_hico.json
     |   |─ test_hico.json
     |   └─ corre_hico.npy
     :
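
Before training, it can help to confirm that the annotation files sit where the layout above expects them. The following is a minimal sketch and not part of the repository: the helper name find_missing is hypothetical, and the relative paths are copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
HICO_ANNOTATION_FILES = [
    "hico_20160224_det/annotations/trainval_hico.json",
    "hico_20160224_det/annotations/test_hico.json",
    "hico_20160224_det/annotations/corre_hico.npy",
]


def find_missing(data_root, relative_paths):
    """Return the expected files that do not exist under data_root."""
    root = Path(data_root)
    return [p for p in relative_paths if not (root / p).is_file()]


if __name__ == "__main__":
    # "data" is the data directory named in the instructions above.
    missing = find_missing("data", HICO_ANNOTATION_FILES)
    if missing:
        print("Missing annotation files:")
        for path in missing:
            print("  " + path)
    else:
        print("All HICO-DET annotation files found.")
```

Run it from the repository root so that the relative data directory resolves correctly.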

V-COCO

First, clone the V-COCO repository from here and follow its instructions to generate the file instances_vcoco_all_2014.json. Next, download the prior file prior.pickle from here. Create the directories and place the files as follows.

GEN-VLKT
 |─ data
 |   └─ v-coco
 |       |─ data
 |       |   |─ instances_vcoco_all_2014.json
 |       |   :
 |       |─ prior.pickle
 |       |─ images
 |       |   |─ train2014
 |       |   |   |─ COCO_train2014_000000000009.jpg
 |       |   |   :
 |       |   └─ val2014
 |       |       |─ COCO_val2014_000000000042.jpg
 |       |       :
 |       |─ annotations
 :       :

For our implementation, the annotation file has to be converted to the HOIA format. The conversion can be run as follows.

PYTHONPATH=data/v-coco \
        python convert_vcoco_annotations.py \
        --load_path data/v-coco/data \
        --prior_path data/v-coco/prior.pickle \
        --save_path data/v-coco/annotations

Note that only Python 2 can be used for this conversion because vsrl_utils.py in the v-coco repository raises an error with Python 3.

The V-COCO annotations in the HOIA format (corre_vcoco.npy, test_vcoco.json, and trainval_vcoco.json) will be generated in the annotations directory.

Pre-trained model

Download the pretrained DETR detector model for ResNet50 and put it in the params directory. Then convert the parameters for HICO-DET and V-COCO with the following commands.

python ./tools/convert_parameters.py \
        --load_path params/detr-r50-e632da11.pth \
        --save_path params/detr-r50-pre-2branch-hico.pth \
        --num_queries 64

python ./tools/convert_parameters.py \
        --load_path params/detr-r50-e632da11.pth \
        --save_path params/detr-r50-pre-2branch-vcoco.pth \
        --dataset vcoco \
        --num_queries 64
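
The DETR checkpoint name ends in e632da11. Checkpoints distributed for loading through PyTorch Hub conventionally embed the first hex digits of the file's SHA-256 hash in the file name, so a truncated or corrupted download can be caught before conversion. The sketch below assumes that convention applies to this file; matches_embedded_hash is a hypothetical helper and not part of the repository.

```python
import hashlib
import re
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_embedded_hash(path):
    """Check a file against the hash prefix embedded in its name.

    Assumes a name of the form <name>-<hexprefix>.<ext>, for example
    detr-r50-e632da11.pth. Raises ValueError if no prefix is found.
    """
    match = re.search(r"-([0-9a-f]{8,})\.[^.]+$", Path(path).name)
    if match is None:
        raise ValueError("no hash prefix in file name: %s" % path)
    return sha256_of(path).startswith(match.group(1))


if __name__ == "__main__":
    checkpoint = Path("params/detr-r50-e632da11.pth")
    if checkpoint.is_file():
        print("hash ok:", matches_embedded_hash(checkpoint))
    else:
        print("checkpoint not found:", checkpoint)
```

The check only applies to the downloaded file; the converted files written by convert_parameters.py have different contents and carry no hash in their names.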

Training

Once preparation is complete, you can start training with the commands below. Training is split into two steps: GEN-VLKT base model training and dynamic re-weighting training. The commands for training GEN-VLKT-S on HICO-DET and V-COCO are as follows.

HICO-DET

sh ./configs/hico_s.sh

V-COCO

sh ./configs/vcoco_s.sh

Zero-shot

sh ./configs/hico_s_zs_nf_uc.sh

Evaluation

HICO-DET

You can conduct the evaluation with trained parameters for HICO-DET as follows.

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained pretrained/hico_gen_vlkt_s.pth \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers 3 \
        --eval \
        --with_clip_label \
        --with_obj_clip_label \
        --use_nms_filter
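
The command above assumes eight GPUs. As a convenience, the same invocation can be assembled in Python with the process count as a parameter. This is a sketch only: build_eval_command is a hypothetical helper, every flag is copied from the command above, and whether the results are identical with a different process count is not verified here.

```python
import shlex


def build_eval_command(nproc_per_node=8,
                       pretrained="pretrained/hico_gen_vlkt_s.pth"):
    """Assemble the HICO-DET evaluation command as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node=%d" % nproc_per_node,
        "--use_env",
        "main.py",
        "--pretrained", pretrained,
        "--dataset_file", "hico",
        "--hoi_path", "data/hico_20160224_det",
        "--num_obj_classes", "80",
        "--num_verb_classes", "117",
        "--backbone", "resnet50",
        "--num_queries", "64",
        "--dec_layers", "3",
        "--eval",
        "--with_clip_label",
        "--with_obj_clip_label",
        "--use_nms_filter",
    ]


if __name__ == "__main__":
    # Print the command for inspection. To execute it, pass the list
    # to subprocess.run from the repository root.
    command = build_eval_command(nproc_per_node=2)
    print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list rather than a single string avoids shell quoting problems when a checkpoint path contains spaces.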

For the official evaluation (reported in the paper), you need to convert the prediction file to the official prediction format following this file, and then follow the PPDM evaluation steps.

V-COCO

First, add the following main function to vsrl_eval.py in data/v-coco.

if __name__ == '__main__':
  import sys

  # Ground-truth files; the paths are relative to data/v-coco.
  vsrl_annot_file = 'data/vcoco/vcoco_test.json'
  coco_file = 'data/instances_vcoco_all_2014.json'
  split_file = 'data/splits/vcoco_test.ids'

  vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)

  # The detection pickle file is passed as the first command-line argument.
  det_file = sys.argv[1]
  vcocoeval._do_eval(det_file, ovr_thresh=0.5)

Next, the official V-COCO evaluation requires a pickle file of detection results. Generate the file with the first command below, then evaluate it with the commands that follow.

python generate_vcoco_official.py \
        --param_path pretrained/VCOCO_GEN_VLKT_S.pth \
        --save_path vcoco.pickle \
        --hoi_path data/v-coco \
        --num_queries 64 \
        --dec_layers 3 \
        --use_nms_filter \
        --with_clip_label \
        --with_obj_clip_label

cd data/v-coco
python vsrl_eval.py ../../vcoco.pickle

Zero-shot

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained pretrained/hico_gen_vlkt_s.pth \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers 3 \
        --eval \
        --with_clip_label \
        --with_obj_clip_label \
        --use_nms_filter \
        --zero_shot_type rare_first \
        --del_unseen

Regular HOI Detection Results

HICO-DET

| Model | Full (D) | Rare (D) | Non-rare (D) | Full (KO) | Rare (KO) | Non-rare (KO) | Download | Config |
|---|---|---|---|---|---|---|---|---|
| GEN-VLKT-S (R50) | 33.75 | 29.25 | 35.10 | 36.78 | 32.75 | 37.99 | model | config |
| GEN-VLKT-M* (R101) | 34.63 | 30.04 | 36.01 | 37.97 | 33.72 | 39.24 | model | config |
| GEN-VLKT-L (R101) | 34.95 | 31.18 | 36.08 | 38.22 | 34.36 | 39.37 | model | config |

D: Default; KO: Known Object. *: The original model was lost, so the performance of the provided checkpoint differs slightly from the numbers reported in the paper.
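
The Full column can be related to the Rare and Non-rare columns: mAP is a mean over HOI categories, so Full should equal the mean of the other two weighted by category count. The sketch below is a consistency check under one assumption that this README does not state, namely the standard HICO-DET split of 600 HOI categories into 138 rare and 462 non-rare.

```python
# Category counts: assumed from the standard HICO-DET split,
# not stated in this README.
NUM_RARE = 138
NUM_NON_RARE = 462


def full_map(rare_map, non_rare_map):
    """Category-count-weighted mean of the rare and non-rare mAP."""
    total = NUM_RARE + NUM_NON_RARE
    return (NUM_RARE * rare_map + NUM_NON_RARE * non_rare_map) / total


# (rare, non-rare, reported full) in the Default setting, from the table above.
ROWS = {
    "GEN-VLKT-S": (29.25, 35.10, 33.75),
    "GEN-VLKT-M": (30.04, 36.01, 34.63),
    "GEN-VLKT-L": (31.18, 36.08, 34.95),
}

if __name__ == "__main__":
    for name, (rare, non_rare, reported) in ROWS.items():
        computed = full_map(rare, non_rare)
        print("%s: computed %.2f, reported %.2f" % (name, computed, reported))
```

Because the table entries are rounded to two decimals, the computed value can differ from the reported one in the last digit.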

V-COCO

| Model | Scenario 1 | Scenario 2 | Download | Config |
|---|---|---|---|---|
| GEN-VLKT-S (R50) | 62.41 | 64.46 | model | config |
| GEN-VLKT-M (R101) | 63.28 | 65.58 | model | config |
| GEN-VLKT-L (R101) | 63.58 | 65.93 | model | config |

Zero-shot HOI Detection Results

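| Model | Type | Unseen | Seen | Full | Download | Config |
|---|---|---|---|---|---|---|
| GEN-VLKT-S | RF-UC | 21.36 | 32.91 | 30.56 | model | config |
| GEN-VLKT-S | NF-UC | 25.05 | 23.38 | 23.71 | model | config |
| GEN-VLKT-S | UO | 10.51 | 28.92 | 25.63 | model | config |
| GEN-VLKT-S | UV | 20.96 | 30.23 | 28.74 | model | config |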

Citation

Please consider citing our paper if it helps your research.

@inproceedings{liao2022genvlkt,
  title={GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection},
  author={Liao, Yue and Zhang, Aixi and Lu, Miao and Wang, Yongliang and Li, Xiaobo and Liu, Si},
  booktitle={CVPR},
  year={2022}
}

License

GEN-VLKT is released under the MIT license. See LICENSE for additional details.

Acknowledgement

Parts of the code are built upon PPDM, DETR, QPIC and CDN. We thank the authors for their great work!
