Code for the paper "Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing"

Overview

Contrast and Mix (CoMix)

This repository contains the code for the paper "Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing", published in Advances in Neural Information Processing Systems (NeurIPS) 2021.

Aadarsh Sahoo1, Rutav Shah1, Rameswar Panda2, Kate Saenko2,3, Abir Das1

1 IIT Kharagpur, 2 MIT-IBM Watson AI Lab, 3 Boston University

[Paper] [Project Page]

 

Fig. Temporal Contrastive Learning with Background Mixing and Target Pseudo-labels. The temporal contrastive loss (left) contrasts a single temporally augmented positive (same video, different speed) per anchor against the rest of the videos in a mini-batch as negatives. Incorporating background mixing (middle) provides additional positives per anchor that share the same action semantics but have a different background, alleviating background shift across domains. Incorporating target pseudo-labels (right) further enhances discriminability by contrasting target videos with the same pseudo-label as positives against the rest of the videos as negatives.
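For intuition only, here is a minimal PyTorch-style sketch of an InfoNCE-style temporal contrastive loss and of background mixing; this is not the repository's actual implementation, and all function and variable names are illustrative.

```python
import torch
import torch.nn.functional as F

def temporal_contrastive_loss(anchor, positive, temperature=0.1):
    """InfoNCE-style loss: each anchor clip embedding is matched to its own
    temporally augmented positive (same video, different speed), while the
    remaining clips in the mini-batch act as negatives.
    anchor, positive: (batch, dim) clip embeddings."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature            # (batch, batch) similarity matrix
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)                  # diagonal entries are the positives

def mix_background(frames, background, lam=0.75):
    """Background mixing (illustrative): blend every frame of a clip with a
    static background extracted from another video, producing an extra
    positive with the same action semantics but a different background.
    frames: (T, C, H, W) tensor, background: (C, H, W) tensor."""
    return lam * frames + (1.0 - lam) * background.unsqueeze(0)
```

In the same spirit, target videos sharing a pseudo-label can be treated as additional positives in the loss by replacing the identity labels with pseudo-label matches.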

 

Preparing the Environment

Conda

Please use the comix_environment.yml file to create the conda environment comix as:

conda env create -f comix_environment.yml

Pip

Please use the requirements.txt file to install all the required dependencies as:

pip install -r requirements.txt

Data Directory Structure

All datasets should be stored under the folder ./data, following the convention ./data/<dataset_name>, and this path must be passed via the base_dir argument (e.g. base_dir=./data/<dataset_name>).

UCF - HMDB

For the UCF-HMDB dataset, with base_dir=./data/ucf_hmdb, the structure would be as follows:

.
├── ...
├── data
│   ├── ucf_hmdb
│   │   ├── ucf_videos
│   │   │   ├── <video_name>
│   │   │   │   ├── <frame_1>
│   │   │   │   ├── <frame_2>
│   │   │   │   ├── ...
│   │   │   ├── <video_name>
│   │   │   ├── ...
│   │   ├── hmdb_videos
│   │   ├── ucf_BG
│   │   └── hmdb_BG
│   └── ...
└── ...

Jester

For the Jester dataset, with base_dir=./data/jester, the structure would be as follows:

.
├── ...
├── data
│   ├── jester
│   │   ├── jester_videos
│   │   │   ├── <video_name>
│   │   │   │   ├── <frame_1>
│   │   │   │   ├── <frame_2>
│   │   │   │   ├── ...
│   │   │   ├── <video_name>
│   │   │   ├── ...
│   │   ├── jester_BG
│   │   │   ├── <video_name>
│   │   │   │   ├── <frame_1>
│   │   │   ├── ...
│   └── ...
└── ...

Epic-Kitchens

For the Epic-Kitchens dataset, with base_dir=./data/epic_kitchens, the structure would be as follows (we follow the same structure as the original dataset):

.
├── ...
├── data
│   ├── epic_kitchens
│   │   ├── epic_kitchens_videos
│   │   │   ├── train
│   │   │   │   ├── D1
│   │   │   │   │   ├── <video_name>
│   │   │   │   │   │   ├── <frame_1>
│   │   │   │   │   │   ├── <frame_2>
│   │   │   │   │   │   ├── ...
│   │   │   │   │   ├── <video_name>
│   │   │   │   │   ├── ...
│   │   │   │   ├── D2
│   │   │   │   └── D3
│   │   │   └── test
│   │   └── epic_kitchens_BG
│   └── ...
└── ...

To use datasets stored in other directories, pass the base_dir argument accordingly.

Background Extraction using Temporal Median Filtering

Please refer to the folder ./background_extraction for the code to extract backgrounds using temporal median filtering; a minimal sketch of the idea is shown below.
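The idea behind temporal median filtering is that, over a long enough clip, a background pixel keeps roughly the same value while foreground pixels change as the actor moves, so the per-pixel median over time recovers the static background. A minimal NumPy/OpenCV sketch under that assumption (not the repository's script; the paths and the max_frames cap are illustrative):

```python
import glob
import cv2
import numpy as np

def extract_background(frame_dir, out_path, max_frames=100):
    """Estimate a static background as the per-pixel temporal median
    over (a subset of) a video's extracted frames."""
    frame_paths = sorted(glob.glob(f"{frame_dir}/*.jpg"))[:max_frames]
    frames = np.stack([cv2.imread(p) for p in frame_paths], axis=0)   # (T, H, W, 3)
    background = np.median(frames, axis=0).astype(np.uint8)           # median over time
    cv2.imwrite(out_path, background)
    return background

# Illustrative usage with placeholder paths:
# extract_background("./data/ucf_hmdb/ucf_videos/<video_name>",
#                    "./data/ucf_hmdb/ucf_BG/<video_name>.jpg")
```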

Data

All the required split files are provided inside the directory ./video_splits.

The official download links for the datasets used in this paper are: [UCF-101] [HMDB-51] [Jester] [Epic-Kitchens]

Training CoMix

Here are sample commands with recommended settings to train CoMix for the following transfer tasks:

UCF -> HMDB from UCF-HMDB dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name UCF-HMDB --src_dataset UCF --tgt_dataset HMDB --batch_size 8 --model_root ./checkpoints_ucf_hmdb --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.01 --base_dir ./data/ucf_hmdb

S -> T from Jester dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Jester --src_dataset S --tgt_dataset T --batch_size 8 --model_root ./checkpoints_jester --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.1 --base_dir ./data/jester

D1 -> D2 from Epic-Kitchens dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Epic-Kitchens --src_dataset D1 --tgt_dataset D2 --batch_size 8 --model_root ./checkpoints_epic_d1_d2 --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.01 --lambda_tpl 0.01 --base_dir ./data/epic_kitchens

For a detailed description of the arguments, use:

python main.py --help

Citing CoMix

If you use the code in this repository, please consider citing CoMix. Thanks!

@article{sahoo2021contrast,
  title={Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing},
  author={Sahoo, Aadarsh and Shah, Rutav and Panda, Rameswar and Saenko, Kate and Das, Abir},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}