Contrastive Feature Loss for Image Prediction

Overview

Python 3.6

Contrastive Feature Loss for Image Prediction

We provide a PyTorch implementation of our contrastive feature loss presented in:

Contrastive Feature Loss for Image Prediction Alex Andonian, Taesung Park, Bryan Russell, Phillip Isola, Jun-Yan Zhu, Richard Zhang

Presented in AIM Workshop at ICCV 2021

Prerequisites

  • Linux or macOS
  • Python 3.6+
  • CPU or NVIDIA GPU + CUDA CuDNN

Table of Contents:

  1. Setup
  2. Dataset Preprocessing
  3. Training
  4. Evaluating and Visualizing

Setup

  • Clone this repo:
git clone https://github.com/alexandonian/contrastive-feature-loss.git
cd contrastive-feature-loss
  • Create python virtual environment

  • Install a recent version of PyTorch and other dependencies specified below.

We highly recommend that you install additional dependencies in an isolated python virtual environment (of your choosing). For Conda+pip users, you can create a new conda environment and then pip install dependencies with the following snippet:

ENV_NAME=contrastive-feature-loss
conda create --name $ENV_NAME python=3.8
conda activate $ENV_NAME
pip install -r requirements.txt

Alternatively, you can create a new Conda environment in one command using conda env create -f environment.yml, followed by conda activate contrastive-feature-loss to activate the environment.

This code also requires the Synchronized-BatchNorm-PyTorch rep.

cd models/networks/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
cd ../../

Dataset Preparation

For COCO-Stuff, Cityscapes or ADE20K, the datasets must be downloaded beforehand. Please download them on the respective webpages. In the case of COCO-stuff, we put a few sample images in this code repo.

Preparing COCO-Stuff Dataset. The dataset can be downloaded here. In particular, you will need to download train2017.zip, val2017.zip, stuffthingmaps_trainval2017.zip, and annotations_trainval2017.zip. The images, labels, and instance maps should be arranged in the same directory structure as in datasets/coco_stuff/. In particular, we used an instance map that combines both the boundaries of "things instance map" and "stuff label map". To do this, we used a simple script datasets/coco_generate_instance_map.py. Please install pycocotools using pip install pycocotools and refer to the script to generate instance maps.

Preparing ADE20K Dataset. The dataset can be downloaded here, which is from MIT Scene Parsing BenchMark. After unzipping the datgaset, put the jpg image files ADEChallengeData2016/images/ and png label files ADEChallengeData2016/annotatoins/ in the same directory.

There are different modes to load images by specifying --preprocess_mode along with --load_size. --crop_size. There are options such as resize_and_crop, which resizes the images into square images of side length load_size and randomly crops to crop_size. scale_shortside_and_crop scales the image to have a short side of length load_size and crops to crop_size x crop_size square. To see all modes, please use python train.py --help and take a look at data/base_dataset.py. By default at the training phase, the images are randomly flipped horizontally. To prevent this use --no_flip.

Training Models

Models can be trained with the following steps.

  1. Ensure you have prepared the dataset of interest using the instruction above. To train on the one of the datasets listed above, you can download the datasets and use --dataset_mode option, which will choose which subclass of BaseDataset is loaded. For custom datasets, the easiest way is to use ./data/custom_dataset.py by specifying the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, --contain_dontcare_label to specify whether it has an unknown label, or --no_instance to denote the dataset doesn't have instance maps.

  2. Run the training script with the following command:

# To train on the COCO dataset, for example.
python train.py --name [experiment_name] --dataset_mode coco --dataroot [path_to_coco_dataset]

# To train on your own custom dataset
python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels]

There are many options you can specify. Please use python train.py --help. The specified options are printed to the console. To specify the number of GPUs to utilize, use --gpu_ids. If you want to use the second and third GPUs for example, use --gpu_ids 1,2.

The training logs are stored in json format in [checkpoints_dir]/[experiment_name]/log.json, with sample generations populated in [checkpoint_dir]/[experiment_name]/web. We provide support for both Tensorboard (with --tf_log) and Weights & Biases (with --use_wandb) experiment tracking.

Evaluating and Visualizing

In order to evaluate and visualize the generations of a trained model, run test.py in a similar manner, specifying the name of the experiment, the dataset and its path:

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

where [name_of_experiment] is the directory name of the checkpoint created during training. If you are running on CPU mode, append --gpu_ids -1.

Use --results_dir to specify the output directory. --num_test will specify the maximum number of images to generate. By default, it loads the latest checkpoint. It can be changed using --which_epoch.

Code Structure

  • train.py, test.py: the entry point for training and testing.
  • trainers/contrastive_pix2pix_trainer.py: harnesses and reports the progress of training of contrastive model.
  • models/contrastive_pix2pix_model.py: creates the networks, and compute the losses
  • models/networks/: defines the architecture of all models
  • models/networks/loss: contains proposed PatchNCE loss
  • options/: creates option lists using argparse package. More individuals are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

Options

This code repo contains many options. Some options belong to only one specific model, and some options have different default values depending on other options. To address this, the BaseOption class dynamically loads and sets options depending on what model, network, and datasets are used. This is done by calling the static method modify_commandline_options of various classes. It takes in theparser of argparse package and modifies the list of options. For example, since COCO-stuff dataset contains a special label "unknown", when COCO-stuff dataset is used, it sets --contain_dontcare_label automatically at data/coco_dataset.py. You can take a look at def gather_options() of options/base_options.py, or models/network/__init__.py to get a sense of how this works.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{andonian2021contrastive,
  title={Contrastive Feature Loss for Image Prediction},
  author={Andonian, Alex and Park, Taesung and Russell, Bryan and Isola, Phillip and Zhu, Jun-Yan and Zhang, Richard},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1934--1943},
  year={2021}
}

Acknowledgments

This code borrows heavily from pix2pixHD and SPADE and CUT. We thank Jiayuan Mao for his Synchronized Batch Normalization code.

Owner
Alex Andonian
PhD student in EECS at MIT, studying topics in computer vision, machine learning, and AI.
Alex Andonian
Reinforcement Learning for Portfolio Management

qtrader Reinforcement Learning for Portfolio Management Why Reinforcement Learning? Learns the optimal action, rather than models the market. Adaptive

Angelos Filos 406 Jan 01, 2023
Practical Single-Image Super-Resolution Using Look-Up Table

Practical Single-Image Super-Resolution Using Look-Up Table [Paper] Dependency Python 3.6 PyTorch glob numpy pillow tqdm tensorboardx 1. Training deep

Younghyun Jo 116 Dec 23, 2022
Planar Prior Assisted PatchMatch Multi-View Stereo

ACMP [News] The code for ACMH is released!!! [News] The code for ACMM is released!!! About This repository contains the code for the paper Planar Prio

Qingshan Xu 127 Dec 31, 2022
A curated list of awesome deep long-tailed learning resources.

A curated list of awesome deep long-tailed learning resources.

vanint 210 Dec 25, 2022
DRIFT is a tool for Diachronic Analysis of Scientific Literature.

About DRIFT is a tool for Diachronic Analysis of Scientific Literature. The application offers user-friendly and customizable utilities for two modes:

Rajaswa Patil 108 Dec 12, 2022
Transparent Transformer Segmentation

Transparent Transformer Segmentation Introduction This repository contains the data and code for IJCAI 2021 paper Segmenting transparent object in the

谢恩泽 140 Jan 02, 2023
NeurIPS 2021 Datasets and Benchmarks Track

AP-10K: A Benchmark for Animal Pose Estimation in the Wild Introduction | Updates | Overview | Download | Training Code | Key Questions | License Intr

AP-10K 82 Dec 11, 2022
Measuring and Improving Consistency in Pretrained Language Models

ParaRel 🤘 This repository contains the code and data for the paper: Measuring and Improving Consistency in Pretrained Language Models as well as the

Yanai Elazar 26 Dec 02, 2022
Official repository for Hierarchical Opacity Propagation for Image Matting

HOP-Matting Official repository for Hierarchical Opacity Propagation for Image Matting 🚧 🚧 🚧 Under Construction 🚧 🚧 🚧 🚧 🚧 🚧   Coming Soon   

Li Yaoyi 54 Dec 30, 2021
"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

NAS-Bench-301 This repository containts code for the paper: "NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search". The

AutoML-Freiburg-Hannover 57 Nov 30, 2022
Pytorch reimplementation of the Mixer (MLP-Mixer: An all-MLP Architecture for Vision)

MLP-Mixer Pytorch reimplementation of Google's repository for the MLP-Mixer (Not yet updated on the master branch) that was released with the paper ML

Eunkwang Jeon 18 Dec 08, 2022
Api's bulid in Flask perfom to manage Todo Task.

Citymall-task Api's bulid in Flask perfom to manage Todo Task. Installation Requrements : Python: 3.10.0 MongoDB create .env file with variables DB_UR

Aisha Tayyaba 1 Dec 17, 2021
X-modaler is a versatile and high-performance codebase for cross-modal analytics.

X-modaler X-modaler is a versatile and high-performance codebase for cross-modal analytics. This codebase unifies comprehensive high-quality modules i

910 Dec 28, 2022
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI'22)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
Repository for MeshTalk supplemental material and code once the (already approved) 16 GHS captures our lab will make publicly available are released.

meshtalk This repository contains code to run MeshTalk for face animation from audio. If you use MeshTalk, please cite @inproceedings{richard2021mesht

Meta Research 221 Jan 06, 2023
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

225 Dec 25, 2022
The official PyTorch implementation of recent paper - SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

This repository is the official PyTorch implementation of SAINT. Find the paper on arxiv SAINT: Improved Neural Networks for Tabular Data via Row Atte

Gowthami Somepalli 284 Dec 21, 2022
A cross-lingual COVID-19 fake news dataset

CrossFake An English-Chinese COVID-19 fake&real news dataset from the ICDMW 2021 paper below: Cross-lingual COVID-19 Fake News Detection. Jiangshu Du,

Yingtong Dou 11 Dec 01, 2022
Repository accompanying the "Sign Pose-based Transformer for Word-level Sign Language Recognition" paper

by Matyáš Boháček and Marek Hrúz, University of West Bohemia Should you have any questions or inquiries, feel free to contact us here. Repository acco

Matyáš Boháček 30 Dec 30, 2022
Code for MarioNette: Self-Supervised Sprite Learning, in NeurIPS 2021

MarioNette | Webpage | Paper | Video MarioNette: Self-Supervised Sprite Learning Dmitriy Smirnov, Michaël Gharbi, Matthew Fisher, Vitor Guizilini, Ale

Dima Smirnov 28 Nov 18, 2022