Official code for the CVPR 2021 paper "How Well Do Self-Supervised Models Transfer?"

Overview

How Well Do Self-Supervised Models Transfer?

This repository hosts the code for the experiments in the CVPR 2021 paper How Well Do Self-Supervised Models Transfer?

Requirements

This codebase has been tested with the following package versions:

python=3.6.8
torch=1.2.0
torchvision=0.4.0
PIL=7.1.2
numpy=1.18.1
scipy=1.2.1
pandas=1.0.3
tqdm=4.31.1
sklearn=0.22.2

Pre-trained Models

In the paper we evaluate 14 pre-trained ResNet50 models, 13 self-supervised and 1 supervised. To download and prepare all models in the same format, run:

python download_and_prepare_models.py

This will prepare the models in the same format and save them in a directory named models.

Note 1: For SimCLR-v1 and SimCLR-v2, the TensorFlow checkpoints need to be downloaded manually (using the links in the table below) and converted into PyTorch format (using https://github.com/tonylins/simclr-converter and https://github.com/Separius/SimCLRv2-Pytorch, respectively).

Note 2: In order to convert BYOL, you may need to install some packages by running:

pip install jax jaxlib dill git+https://github.com/deepmind/dm-haiku

Below are links to the pre-trained weights used.

Model URL
InsDis https://www.dropbox.com/sh/87d24jqsl6ra7t2/AACcsSIt1_Njv7GsmsuzZ6Sta/InsDis.pth
MoCo-v1 https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v1_200ep/moco_v1_200ep_pretrain.pth.tar
PCL-v1 https://storage.googleapis.com/sfr-pcl-data-research/PCL_checkpoint/PCL_v1_epoch200.pth.tar
PIRL https://www.dropbox.com/sh/87d24jqsl6ra7t2/AADN4jKnvTI0U5oT6hTmQZz8a/PIRL.pth
PCL-v2 https://storage.googleapis.com/sfr-pcl-data-research/PCL_checkpoint/PCL_v2_epoch200.pth.tar
SimCLR-v1 https://storage.cloud.google.com/simclr-gcs/checkpoints/ResNet50_1x.zip
MoCo-v2 https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v2_800ep/moco_v2_800ep_pretrain.pth.tar
SimCLR-v2 https://console.cloud.google.com/storage/browser/simclr-checkpoints/simclrv2/pretrained/r50_1x_sk0
SeLa-v2 https://dl.fbaipublicfiles.com/deepcluster/selav2_400ep_pretrain.pth.tar
InfoMin https://www.dropbox.com/sh/87d24jqsl6ra7t2/AAAzMTynP3Qc8mIE4XWkgILUa/InfoMin_800.pth
BYOL https://storage.googleapis.com/deepmind-byol/checkpoints/pretrain_res50x1.pkl
DeepCluster-v2 https://dl.fbaipublicfiles.com/deepcluster/deepclusterv2_800ep_pretrain.pth.tar
SwAV https://dl.fbaipublicfiles.com/deepcluster/swav_800ep_pretrain.pth.tar
Supervised We use weights from torchvision.models.resnet50(pretrained=True)

Datasets

There are several dataset classes defined in the datasets directory. The data is expected in a directory named data, located on the same level as this repository. Below is an outline of the expected file structure:

data/
    CIFAR10/
    DTD/
    ...
ssl-transfer/
    datasets/
    models/
    readme.md
    ...
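
Before launching an experiment, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper missing_dirs is hypothetical, and it only checks the directory names shown in the outline (it assumes the repository is checked out as ssl-transfer).

```python
from pathlib import Path

# Directory names taken from the outline above; the helper itself is hypothetical.
EXPECTED_DIRS = ("data", "ssl-transfer/datasets", "ssl-transfer/models")


def missing_dirs(parent):
    """Return the expected directories that do not exist under `parent`."""
    parent = Path(parent)
    return [d for d in EXPECTED_DIRS if not (parent / d).is_dir()]


if __name__ == "__main__":
    # Run from inside the repository, so its parent directory should also hold data/.
    for d in missing_dirs(Path.cwd().parent):
        print(f"missing: {d}")
```

An empty result means all three directories were found; individual dataset folders such as CIFAR10 are not checked.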

Many-shot (Linear)

We provide the code for our linear evaluation in linear.py.

To evaluate DeepCluster-v2 on CIFAR10 given our pre-computed best regularisation hyperparameter, run:

python linear.py --dataset cifar10 --model deepcluster-v2 --C 0.316

The test accuracy should be close to 94.07%, the value reported in Table 1 of the paper.

To evaluate the Supervised baseline, run:

python linear.py --dataset cifar10 --model supervised --C 0.056

This model should achieve close to 91.47%.

To search for the best regularisation hyperparameter on the validation set, exclude the --C argument:

python linear.py --dataset cifar10 --model supervised
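
To give a feel for what such a search covers, here is a rough sketch of a candidate grid for C. The grid is an assumption rather than something stated in this readme: 45 values, logarithmically spaced between 1e-6 and 1e5. It is at least consistent with the pre-computed values quoted above, since 0.316 and 0.056 both fall on a grid with a step of 0.25 in log10. The function c_grid is hypothetical; check linear.py for the grid actually used.

```python
# Assumed grid: 45 values, log-spaced from 1e-6 to 1e5 (a step of 0.25 in log10).
NUM_VALUES = 45
LOG_MIN, LOG_MAX = -6.0, 5.0


def c_grid(num=NUM_VALUES, log_min=LOG_MIN, log_max=LOG_MAX):
    """Return `num` log-spaced candidate values for the regularisation constant C."""
    step = (log_max - log_min) / (num - 1)
    return [10 ** (log_min + i * step) for i in range(num)]


if __name__ == "__main__":
    grid = c_grid()
    # Compare the values quoted in this readme with their nearest grid points.
    for target in (0.316, 0.056):
        nearest = min(grid, key=lambda c: abs(c - target))
        print(f"{target} -> nearest grid value {nearest:.4f}")
```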

Finally, when using SimCLR-v1 or SimCLR-v2, always use the --no-norm argument:

python linear.py --dataset cifar10 --model simclr-v1 --no-norm
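
When running many models, the --no-norm rule is easy to forget. The sketch below builds the command line for linear.py using only the flags shown above and appends --no-norm for the SimCLR models. The helper linear_command is hypothetical, and the model name simclr-v2 is assumed by analogy with simclr-v1.

```python
# Models for which the text above says --no-norm must always be passed.
# "simclr-v2" is an assumed spelling, by analogy with "simclr-v1".
NO_NORM_MODELS = {"simclr-v1", "simclr-v2"}


def linear_command(dataset, model, C=None):
    """Build the argument list for one linear evaluation run."""
    cmd = ["python", "linear.py", "--dataset", dataset, "--model", model]
    if C is not None:
        # Omitting --C triggers the hyperparameter search described above.
        cmd += ["--C", str(C)]
    if model in NO_NORM_MODELS:
        cmd.append("--no-norm")
    return cmd


if __name__ == "__main__":
    runs = [("deepcluster-v2", 0.316), ("supervised", 0.056), ("simclr-v1", None)]
    for model, C in runs:
        print(" ".join(linear_command("cifar10", model, C)))
```

The returned list can be passed directly to subprocess.run if the runs should be launched from Python.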

Many-shot (Finetune)

We provide code for finetuning in finetune.py.

To finetune DeepCluster-v2 on CIFAR10, run:

python finetune.py --dataset cifar10 --model deepcluster-v2

This model should achieve close to 97.06%, the value reported in Table 1 of the paper.

Few-shot (Kornblith & CD-FSL)

We provide the code for our few-shot evaluation in few_shot.py.

To evaluate DeepCluster-v2 on EuroSAT in a 5-way 5-shot setup, run:

python few_shot.py --dataset eurosat --model deepcluster-v2 --n-way 5 --n-support 5

The test accuracy should be close to 88.39% ± 0.49%, the value reported in Table 2 of the paper.

Or, to evaluate the Supervised baseline on ChestX in a 5-way 50-shot setup, run:

python few_shot.py --dataset chestx --model supervised --n-way 5 --n-support 50

This model should achieve close to 32.34% ± 0.45%.
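
The few-shot results are given as a mean accuracy with a ± margin. As an illustration of how such a figure can be computed, the sketch below assumes the margin is a 95% confidence interval over per-episode accuracies, using the normal approximation. This readme does not state how the margin is derived, so treat this as one plausible reading; mean_and_ci95 is a hypothetical helper and the episode values are made up.

```python
import math
import statistics


def mean_and_ci95(accuracies):
    """Return (mean, half-width of a 95% normal-approximation interval)."""
    mean = statistics.mean(accuracies)
    # Sample standard deviation; requires at least two episodes.
    std = statistics.stdev(accuracies)
    half_width = 1.96 * std / math.sqrt(len(accuracies))
    return mean, half_width


if __name__ == "__main__":
    # Made-up per-episode accuracies in percent, for illustration only.
    episodes = [88.0, 90.0, 86.0, 92.0, 84.0]
    mean, half_width = mean_and_ci95(episodes)
    print(f"{mean:.2f}% ± {half_width:.2f}%")
```

Because the half-width shrinks with the square root of the number of episodes, small differences between runs are expected when fewer episodes are sampled.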

Object Detection

We use the detectron2 framework to train our models on PASCAL VOC object detection.

Below is an outline of the expected file structure, including config files, converted models and the detectron2 framework:

detectron2/
    tools/
        train_net.py
        ...
    ...
ssl-transfer/
    detectron2-configs/
        finetune/
            byol.yaml
            ...
        frozen/
            byol.yaml
            ...
    models/
        detectron2/
            byol.pkl
            ...
        ...
    ...

To set it up, perform the following steps:

  1. Install detectron2 (requires PyTorch 1.5 or newer). We expect the installed framework to be located at the same level as this repository; see the outline of the expected file structure above.
  2. Convert the models into the format used by detectron2 by running python convert_to_detectron2.py. The converted models will be saved in a directory called detectron2 inside the models directory.

We include the config files for the frozen training in detectron2-configs/frozen and for full finetuning in detectron2-configs/finetune. In order to train models, navigate into detectron2/tools/. We can now train e.g. BYOL with a frozen backbone on 1 GPU by running:

./train_net.py --num-gpus 1 --config-file ../../ssl-transfer/detectron2-configs/frozen/byol.yaml OUTPUT_DIR ./output/byol-frozen

This model should achieve close to 82.01 AP50, the value reported in Table 3 of the paper.

Surface Normal Estimation

The code for running the surface normal estimation experiments is given in the surface-normal-estimation directory. We use the MIT CSAIL Semantic Segmentation Toolkit, and we also include a docker configuration file that can be used to build a container with all the dependencies installed. A model can be trained with a command like:

./scripts/train_finetune_models.sh <pretrained-model-path> <checkpoint-directory>

and the resulting model can be evaluated with

./scripts/test_models.sh <checkpoint-directory>

Semantic Segmentation

We use the same framework to perform semantic segmentation. As with the surface normal estimation experiments, we include a docker configuration file to make installing the dependencies easier. Before training a semantic segmentation model, you will need to change the paths in the relevant YAML configuration file to point to where you have stored the pre-trained models and datasets. Once this is done, the training script can be run with, e.g.,

python train.py --gpus 0,1 --cfg selfsupconfig/byol.yaml

where selfsupconfig/byol.yaml is the aforementioned configuration file. The resulting model can be evaluated with

python eval_multipro.py --gpus 0,1 --cfg selfsupconfig/byol.yaml

Citation

If you find our work useful for your research, please consider citing our paper:

@inproceedings{Ericsson2021HowTransfer,
    title = {{How Well Do Self-Supervised Models Transfer?}},
    year = {2021},
    booktitle = {CVPR},
    author = {Ericsson, Linus and Gouk, Henry and Hospedales, Timothy M.},
    url = {http://arxiv.org/abs/2011.13377},
    arxivId = {2011.13377}
}

If you have any questions, feel welcome to create an issue or contact Linus Ericsson.
