[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

Overview

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

License: MIT

Codes for this paper The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models. [CVPR 2021]

Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang.

Overview

Can we aggressively trim down the complexity of pre-trained models, without damaging their downstream transferability?

Transfer Learning for Winning Tickets from Supervised and Self-supervised Pre-training

Downstream classification tasks.

Downstream detection and segmentation tasks.

Properties of Pre-training Tickets

Reproduce

Preliminary

Required environment:

  • pytorch >= 1.5.0
  • torchvision

Pre-trained Models

Pre-trained models are provided here.

imagenet_weight.pt # torchvision std model

moco.pt # pretrained moco v2 model (only contain encorder_q)

moco_v2_800ep_pretrain.pth.tar # pretrained moco v2 model (contain encorder_q&k)

simclr_weight.pt # (pretrained_simclr weight)

Task-Specific Tickets Finding

Remark. for both pre-training tasks and downstream tasks.

Iterative Magnitude Pruning

SimCLR task
cd SimCLR 
python -u main.py \
    [experiment name] \ 
    --gpu 0,1,2,3 \    
    --epochs 180 \
    --prun_epoch 10 \ # pruning for ( 1 + 180/10 iterations)
    --prun_percent 0.2 \
    --lr 1e-4 \
    --arch resnet50 \
    --batch_size 256 \
    --data [data direction] \
    --sim_model [pretrained_simclr_model] \
    --save_dir simclr_imp
MoCo task
cd MoCo
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_moco_imp.py \
	[Dataset Direction] \
	--pretrained_path [pretrained_moco_model] \
    -a resnet50 \
    --batch-size 256 \
    --dist-url 'tcp://127.0.0.1:5234' \
    --multiprocessing-distributed \
    --world-size 1 \
    --rank 0 \
    --mlp \
    --moco-t 0.2 \
    --aug-plus \
    --cos \
    --epochs 180 \
    --retrain_epoch 10 \ # pruning for ( 1 + 180/10 iterations)
    --save_dir moco_imp
Classification task on ImageNet
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_imp_imagenet.py \
	[Dataset Direction] \
	-a resnet50 \
	--epochs 10 \
	-b 256 \
	--lr 1e-4 \
	--states 19 \ # iterative pruning times 
	--save_dir imagenet_imp
Classification task on Visda2017
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_imp_visda.py \
	[Dataset Direction] \
	-a resnet50 \
	--epochs 20 \
	-b 256 \
	--lr 0.001 \
	--prune_type lt \ # lt or pt_trans
	--pre_weight [pretrained weight] \ # if pt_trans else None
	--states 19 \ # iterative pruning times
	--save_dir visda_imp
Classification task on small dataset
CUDA_VISIBLE_DEVICES=0 python -u main_imp_downstream.py \
	--data [dataset direction] \
	--dataset [dataset name] \#cifar10, cifar100, svhn, fmnist 
	--arch resnet50 \
	--pruning_times 19 \
	--prune_type [lt, pt, rewind_lt, pt_trans] \
	--save_dir imp_downstream \
	# --pretrained [pretrained weight if prune_type==pt_trans] \
	# --random_prune [if using random pruning] \
    # --rewind_epoch [rewind weight epoch if prune_type==rewind_lt] \

Transfer to Downstream Tasks

Small datasets: (e.g., CIFAR-10, CIFAR-100, SVHN, Fashion-MNIST)
CUDA_VISIBLE_DEVICES=0 python -u main_eval_downstream.py \
	--data [dataset direction] \
	--dataset [dataset name] \#cifar10, cifar100, svhn, fmnist 
	--arch resnet50 \
	--save_dir [save_direction] \
	--pretrained [init weight] \
	--dict_key state_dict [ dict_key in pretrained file, None means load all ] \
	--mask_dir [mask for ticket] \
	--reverse_mask \ #if want to reverse mask
Visda2017:
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_eval_visda.py \
	[data direction] \
	-a resnet50 \
	--epochs 20 \
	-b 256 \
	--lr 0.001 \
	--save_dir [save_direction] \
	--pretrained [init weight] \
	--dict_key state_dict [ dict_key in pretrained file, None means load all ] \
	--mask_dir [mask for ticket] \
	--reverse_mask \ #if want to reverse mask

Detection and Segmentation Experiments

Detials of YOLOv4 for detection are collected here.

Detials of DeepLabv3+ for segmentation are collected here.

Citation

@article{chen2020lottery,
  title={The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models},
  author={Chen, Tianlong and Frankle, Jonathan and Chang, Shiyu and Liu, Sijia and Zhang, Yang and Carbin, Michael and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2012.06908},
  year={2020}
}

Acknowledgement

https://github.com/google-research/simclr

https://github.com/facebookresearch/moco

https://github.com/VainF/DeepLabV3Plus-Pytorch

https://github.com/argusswift/YOLOv4-pytorch

https://github.com/yczhang1017/SSD_resnet_pytorch/tree/master

Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
VLG-Net: Video-Language Graph Matching Networks for Video Grounding

VLG-Net: Video-Language Graph Matching Networks for Video Grounding Introduction Official repository for VLG-Net: Video-Language Graph Matching Networ

Mattia Soldan 25 Dec 04, 2022
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition

Light-SERNet This is the Tensorflow 2.x implementation of our paper "Light-SERNet: A lightweight fully convolutional neural network for speech emotion

Arya Aftab 29 Nov 12, 2022
Active learning for Mask R-CNN in Detectron2

MaskAL - Active learning for Mask R-CNN in Detectron2 Summary MaskAL is an active learning framework that automatically selects the most-informative i

49 Dec 20, 2022
Medical image analysis framework merging ANTsPy and deep learning

ANTsPyNet A collection of deep learning architectures and applications ported to the python language and tools for basic medical image processing. Bas

Advanced Normalization Tools Ecosystem 118 Dec 24, 2022
Some code of the implements of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

3D-GMPDCNN Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network PyTorch implementation of "Geological Modeling Usin

5 Nov 21, 2022
Training DiffWave using variational method from Variational Diffusion Models.

Variational DiffWave Training DiffWave using variational method from Variational Diffusion Models. Quick Start python train_distributed.py discrete_10

Chin-Yun Yu 26 Dec 13, 2022
⚓ Eurybia monitor model drift over time and securize model deployment with data validation

View Demo · Documentation · Medium article 🔍 Overview Eurybia is a Python library which aims to help in : Detecting data drift and model drift Valida

MAIF 172 Dec 27, 2022
Code Release for Learning to Adapt to Evolving Domains

EAML Code release for "Learning to Adapt to Evolving Domains" (NeurIPS 2020) Prerequisites PyTorch = 0.4.0 (with suitable CUDA and CuDNN version) tor

23 Dec 07, 2022
[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Crafting Better Contrastive Views for Siamese Representation Learning (CVPR 2022 Oral) 2022-03-29: The paper was selected as a CVPR 2022 Oral paper! 2

249 Dec 28, 2022
Source code for the NeurIPS 2021 paper "On the Second-order Convergence Properties of Random Search Methods"

Second-order Convergence Properties of Random Search Methods This repository the paper "On the Second-order Convergence Properties of Random Search Me

Adamos Solomou 0 Nov 13, 2021
An official TensorFlow implementation of “CLCC: Contrastive Learning for Color Constancy” accepted at CVPR 2021.

CLCC: Contrastive Learning for Color Constancy (CVPR 2021) Yi-Chen Lo*, Chia-Che Chang*, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang,

Yi-Chen (Howard) Lo 58 Dec 17, 2022
Real-Time Semantic Segmentation in Mobile device

Real-Time Semantic Segmentation in Mobile device This project is an example project of semantic segmentation for mobile real-time app. The architectur

708 Jan 01, 2023
Encode and decode text application

Text Encoder and Decoder Encode and decode text in many ways using this application! Encode in: ASCII85 Base85 Base64 Base32 Base16 Url MD5 Hash SHA-1

Alice 1 Feb 12, 2022
Parameter-ensemble-differential-evolution - Shows how to do parameter ensembling using differential evolution.

Ensembling parameters with differential evolution This repository shows how to ensemble parameters of two trained neural networks using differential e

Sayak Paul 9 May 04, 2022
Arbitrary Distribution Modeling with Censorship in Real Time 59 2 60 3 Bidding Advertising for KDD'21

Arbitrary_Distribution_Modeling This repo implements the Neighborhood Likelihood Loss (NLL) and Arbitrary Distribution Modeling (ADM, with Interacting

7 Jan 03, 2023
A complete, self-contained example for training ImageNet at state-of-the-art speed with FFCV

ffcv ImageNet Training A minimal, single-file PyTorch ImageNet training script designed for hackability. Run train_imagenet.py to get... ...high accur

FFCV 92 Dec 31, 2022
SpecAugmentPyTorch - A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment An implementation of SpecAugment for Pytorch How to use Install pytorch, version=1.9.0 (new feature (torch.Tensor.take_along_dim) is used

IMLHF 3 Oct 11, 2022
Image-popularity-score - A novel deep regression method for image scoring.

Image-popularity-score - A novel deep regression method for image scoring.

Shoaib ahmed 1 Dec 26, 2021
OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

Tadas Baltrusaitis 5.8k Dec 31, 2022
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

Vítor Albiero 519 Dec 29, 2022