This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding)

Related tags

Deep LearningHCSC
Overview

HCSC: Hierarchical Contrastive Selective Coding

This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding), whose details are in this paper.

HCSC is an effective and efficient method to pre-train image encoders in a self-supervised fashion. In general, this method seeks to learn image representations with hierarchical semantic structures. It utilizes hierarchical K-means to derive hierarchical prototypes, and these prototypes represent the hierarchical semantics underlying the data. On such basis, we perform Instance-wise and Prototypical Contrastive Selective Coding to inject the information within hierarchical prototypes into image representations. HCSC has achieved SOTA performance on the self-supervised pre-training of CNNs (e.g., ResNet-50), and we will further study its potential on pre-training Vision Transformers.

Roadmap

  • [2022/02/01] The initial release! We release all source code for pre-training and downstream evaluation. We release three pre-trained ResNet-50 models: 200 epochs (single-crop), 200 epochs (multi-crop) and 400 epochs (single-crop, batch size: 256).

TODO

  • Finish the pre-training of 400 epochs ResNet-50 models (multi-crop) and release.
  • Finish the pre-training of 800 epochs ResNet-50 models (single- & multi-crop) and release.
  • Support Vision Transformer backbones.
  • Pre-train Vision Transformers with HCSC and release model weights under various configurations.

Model Zoo

We will continually release our pre-trained HCSC model weights and corresponding training configs. The current finished ones are as follows:

Backbone Method Crop Epoch Batch size Lincls top-1 Acc. KNN top-1 Acc. url config
ResNet-50 HCSC Single 200 256 69.2 60.7 model config
ResNet-50 HCSC Multi 200 256 73.3 66.6 model config
ResNet-50 HCSC Single 400 256 70.6 63.4 model config

Installation

Use following command to install dependencies (python3.7 with pip installed):

pip3 install -r requirement.txt

If having trouble installing PyTorch, follow the original guidance (https://pytorch.org/). Notably, the code is tested with cudatoolkit version 10.2.

Pre-training on ImageNet

Download ImageNet dataset under [ImageNet Folder]. Go to the path "[ImageNet Folder]/val" and use this script to build sub-folders.

To train single-crop HCSC on 8 Tesla-V100-32GB GPUs for 200 epochs, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=8 \
pretrain.py [your ImageNet Folder]

To train multi-crop HCSC on 8 Tesla-V100-32GB GPUs for 200 epochs, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=8 \
pretrain.py --multicrop [your ImageNet Folder]

Downstream Evaluation

Evaluation: Linear Classification on ImageNet

With a pre-trained model, to train a supervised linear classifier with all available GPUs, run:

python3 eval_lincls_imagenet.py --data [your ImageNet Folder] \
--dist-url tcp://localhost:10001 --world-size 1 --rank 0 \
--pretrained [your pre-trained model (example:out.pth)]

Evaluation: KNN Evaluation on ImageNet

To reproduce the KNN evaluation results with a pre-trained model using a single GPU, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=1 eval_knn.py \
--checkpoint_key state_dict \
--pretrained [your pre-trained model] \
--data [your ImageNet Folder]

Evaluation: Semi-supervised Learning on ImageNet

To fine-tune a pre-trained model with 1% or 10% ImageNet labels with 8 Tesla-V100-32GB GPUs, run:

1% of labels:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port [your port] eval_semisup.py \
--labels_perc 1 \
--pretrained [your pretrained weights] \
[your ImageNet Folder]

10% of labels:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port [your port] eval_semisup.py \
--labels_perc 10 \
--pretrained [your pretrained weights] \
[your ImageNet Folder]

Evaluation: Transfer Learning - Classification on VOC / Places205

VOC

1. Download the VOC dataset.
2. Finetune and evaluate on PASCAL VOC (with a single GPU):
cd voc_cls/ 
python3 main.py --data [your voc data folder] \
--pretrained [your pretrained weights]

Places205

1. Download the Places205 dataset (resized 256x256 version)
2. Linear Classification on Places205 (with all available GPUs):
python3 eval_lincls_places.py --data [your places205 data folder] \
--data-url tcp://localhost:10001 \
--pretrained [your pretrained weights]

Evaluation: Transfer Learning - Object Detection on VOC / COCO

1. Download VOC and COCO Dataset (under ./detection/datasets).

2. Install detectron2.

3. Convert a pre-trained model to the format of detectron2:

cd detection
python3 convert-pretrain-to-detectron2.py [your pretrained weight] out.pkl

4. Train on PASCAL VOC/COCO:

Finetune and evaluate on VOC (with 8 Tesla-V100-32GB GPUs):
cd detection
python3 train_net.py --config-file ./configs/pascal_voc_R_50_C4_24k_hcsc.yaml \
--num-gpus 8 MODEL.WEIGHTS out.pkl
Finetune and evaluate on COCO (with 8 Tesla-V100-32GB GPUs):
cd detection
python3 train_net.py --config-file ./configs/coco_R_50_C4_2x_hcsc.yaml \
--num-gpus 8 MODEL.WEIGHTS out.pkl

Evaluation: Clustering Evaluation on ImageNet

To reproduce the clustering evaluation results with a pre-trained model using all available GPUs, run:

python3 eval_clustering.py --dist-url tcp://localhost:10001 \
--multiprocessing-distributed --world-size 1 --rank 0 \
--num-cluster [target num cluster] \
--pretrained [your pretrained model weights] \
[your ImageNet Folder]

In the experiments of our paper, we set --num-cluster as 25000 and 1000.

License

This repository is released under the MIT license as in the LICENSE file.

Citation

If you find this repository useful, please kindly consider citing the following paper:

@article{guo2022hcsc,
  title={HCSC: Hierarchical Contrastive Selective Coding},
  author={Guo, Yuanfan and Xu, Minghao and Li, Jiawen and Ni, Bingbing and Zhu, Xuanyu and Sun, Zhenbang and Xu, Yi},
  journal={arXiv preprint arXiv:2202.00455},
  year={2022}
}
Owner
YUANFAN GUO
From SJTU. Working on self-supervised pre-training.
YUANFAN GUO
Answering Open-Domain Questions of Varying Reasoning Steps from Text

This repository contains the authors' implementation of the Iterative Retriever, Reader, and Reranker (IRRR) model in the EMNLP 2021 paper "Answering Open-Domain Questions of Varying Reasoning Steps

26 Dec 22, 2022
Links to works on deep learning algorithms for physics problems, TUM-I15 and beyond

Links to works on deep learning algorithms for physics problems, TUM-I15 and beyond

Nils Thuerey 1.3k Jan 08, 2023
GEA - Code for Guided Evolution for Neural Architecture Search

Efficient Guided Evolution for Neural Architecture Search Usage Create a conda e

6 Jan 03, 2023
Improving Object Detection by Label Assignment Distillation

Improving Object Detection by Label Assignment Distillation This is the official implementation of the WACV 2022 paper Improving Object Detection by L

Cybercore Co. Ltd 51 Dec 08, 2022
JAX + dataclasses

jax_dataclasses jax_dataclasses provides a wrapper around dataclasses.dataclass for use in JAX, which enables automatic support for: Pytree registrati

Brent Yi 35 Dec 21, 2022
Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Duong H. Le 18 Jun 13, 2022
This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR

This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR,which is an open-source toolbox based on PyTorch. The overall architecture will be sh

Jianquan Ye 82 Nov 17, 2022
Facial expression detector

A tensorflow convolutional neural network model to detect facial expressions.

Carlos Tardón Rubio 5 Apr 20, 2022
This is the official code of our paper "Diversity-based Trajectory and Goal Selection with Hindsight Experience Relay" (PRICAI 2021)

Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay This is the official implementation of our paper "Diversity-based Traje

Tianhong Dai 6 Jul 18, 2022
Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks]

Neural Architecture Search for Spiking Neural Networks Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks] (https

Intelligent Computing Lab at Yale University 28 Nov 18, 2022
RANZCR-CLiP 7th Place Solution

RANZCR-CLiP 7th Place Solution This repository is WIP. (18 Mar 2021) Installation git clone https://github.com/analokmaus/kaggle-ranzcr-clip-public.gi

Hiroshechka Y 21 Oct 22, 2022
Rainbow DQN implementation that outperforms the paper's results on 40% of games using 20x less data 🌈

Rainbow 🌈 An implementation of Rainbow DQN which outperforms the paper's (Hessel et al. 2017) results on 40% of tested games while using 20x less dat

Dominik Schmidt 31 Dec 21, 2022
Code accompanying "Adaptive Methods for Aggregated Domain Generalization"

Adaptive Methods for Aggregated Domain Generalization (AdaClust) Official Pytorch Implementation of Adaptive Methods for Aggregated Domain Generalizat

Xavier Thomas 15 Sep 20, 2022
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least com

Nikolas B Virionis 2 Aug 01, 2022
なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモ

FaceDetection-Anti-Spoof-Demo なりすまし検出(anti-spoof-mn3)のWebカメラ向けデモです。 モデルはPINTO_model_zoo/191_anti-spoof-mn3からONNX形式のモデルを使用しています。 Requirement mediapipe

KazuhitoTakahashi 8 Nov 18, 2022
This is a demo app to be used in the video streaming applications

MoViDNN: A Mobile Platform for Evaluating Video Quality Enhancement with Deep Neural Networks MoViDNN is an Android application that can be used to ev

ATHENA Christian Doppler (CD) Laboratory 7 Jul 21, 2022
Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences forImage-Text Retrieval

NSGDC Some codes in this repo are copied/modified from opensource implementations made available by UNITER, PyTorch, HuggingFace, OpenNMT, and Nvidia.

Zhihao Fan 2 Nov 07, 2022
using STGCN to achieve egg classification task

EEG Classification   The task requires us to classify electroencephalography(EEG) into six categories, including human body, human face, animal body,

4 Jun 13, 2022
Convolutional Neural Network for Text Classification in Tensorflow

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post. It is slightly simplified implementation of Kim's Convo

Denny Britz 5.5k Jan 02, 2023
Generate images from texts. In Russian

ruDALL-E Generate images from texts pip install rudalle==1.1.0rc0 🤗 HF Models: ruDALL-E Malevich (XL) ruDALL-E Emojich (XL) (readme here) ruDALL-E S

AI Forever 1.6k Dec 31, 2022