official implemntation for "Contrastive Learning with Stronger Augmentations"

Overview

CLSA

CLSA is a self-supervised learning methods which focused on the pattern learning from strong augmentations.

Copyright (C) 2020 Xiao Wang, Guo-Jun Qi

License: MIT for academic use.

Contact: Guo-Jun Qi ([email protected])

Introduction

Representation learning has been greatly improved with the advance of contrastive learning methods. Those methods have greatly benefited from various data augmentations that are carefully designated to maintain their identities so that the images transformed from the same instance can still be retrieved. However, those carefully designed transformations limited us to further explore the novel patterns carried by other transformations. To pave this gap, we propose a general framework called Contrastive Learning with Stronger Augmentations(CLSA) to complement current contrastive learning approaches. As found in our experiments, the distortions induced from the stronger make the transformed images can not be viewed as the same instance any more. Thus, we propose to minimize the distribution divergence between the weakly and strongly augmented images over the representation bank to supervise the retrieval of strongly augmented queries from a pool of candidates. Experiments on ImageNet dataset and downstream datasets showed the information from the strongly augmented images can greatly boost the performance. For example, CLSA achieves top-1 accuracy of 76.2% on ImageNet with a standard ResNet-50 architecture with a single-layer classifier fine-tuned, which is almost the same level as 76.5% of supervised results.

Installation

CUDA version should be 10.1 or higher.

1. Install git

2. Clone the repository in your computer

git clone [email protected]:maple-research-lab/CLSA.git && cd CLSA

3. Build dependencies.

You have two options to install dependency on your computer:

3.1 Install with pip and python(Ver 3.6.9).

3.1.1install pip.
3.1.2 Install dependency in command line.
pip install -r requirements.txt --user

If you encounter any errors, you can install each library one by one:

pip install torch==1.7.1
pip install torchvision==0.8.2
pip install numpy==1.19.5
pip install Pillow==5.1.0
pip install tensorboard==1.14.0
pip install tensorboardX==1.7

3.2 Install with anaconda

3.2.1 install conda.
3.2.2 Install dependency in command line
conda create -n CLSA python=3.6.9
conda activate CLSA
pip install -r requirements.txt 

Each time when you want to run my code, simply activate the environment by

conda activate CLSA
conda deactivate(If you want to exit) 

4 Prepare the ImageNet dataset

4.1 Download the ImageNet2012 Dataset under "./datasets/imagenet2012".
4.2 Go to path "./datasets/imagenet2012/val"
4.3 move validation images to labeled subfolders, using the following shell script

Usage

Unsupervised Training

This implementation only supports multi-gpu, DistributedDataParallel training, which is faster and simpler; single-gpu or DataParallel training is not supported.

Single Crop

1 Without symmetrical loss
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 --size_crops 224 96 --min_scale_crops 0.2 0.086 --max_scale_crops 1.0 0.429 --pick_strong 1 --pick_weak 0 --clsa_t 0.2 --sym 0

Here the [data_path] should be the root directory of imagenet dataset.

2 With symmetrical loss (Not verified)
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 --size_crops 224 96 --min_scale_crops 0.2 0.086 --max_scale_crops 1.0 0.429 --pick_strong 1 --pick_weak 0 --clsa_t 0.2 --sym 1

Here the [data_path] should be the root directory of imagenet dataset.

Multi Crop

1 Without symmetrical loss
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 1 1 1 --size_crops 224 192 160 128 96 --min_scale_crops 0.2 0.172 0.143 0.114 0.086 --max_scale_crops 1.0 0.86 0.715 0.571 0.429 --pick_strong 0 1 2 3 4 --pick_weak 0 1 2 3 4 --clsa_t 0.2 --sym 0

Here the [data_path] should be the root directory of imagenet dataset.

2 With symmetrical loss (Not verified)
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 1 1 1 --size_crops 224 192 160 128 96 --min_scale_crops 0.2 0.172 0.143 0.114 0.086 --max_scale_crops 1.0 0.86 0.715 0.571 0.429 --pick_strong 0 1 2 3 4 --pick_weak 0 1 2 3 4 --clsa_t 0.2 --sym 1

Here the [data_path] should be the root directory of imagenet dataset.

Linear Classification

With a pre-trained model, we can easily evaluate its performance on ImageNet with:

python3 lincls.py --data=./datasets/imagenet2012 --dist-url=tcp://localhost:10001 --pretrained=[pretrained_model_path]

[pretrained_model_path] should be the Imagenet pretrained model path.

Performance:

pre-train
network
pre-train
epochs
Crop CLSA
top-1 acc.
Model
Link
ResNet-50 200 Single 69.4 model
ResNet-50 200 Multi 73.3 model
ResNet-50 800 Single 72.2 model
ResNet-50 800 Multi 76.2 None

Really sorry that we can't provide CLSA* 800 epochs' model, which is because that we train it with 32 internal GPUs and we can't download it because of company regulations. For downstream tasks, we found multi-200epoch model also had similar performance. Thus, we suggested you to use this model for downstream purposes.

Transfering to VOC07 Classification

1 Download Dataset under "./datasets/voc"

2 Linear Evaluation:

cd VOC_CLF
python3 main.py --data=[VOC_dataset_dir] --pretrained=[pretrained_model_path]

Here VOC directory should be the directory includes "vockit" directory; [VOC_dataset_dir] is the VOC dataset path; [pretrained_model_path] is the imagenet pretrained model path.

Transfer to Object Detection

1. Install detectron2.

2. Convert a pre-trained CLSA model to detectron2's format:

# in detection folder
python3 convert-pretrain-to-detectron2.py input.pth.tar output.pkl

3. download VOC Dataset and COCO Dataset under "./detection/datasets" directory,

following the directory structure requried by detectron2.

4. Run training:

4.1 Pascal detection
cd detection
python train_net.py --config-file configs/pascal_voc_R_50_C4_24k_CLSA.yaml  --num-gpus 8 MODEL.WEIGHTS ./output.pkl
4.2 COCO detection
   cd detection
   python train_net.py --config-file configs/coco_R_50_C4_2x_clsa.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl

Citation:

Contrastive Learning with Stronger Augmentations

@article{wang2021CLSA,
  title={Contrastive Learning with Stronger Augmentations},
  author={Wang, Xiao and Qi, Guo-Jun},
  journal={arXiv preprint arXiv:},
  year={2021}
}
Owner
Lab for MAchine Perception and LEarning (MAPLE)
Lab for MAchine Perception and LEarning (MAPLE)
Convert Python 3 code to CUDA code.

Py2CUDA Convert python code to CUDA. Usage To convert a python file say named py_file.py to CUDA, run python generate_cuda.py --file py_file.py --arch

Yuval Rosen 3 Jul 14, 2021
A whale detector design for the Kaggle whale-detector challenge!

CNN (InceptionV1) + STFT based Whale Detection Algorithm So, this repository is my PyTorch solution for the Kaggle whale-detection challenge. The obje

Tarin Ziyaee 92 Sep 28, 2021
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Phil Wang 383 Jan 02, 2023
Fast and robust clustering of point clouds generated with a Velodyne sensor.

Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo

Photogrammetry & Robotics Bonn 957 Dec 21, 2022
Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation [Arxiv] [Video] Evaluation code for Unrestricted Facial Geometry Reconstr

Matan Sela 242 Dec 30, 2022
Distance Encoding for GNN Design

Distance-encoding for GNN design This repository is the official PyTorch implementation of the DEGNN and DEAGNN framework reported in the paper: Dista

172 Nov 08, 2022
Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)

Pytorch Code for VideoLT [Website][Paper] Updates [10/29/2021] Features uploaded to Google Drive, for access please send us an e-mail: zhangxing18 at

Skye 26 Sep 18, 2022
A simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)

this is a simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)

crispengari 5 Dec 09, 2021
UPSNet: A Unified Panoptic Segmentation Network

UPSNet: A Unified Panoptic Segmentation Network Introduction UPSNet is initially described in a CVPR 2019 oral paper. Disclaimer This repository is te

Uber Research 622 Dec 26, 2022
List of all dependencies affected by node-ipc malicious commit

node-ipc-dependencies-list List of all dependencies affected by node-ipc malicious commit as of 17/3/2022 - 19/3/2022 (timestamp) Please improve upon

99 Oct 15, 2022
Deep Halftoning with Reversible Binary Pattern

Deep Halftoning with Reversible Binary Pattern ICCV Paper | Project Website | BibTex Overview Existing halftoning algorithms usually drop colors and f

Menghan Xia 17 Nov 22, 2022
MediaPipe is a an open-source framework from Google for building multimodal

MediaPipe is a an open-source framework from Google for building multimodal (eg. video, audio, any time series data), cross platform (i.e Android, iOS, web, edge devices) applied ML pipelines. It is

Bhavishya Pandit 3 Sep 30, 2022
An OpenAI Gym environment for multi-agent car racing based on Gym's original car racing environment.

Multi-Car Racing Gym Environment This repository contains MultiCarRacing-v0 a multiplayer variant of Gym's original CarRacing-v0 environment. This env

Igor Gilitschenski 56 Nov 01, 2022
PyTorch-Geometric Implementation of MarkovGNN: Graph Neural Networks on Markov Diffusion

MarkovGNN This is the official PyTorch-Geometric implementation of MarkovGNN paper under the title "MarkovGNN: Graph Neural Networks on Markov Diffusi

HipGraph: High-Performance Graph Analytics and Learning 6 Sep 23, 2022
Efficient Householder transformation in PyTorch

Efficient Householder Transformation in PyTorch This repository implements the Householder transformation algorithm for calculating orthogonal matrice

Anton Obukhov 49 Nov 20, 2022
Picasso: a methods for embedding points in 2D in a way that respects distances while fitting a user-specified shape.

Picasso Code to generate Picasso embeddings of any input matrix. Picasso maps the points of an input matrix to user-defined, n-dimensional shape coord

Pachter Lab 45 Dec 23, 2022
[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

TorchSemiSeg [CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision by Xiaokang Chen1, Yuhui Yuan2, Gang Zeng1, Jingdong Wang

Chen XiaoKang 387 Jan 08, 2023
A Python training and inference implementation of Yolov5 helmet detection in Jetson Xavier nx and Jetson nano

yolov5-helmet-detection-python A Python implementation of Yolov5 to detect head or helmet in the wild in Jetson Xavier nx and Jetson nano. In Jetson X

12 Dec 05, 2022
Sparse-dense operators implementation for Paddle

Sparse-dense operators implementation for Paddle This module implements coo, csc and csr matrix formats and their inter-ops with dense matrices. Feel

北海若 3 Dec 17, 2022
Automatic tool focused on deriving metallicities of open clusters

metalcode Automatic tool focused on deriving metallicities of open clusters. Based on the method described in Pöhnl & Paunzen (2010, https://ui.adsabs

2 Dec 13, 2021