[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

Last update: Oct 26, 2022

Overview

SSVC

Source code for the AAAI 2022 paper "Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning".

Samples of the generated motion-preserved videos with threshold $\alpha=0.5$.

Requirements

python3
torch>=1.1
PIL
FrEIA==0.2 (flow-based model)
lintel==1.0 (decodes mp4 videos on the fly)
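
Before launching a long pre-training run it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository; it assumes the import names match the list above (`torch`, `PIL`, `FrEIA`, `lintel`) and only checks that each module can be located, not its version.

```python
import importlib.util

# Import names assumed from the Requirements list above (not verified against the repo).
REQUIRED_MODULES = ["torch", "PIL", "FrEIA", "lintel"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```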

Structure

backbone
data
- lists: train/val lists (.txt)
- augmentation.py: train/val data augmentation used during SSL pre-training
- vDataLoader.py: customize the paths to your data lists here
model
- advflow: flow-based model
- classifier.py: linear classifier for down-stream tasks
- infonce.py: combines S$^2$VC with MoCo
flow
- pre-trained flow-based model weights
utils
main_pretrain.py: entry point for self-supervised pre-training
main_eval.py: entry point for supervised fine-tuning and evaluation
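
For readers unfamiliar with the contrastive objective that MoCo uses, the InfoNCE loss for a single query can be sketched in plain Python as below. This is an illustration of the general formula only, not the code in `infonce.py`; the function name `info_nce_loss` and the default temperature of 0.07 (the MoCo default) are assumptions.

```python
import math


def info_nce_loss(pos_sim, neg_sims, temperature=0.07):
    """InfoNCE loss for one query.

    pos_sim: similarity between the query and its positive key.
    neg_sims: similarities between the query and the negative keys.
    The temperature value is an assumed default, not taken from the repo.
    """
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    # Log-sum-exp with max subtraction for numerical stability.
    max_logit = max(logits)
    log_denominator = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # Equivalent to -log(exp(pos) / sum(exp(all logits))).
    return log_denominator - logits[0]
```

The loss shrinks as the positive similarity grows relative to the negatives, which is what pulls two views of the same clip together in feature space.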

Self-supervised Pretrain

DDP

python -m torch.distributed.launch --nproc_per_node=1 --master_port 1234 main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0,1 --dataset XX

Single GPU

python main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0 --dataset XX

Evaluation

NN-Retrieval

python main_eval.py --retrieval --test SSL_Pt_Model_PTH --dataset XX --gpu X
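
Nearest-neighbour retrieval is commonly scored as top-k accuracy: each query clip counts as a hit if any of its k most similar gallery clips shares its class label. The sketch below illustrates that metric with cosine similarity on plain Python lists. It is an assumption about the protocol, not a description of what `main_eval.py --retrieval` does internally, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def topk_retrieval_accuracy(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose k nearest gallery items contain the query's label."""
    hits = 0
    for feat, label in zip(query_feats, query_labels):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(
            range(len(gallery_feats)),
            key=lambda i: cosine_similarity(feat, gallery_feats[i]),
            reverse=True,
        )
        if any(gallery_labels[i] == label for i in ranked[:k]):
            hits += 1
    return hits / len(query_feats)
```

In the usual setup the gallery holds features of training clips and the queries are test clips, both extracted with the frozen pre-trained backbone.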

Finetune

# fine-tune the whole model
python main_eval.py --train_what ft --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

# freeze the backbone and fine-tune only the last layer
python main_eval.py --train_what last --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

Test

python main_eval.py --train_what XX --ten_crop --test Sup_Ft_Model_PTH --gpu X \
--dataset XX --net r3d18 --img_dim 224 --seq_len 32
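
Assuming `--ten_crop` follows the common ten-crop convention (four corner crops plus a centre crop, each also flipped horizontally, with class scores averaged over the ten views), the idea can be sketched as follows. The README does not confirm how the flag is implemented, so the helper names and the box format `(left, top, right, bottom)` are illustrative only.

```python
def five_crop_boxes(width, height, crop):
    """Return the four corner boxes and the centre box as (left, top, right, bottom)."""
    if crop > width or crop > height:
        raise ValueError("crop size must not exceed the frame size")
    cx = (width - crop) // 2
    cy = (height - crop) // 2
    return [
        (0, 0, crop, crop),                            # top-left
        (width - crop, 0, width, crop),                # top-right
        (0, height - crop, crop, height),              # bottom-left
        (width - crop, height - crop, width, height),  # bottom-right
        (cx, cy, cx + crop, cy + crop),                # centre
    ]


def ten_crop_views(width, height, crop):
    """Pair each of the five boxes with a horizontal-flip flag, giving ten views."""
    boxes = five_crop_boxes(width, height, crop)
    return [(box, False) for box in boxes] + [(box, True) for box in boxes]


def average_scores(per_view_scores):
    """Average class scores over views; each inner list holds one view's scores."""
    n_views = len(per_view_scores)
    return [sum(column) / n_views for column in zip(*per_view_scores)]
```

The predicted class is then the index of the largest averaged score.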
