Official Pytorch implementation for video neural representation (NeRV)

Last update: Dec 28, 2022

Related tags

Overview

NeRV: Neural Representations for Videos (NeurIPS 2021)

Project Page | Paper | UVG Data

Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava
This is the official implementation of the paper "NeRV: Neural Representations for Videos ".

Get started

We run with Python 3.8, you can set up a conda environment with all dependencies like so:

pip install -r requirements.txt

High-Level structure

The code is organized as follows:

train_nerv.py includes a generic traiing routine.
model_nerv.py contains the dataloader and neural network architecure
data/ directory video/imae dataset, we provide big buck bunny here
checkpoint/ directory contains some pre-trained model on big buck bunny dataset
log files (tensorboard, txt, state_dict etc.) will be saved in output directory (specified by --outf)

Reproducing experiments

Training experiments

The NeRV-S experiment on 'big buck bunny' can be reproduced with

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none --act swish

Evaluation experiments

To evaluate pre-trained model, just add --eval_Only and specify model path with --weight, you can specify model quantization with --quant_bit [bit_lenght], yuo can test decoding speed with --eval_fps, below we preovide sample commends for NeRV-S on bunny dataset

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none  --act swish \
    --weight checkpoints/nerv_S.pth --eval_only

Dump predictions with pre-trained model

To evaluate pre-trained model, just add --eval_Only and specify model path with --weight

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none  --act swish \
   --weight checkpoints/nerv_S.pth --eval_only  --dump_images

Citation

If you find our work useful in your research, please cite:

@inproceedings{hao2021nerv,
    author = {Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava },
    title = {NeRV: Neural Representations for Videos s},
    booktitle = {NeurIPS},
    year={2021}
}

Contact

If you have any questions, please feel free to email the authors.

Official Pytorch implementation for video neural representation (NeRV)

Related tags

Overview

NeRV: Neural Representations for Videos (NeurIPS 2021)

Project Page | Paper | UVG Data

Get started

High-Level structure

Reproducing experiments

Training experiments

Evaluation experiments

Dump predictions with pre-trained model

Citation

Contact

Owner

hao

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

This is a collection of all challenges in HKCERT CTF 2021

Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.

Face recognize system

tinykernel - A minimal Python kernel so you can run Python in your Python

Keras code and weights files for popular deep learning models.

(CVPR2021) ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Code in both PyTorch and TensorFlow

Implicit Model Specialization through DAG-based Decentralized Federated Learning

Python scripts for performing lane detection using the LSTR model in ONNX

This project demonstrates the use of neural networks and computer vision to create a classifier that interprets the Brazilian Sign Language.

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Advancing mathematics by guiding human intuition with AI

The repository offers the official implementation of our paper in PyTorch.

Scalable, event-driven, deep-learning-friendly backtesting library

Python-based Informatics Kit for Analysing Chemical Units

A supplementary code for Editable Neural Networks, an ICLR 2020 submission.

PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."