ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Last update: Nov 25, 2022

Related tags

Overview

ViSER

Installation with conda

conda env create -f viser.yml
conda activate viser-release
# install softras
cd third_party/softras; python setup.py install; cd -;
# install manifold remeshing
git clone --recursive git://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release;make -j8; cd ../../

Data preparation

Create folders to store intermediate data and training logs

mkdir log; mkdir tmp;

Download pre-processed data (rgb, mask, flow) following the link here and unzip under ./database/DAVIS/. The dataset is organized as:

DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFw/
        Full-Resolution/
            sequence-name/ and optionally seqname-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg

To run preprocessing scripts on other videos, see install.md.

Example: breakdance-flare

Run

bash scripts/template.sh breakdance-flare

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36

Example outputs:

Example: elephants

Run

bash scripts/relephant-walk.sh

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth

Additional Notes

Distributed training

The current codebase supports single-node multi-gpu training with pytorch distributed data-parallel. Please modify dev and ngpu in scripts/template.sh to select devices.

Potential bugs

When setting batch_size to 3, rendered flow may become constant values.

Acknowledgement

The code borrows the skeleton of CMR

External repos:

Citation

To cite our paper

@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Liu, Ce
      and Ramanan, Deva},
  booktitle = {NeurIPS},
  year={2021}
}

@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Chang, Huiwen
      and Ramanan, Deva
      and Freeman, William T
      and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}

TODO

data pre-processing scripts
evaluation data and scripts
code clean up

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Related tags

Overview

ViSER

Installation with conda

Data preparation

Example: breakdance-flare

Example: elephants

Additional Notes

Acknowledgement

Citation

TODO

Owner

Gengshan Yang

[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

Here is the diagnostic tool for BMVC 2021 paper Diagnosing Errors in Video Relation Detectors.

Franka Emika Panda manipulator kinematics&dynamics simulation

Adversarial examples to the new ConvNeXt architecture

Multi-Task Learning as a Bargaining Game

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021

Language Used: Python . Made in Jupyter(Anaconda) notebook.

Vertex AI: Serverless framework for MLOPs (ESP / ENG)

Liver segmentation using MONAI and pytorch

General purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends)

(to be released) [NeurIPS'21] Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

null

Sky Computing: Accelerating Geo-distributed Computing in Federated Learning

A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️

Tooling for the Common Objects In 3D dataset.

An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

This is the code repository for the paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (NeurIPS 2021).

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way :chestnut:

Official public repository of paper "Intention Adaptive Graph Neural Network for Category-Aware Session-Based Recommendation"

Simulation code and tutorial for BBHnet training data