Motion Reconstruction Code and Data for Skills from Videos (SFV)

Last update: Dec 01, 2022

This repo contains the data and code for the motion reconstruction component of the SFV paper:

SFV: Reinforcement Learning of Physical Skills from Videos
Transactions on Graphics (Proc. ACM SIGGRAPH Asia 2018)
Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine
University of California, Berkeley

Project Page

Data

The data for the videos can be found at this link.
It contains:

Input videos
Intermediate 2D OpenPose, tracks, and HMR outputs
Result videos showing before and after motion reconstruction
Output of motion reconstruction in BVH format, used to train the character

See the README in the tar file for more details.

Requirements

TensorFlow
SMPL
A models/ directory with the same structure as in HMR (you need the trained models and neutral_smpl_with_cocoplus_reg.pkl)
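Before running anything, it can help to confirm that the SMPL file named above is in place. The snippet below is a minimal sketch, not part of this repo: the helper name `missing_model_files` and the default `models` path are assumptions for illustration, and only the file name comes from the requirement above. The trained HMR models are not checked, because their file names are not listed here.

```python
from pathlib import Path

# Only this file name comes from the README; everything else is illustrative.
REQUIRED_FILES = ["neutral_smpl_with_cocoplus_reg.pkl"]


def missing_model_files(models_dir="models", required=REQUIRED_FILES):
    """Return the required file names that are not present under models_dir."""
    root = Path(models_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from models/:", ", ".join(missing))
    else:
        print("models/ contains the required SMPL file.")
```

Extend `REQUIRED_FILES` with the checkpoint names from your own HMR download if you want the check to cover them too.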

Rotation augmented models

This repo uses fine-tuned models for OpenPose and HMR with rotation augmentation. The models used can be found here: ft-OpenPose, ft-HMR

Steps to run:

1. python -m run_openpose
2. python -m refine_video

I recommend starting with the preprocessed data packaged at the link above and running python -m refine_video first. Then run step 1 (python -m run_openpose) on your own video.
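If you want to script the two steps, the following is a small sketch of a driver that can start from either one. The names `build_commands` and `run_pipeline` are hypothetical helpers, not part of this repo. No command-line flags are passed, because none are documented here, and the script is assumed to run from the repo root so that the two modules resolve.

```python
import subprocess
import sys

# Module names taken from the steps above, in the order they are run.
STEPS = ["run_openpose", "refine_video"]


def build_commands(start_at="run_openpose", python=sys.executable):
    """Build the `python -m <module>` commands from start_at onward."""
    first = STEPS.index(start_at)  # raises ValueError for an unknown step
    return [[python, "-m", module] for module in STEPS[first:]]


def run_pipeline(start_at="run_openpose"):
    """Run each remaining step in order, stopping at the first failure."""
    for cmd in build_commands(start_at):
        subprocess.run(cmd, check=True)
```

With the preprocessed data, `run_pipeline(start_at="refine_video")` skips the OpenPose step. For your own video, call `run_pipeline()` to run both.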

Comments

Note that this repo is more of a research code demo than my other project code releases. It's also slightly dated. I'm putting this out there in case it is useful for others. You may need to fix some quirks.

Pull requests/contributions welcome!

License

This particular repo is under the BSD license, but please follow the license agreements for the tools that I build on, such as SMPL and OpenPose.

June 28 2019.

In this repo, motion reconstruction smooths the HMR output. We recently released the demo for Human Mesh and Motion Recovery (HMMR), which will give you smoother outputs. You can apply motion reconstruction on top of the HMMR outputs, which will be a better starting point. This would probably be the best combination of the tools out there today.
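To make "smoother" concrete, here is a toy NumPy sketch that measures frame-to-frame jitter in a pose sequence and reduces it with a moving average. This is not the optimization used in this repo. The function names, the `(T, D)` array layout (T frames, D pose parameters), and the moving-average filter are all assumptions chosen for illustration.

```python
import numpy as np


def temporal_smoothness(poses):
    """Mean squared frame-to-frame change of a (T, D) pose array."""
    poses = np.asarray(poses, dtype=float)
    diffs = poses[1:] - poses[:-1]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))


def moving_average(poses, window=5):
    """Smooth each pose dimension with a centered, edge-padded moving average."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the output keeps T frames")
    poses = np.asarray(poses, dtype=float)
    pad = window // 2
    # Repeat the first and last frames so the output has as many frames as the input.
    padded = np.pad(poses, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    columns = [np.convolve(padded[:, d], kernel, mode="valid")
               for d in range(poses.shape[1])]
    return np.stack(columns, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "per-frame" estimates: a slow ramp plus independent noise per frame.
    ramp = np.linspace(0.0, 1.0, 100)[:, None] * np.ones((1, 3))
    noisy = ramp + rng.normal(scale=0.1, size=ramp.shape)
    print("jitter before:", temporal_smoothness(noisy))
    print("jitter after: ", temporal_smoothness(moving_average(noisy)))
```

A jitter term like `temporal_smoothness` is the kind of quantity a smoothing objective can penalize. Per-frame predictors tend to score high on it because each frame is estimated independently.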

I'm also using 2D poses from OpenPose here, along with my own hacky tracking code. However, there are more recent tools such as AlphaPose and PoseFlow that will compute the tracklets for you. (We use these in the HMMR codebase.)

Fitting the HMMR output to DensePose output would be another simple loss function to add to the motion reconstruction to get a good 3D body fit to a video.

All of these would be a good starter project ;)

Another practical improvement that should be made concerns rendering: this repo uses the OpenDR renderer to render the results, which is slow and takes up most of the run time. In HMMR we use the PyTorch NMR (https://github.com/daniilidis-group/neural_renderer) to render the results. The same logic can be adapted here.

Citation

If you use this code for your research, please consider citing:

@article{
	2018-TOG-SFV,
	author = {Peng, Xue Bin and Kanazawa, Angjoo and Malik, Jitendra and Abbeel, Pieter and Levine, Sergey},
	title = {SFV: Reinforcement Learning of Physical Skills from Videos},
	journal = {ACM Trans. Graph.},
	volume = {37},
	number = {6},
	month = nov,
	year = {2018},
	articleno = {178},
	numpages = {14},
	publisher = {ACM},
	address = {New York, NY, USA},
	keywords = {physics-based character animation, computer vision, video imitation, reinforcement learning, motion reconstruction}
} 
@inProceedings{kanazawaHMR18,
  title={End-to-end Recovery of Human Shape and Pose},
  author = {Angjoo Kanazawa
  and Michael J. Black
  and David W. Jacobs
  and Jitendra Malik},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}
