We are More than Our JOints: Predicting How 3D Bodies Move

Overview

We are More than Our JOints: Predicting How 3D Bodies Move

Citation

This repo contains the official implementation of our paper MOJO:

@inproceedings{Zhang:CVPR:2021,
  title = {We are More than Our Joints: Predicting how {3D} Bodies Move},
  author = {Zhang, Yan and Black, Michael J. and Tang, Siyu},
  booktitle = {Proceedings IEEE/CVF Conf.~on Computer Vision and Pattern Recognition (CVPR)},
  month = jun,
  year = {2021},
  month_numeric = {6}
}

License

We employ CC BY-NC-SA 4.0 for the MOJO code, which covers

models/fittingop.py
experiments/utils/batch_gen_amass.py
experiments/utils/utils_canonicalize_amass.py
experiments/utils/utils_fitting_jts2mesh.py
experiments/utils/vislib.py
experiments/vis_*_amass.py

The rest part are developed based on DLow. According to their license, the implementation follows its CMU license.

Environment & code structure

  • Tested OS: Linux Ubuntu 18.04
  • Packages:
  • Note: All scripts should be run from the root of this repo to avoid path issues. Also, please fix some path configs in the code, otherwise errors will occur.

Training

The training is split to two steps. Provided we have a config file in experiments/cfg/amass_mojo_f9_nsamp50.yml, we can do

  • python experiments/train_MOJO_vae.py --cfg amass_mojo_f9_nsamp50 to train the MOJO
  • python experiments/train_MOJO_dlow.py --cfg amass_mojo_f9_nsamp50 to train the DLow

Evaluation

These experiments/eval_*.py files are for evaluation. For eval_*_pred.py, they can be used either to evaluate the results while predicting, or to save results to a file for further evaluation and visualization. An example is python experiments/eval_kps_pred.py --cfg amass_mojo_f9_nsamp50 --mode vis, which is to save files to the folder results/amass_mojo_f9_nsamp50.

Generation

In MOJO, the recursive projection scheme is to get 3D bodies from markers and keep the body valid. The relevant implementation is mainly in models/fittingop.py and experiments/test_recursive_proj.py. An example to run is

python experiments/test_recursive_proj.py --cfg amass_mojo_f9_nsamp50 --testdata ACCAD --gpu_index 0

Datasets

In MOJO, we have used AMASS, Human3.6M, and HumanEva.

For Human3.6M and HumanEva, we follow the same pre-processing step as in DLow, VideoPose3D, and others. Please refer to their pages, e.g. this one, for details.

For AMASS, we perform canonicalization of motion sequences with our own procedures. The details are in experiments/utils_canonicalize_amass.py. We find this sequence canonicalization procedure is important. The canonicalized AMASS used in our work can be downloaded here, which includes the random sample names of ACCAD and BMLhandball used in our experiments about motion realism.

Models

For human body modeling, we employ the SMPL-X parametric body model. You need to follow their license and download. Based on SMPL-X, we can use the body joints or a sparse set of body mesh vertices (the body markers) to represent the body.

  • CMU It has 41 markers, the corresponding SMPL-X mesh vertex ID can be downloaded here.
  • SSM2 It has 64 markers, the corresponding SMPL-X mesh vertex ID can be downloaded here.
  • Joints We used 22 joints. No need to download, but just obtain them from the SMPL-X body model. See details in the code.

Our CVAE model configurations are in experiments/cfg. The pre-trained checkpoints can be downloaded here.

Related projects

  • AMASS: It unifies diverse motion capture data with the SMPL-H model, and provides a large-scale high-quality dataset. Its official codebase and tutorials are in this github repo.

  • GRAB: Most mocap data only contains the body motion. GRAB, however, provides high-quality data of human-object interactions. Besides capturing the body motion, the object motion and the hand-object contact are captured simultaneously. More demonstrations are in its github repo.

Acknowledgement & disclaimer

We thank Nima Ghorbani for the advice on the body marker setting and the {\bf AMASS} dataset. We thank Yinghao Huang, Cornelia K"{o}hler, Victoria Fern'{a}ndez Abrevaya, and Qianli Ma for proofreading. We thank Xinchen Yan and Ye Yuan for discussions on baseline methods. We thank Shaofei Wang and Siwei Zhang for their help with the user study and the presentation, respectively.

MJB has received research gift funds from Adobe, Intel, Nvidia, Facebook, and Amazon. While MJB is a part-time employee of Amazon, his research was performed solely at, and funded solely by, Max Planck. MJB has financial interests in Amazon Datagen Technologies, and Meshcapade GmbH.

Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5)

YOLOv5-GUI 🎉 YOLOv5算法(ver.6及ver.5)的Qt-GUI实现 🎉 Qt-GUI implementation of the YOLOv5 algorithm (ver.6 and ver.5). 基于YOLOv5的v5版本和v6版本及Javacr大佬的UI逻辑进行编写

EricFang 12 Dec 28, 2022
Download and preprocess popular sequential recommendation datasets

Sequential Recommendation Datasets This repository collects some commonly used sequential recommendation datasets in recent research papers and provid

125 Dec 06, 2022
PyTorch implementation of spectral graph ConvNets, NIPS’16

Graph ConvNets in PyTorch October 15, 2017 Xavier Bresson http://www.ntu.edu.sg/home/xbresson https://github.com/xbresson https://twitter.com/xbresson

Xavier Bresson 287 Jan 04, 2023
This repository is related to an Arabic tutorial, within the tutorial we discuss the common data structure and algorithms and their worst and best case for each, then implement the code using Python.

Data Structure and Algorithms with Python This repository is related to the Arabic tutorial here, within the tutorial we discuss the common data struc

Mohamed Ayman 33 Dec 02, 2022
A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

SlowFast A PyTorch implementation of SlowFast based on ICCV 2019 paper SlowFast Networks for Video Recognition. Requirements Anaconda PyTorch conda in

Hao Ren 8 Dec 23, 2022
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

880 Jan 07, 2023
Learning Chinese Character style with conditional GAN

zi2zi: Master Chinese Calligraphy with Conditional Adversarial Networks Introduction Learning eastern asian language typefaces with GAN. zi2zi(字到字, me

Yuchen Tian 2.2k Jan 02, 2023
Source code and data in paper "MDFEND: Multi-domain Fake News Detection (CIKM'21)"

MDFEND: Multi-domain Fake News Detection This is an official implementation for MDFEND: Multi-domain Fake News Detection which has been accepted by CI

Rich 40 Dec 18, 2022
TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1).

M1-tensorflow-benchmark TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1). I was initially testing if Tens

particle 2 Jan 05, 2022
A vanilla 3D face modeling on pose-invariant and multi-lightning image data

3D-Face-Modeling A vanilla 3D face modeling on pose-invariant and multi-lightning image data Table of Contents Background Install Usage Contributing B

Haochen Zhang 1 Mar 12, 2022
Computer Vision application in the web

Computer Vision application in the web Preview Usage Clone this repo git clone https://github.com/amineHY/WebApp-Computer-Vision-streamlit.git cd Web

Amine Hadj-Youcef. PhD 35 Dec 06, 2022
A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks A Research-oriented Federated Learning Library and Benchmark Platform

FedML-AI 175 Dec 01, 2022
[ICCV 2021 Oral] Deep Evidential Action Recognition

DEAR (Deep Evidential Action Recognition) Project | Paper & Supp Wentao Bao, Qi Yu, Yu Kong International Conference on Computer Vision (ICCV Oral), 2

Wentao Bao 80 Jan 03, 2023
PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-supervised ViT.

MAE for Self-supervised ViT Introduction This is an unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-sup

36 Oct 30, 2022
Mixed Transformer UNet for Medical Image Segmentation

MT-UNet Update 2022/01/05 By another round of training based on previous weights, our model also achieved a better performance on ACDC (91.61% DSC). W

dotman 92 Dec 25, 2022
Official Code for "Non-deep Networks"

Non-deep Networks arXiv:2110.07641 Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun Overview: Depth is the hallmark of DNNs. But more depth m

Ankit Goyal 567 Dec 12, 2022
Python implementation of O-OFDMNet, a deep learning-based optical OFDM system,

O-OFDMNet This includes Python implementation of O-OFDMNet, a deep learning-based optical OFDM system, which uses neural networks for signal processin

Thien Luong 4 Sep 09, 2022
Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds."

DeltaConv [Paper] [Project page] Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds" by Ru

98 Nov 26, 2022
ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN

ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN CVPR 2020 (Oral); Pose and Appearance Attributes Transfer;

Men Yifang 400 Dec 29, 2022
Offical implementation for "Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation".

Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation (NeurIPS 2021) by Qiming Hu, Xiaojie Guo. Dependencies P

Qiming Hu 31 Dec 20, 2022