This is an official implementation for "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"

Overview

DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation

This repo is the official implementation of "DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation". [Paper] [Project]

Update

  • Clean version is released! It currently includes code, data, log and models for the following tasks:
  • 2D human pose estimation
  • 3D human pose estimation
  • Body recovery via a SMPL model

TODO

  • Provide different sample interval checkpoints/logs
  • Add DeciWatch in MMHuman3D

Description

This paper proposes a simple baseline framework for video-based 2D/3D human pose estimation that can achieve 10 times efficiency improvement over existing works without any performance degradation, named DeciWatch. Unlike current solutions that estimate each frame in a video, DeciWatch introduces a simple yet effective sample-denoise-recover framework that only watches sparsely sampled frames, taking advantage of the continuity of human motions and the lightweight pose representation. Specifically, DeciWatch uniformly samples less than 10% video frames for detailed estimation, denoises the estimated 2D/3D poses with an efficient Transformer architecture, and then accurately recovers the rest of the frames using another Transformer-based network. Comprehensive experimental results on three video-based human pose estimation, body mesh recovery tasks and efficient labeling in videos with four datasets validate the efficiency and effectiveness of DeciWatch.

Getting Started

Environment Requirement

DeciWatch has been implemented and tested on Pytorch 1.10.1 with python >= 3.6. It supports both GPU and CPU inference.

Clone the repo:

git clone https://github.com/cure-lab/DeciWatch.git

We recommend you install the requirements using conda:

# conda
source scripts/install_conda.sh

Prepare Data

All the data used in our experiment can be downloaded here.

Google Drive

Baidu Netdisk

Valid data includes:

Dataset Pose Estimator 3D Pose 2D Pose SMPL
Sub-JHMDB SimplePose
3DPW EFT
3DPW PARE
3DPW SPIN
Human3.6M FCN
AIST++ SPIN

Please refer to doc/data.md for detailed data information and data preparing.

Training

Run the commands below to start training:

python train.py --cfg [config file] --dataset_name [dataset name] --estimator [backbone estimator you use] --body_representation [smpl/3D/2D] --sample_interval [sample interval N]

For example, you can train on 3D representation of 3DPW using backbone estimator SPIN with sample interval 10 by:

python train.py --cfg configs/config_pw3d_spin.yaml --dataset_name pw3d --estimator spin --body_representation 3D --sample_interval 10

Note that the training and testing datasets should be downloaded and prepared before training.

You may refer to doc/training.md for more training details.

Evaluation

Results on 2D Pose

Dataset Estimator PCK 0.05 (INPUT/OUTPUT) PCK 0.1 (INPUT/OUTPUT) PCK 0.2 (INPUT/OUTPUT) Download
Sub-JHMDB simplepose 57.30%/79.32% 81.61%/94.27% 93.94%/98.85% Baidu Netdisk / Google Drive

Results on 3D Pose

Dataset Estimator MPJPE (INPUT/OUTPUT) Accel (INPUT/OUTPUT) Download
3DPW SPIN 96.92/93.34 34.68/7.06 Baidu Netdisk / Google Drive
3DPW EFT 90.34/89.02 32.83/6.84 Baidu Netdisk / Google Drive
3DPW PARE 78.98/77.16 25.75/6.90 Baidu Netdisk / Google Drive
AIST++ SPIN 107.26/71.27 33.37/5.68 Baidu Netdisk / Google Drive
Human3.6M FCN 54.56/52.83 19.18/1.47 Baidu Netdisk / Google Drive

Results on SMPL

Dataset Estimator MPJPE (INPUT/OUTPUT) Accel (INPUT/OUTPUT) MPVPE (INPUT/OUTPUT) Download
3DPW SPIN 100.13/97.53 35.53/8.38 114.39/112.84 Baidu Netdisk / Google Drive
3DPW EFT 91.60/92.56 33.57/8.7 5 110.34/109.27 Baidu Netdisk / Google Drive
3DPW PARE 80.44/81.76 26.77/7.24 94.88/95.68 Baidu Netdisk / Google Drive
AIST++ SPIN 108.25/82.10 33.83/7.27 137.51/106.08 Baidu Netdisk / Google Drive

Noted that although our main contribution is the efficiency improvement, using DeciWatch as post processing is also helpful for accuracy and smoothness improvement.

You may refer to doc/evaluate.md for evaluate details.

Quick Demo

Run the commands below to visualize demo:

python demo.py --cfg [config file] --dataset_name [dataset name] --estimator [backbone estimator you use] --body_representation [smpl/3D/2D] --sample_interval [sample interval N]

You are supposed to put corresponding images with the data structure:

|-- data
    |-- videos
        |-- pw3d 
            |-- downtown_enterShop_00
                |-- image_00000.jpg
                |-- ...
            |-- ...
        |-- jhmdb
            |-- catch
            |-- ...
        |-- aist
            |-- gWA_sFM_c01_d27_mWA2_ch21.mp4
            |-- ...
        |-- ...

For example, you can train on 3D representation of 3DPW using backbone estimator SPIN with sample interval 10 by:

python demo.py --cfg configs/config_pw3d_spin.yaml --dataset_name pw3d --estimator spin --body_representation 3D --sample_interval 10

Please refer to the dataset website for the raw images. You may change the config in lib/core/config.py for different visualization parameters.

You may refer to doc/visualize.md for visualization details.

Citing DeciWatch

If you find this repository useful for your work, please consider citing it as follows:

@article{zeng2022deciwatch,
  title={DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation},
  author={Zeng, Ailing and Ju, Xuan and Yang, Lei and Gao, Ruiyuan and Zhu, Xizhou and Dai, Bo and Xu, Qiang},
  journal={arXiv preprint arXiv:2203.08713},
  year={2022}
}

Please remember to cite all the datasets and backbone estimators if you use them in your experiments.

Acknowledgement

Many thanks to Xuan Ju for her great efforts to clean almost the original code!!!

License

This code is available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using this code you agree to the terms in the LICENSE. Third-party datasets and software are subject to their respective licenses.

Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

68 Dec 14, 2022
This is a demo app to be used in the video streaming applications

MoViDNN: A Mobile Platform for Evaluating Video Quality Enhancement with Deep Neural Networks MoViDNN is an Android application that can be used to ev

ATHENA Christian Doppler (CD) Laboratory 7 Jul 21, 2022
HackBMU-5.0-Team-Ctrl-Alt-Elite - HackBMU 5.0 Team Ctrl Alt Elite

HackBMU-5.0-Team-Ctrl-Alt-Elite The search is over. We present to you ‘Health-A-

3 Feb 19, 2022
Convert BART models to ONNX with quantization. 3X reduction in size, and upto 3X boost in inference speed

fast-Bart Reduction of BART model size by 3X, and boost in inference speed up to 3X BART implementation of the fastT5 library (https://github.com/Ki6a

Siddharth Sharma 19 Dec 09, 2022
Implementation for the "Surface Reconstruction from 3D Line Segments" paper.

Surface Reconstruction from 3D Line Segments Surface reconstruction from 3d line segments. Langlois, P. A., Boulch, A., & Marlet, R. In 2019 Internati

85 Jan 04, 2023
Python scripts for performing stereo depth estimation using the HITNET Tensorflow model.

HITNET-Stereo-Depth-estimation Python scripts for performing stereo depth estimation using the HITNET Tensorflow model from Google Research. Stereo de

Ibai Gorordo 76 Jan 02, 2023
Implementation of our NeurIPS 2021 paper "A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs".

PPO-BiHyb This is the official implementation of our NeurIPS 2021 paper "A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Grap

<a href=[email protected]"> 66 Nov 23, 2022
Official implementation of NeurIPS 2021 paper "One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective"

Official implementation of NeurIPS 2021 paper "One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective"

Ng Kam Woh 71 Dec 22, 2022
An efficient PyTorch library for Global Wheat Detection using YOLOv5. The project is based on this Kaggle competition Global Wheat Detection (2021).

Global-Wheat-Detection An efficient PyTorch library for Global Wheat Detection using YOLOv5. The project is based on this Kaggle competition Global Wh

Chuxin Wang 11 Sep 25, 2022
Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Scribble-Supervised LiDAR Semantic Segmentation Dataset and code release for the paper Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORA

102 Dec 25, 2022
Codes for [NeurIPS'21] You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership.

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership Codes for [NeurIPS'21] You are caught stealing my winni

VITA 8 Nov 01, 2022
A repo with study material, exercises, examples, etc for Devnet SPAUTO

MPLS in the SDN Era -- DevNet SPAUTO Get right to the study material: Checkout the Wiki! A lab topology based on MPLS in the SDN era book used for 30

Hugo Tinoco 67 Nov 16, 2022
Code for: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification Prerequisite PyTorch = 1.2.0 Python3 torch

16 Dec 14, 2022
GEP (GDB Enhanced Prompt) - a GDB plug-in for GDB command prompt with fzf history search, fish-like autosuggestions, auto-completion with floating window, partial string matching in history, and more!

GEP (GDB Enhanced Prompt) GEP (GDB Enhanced Prompt) is a GDB plug-in which make your GDB command prompt more convenient and flexibility. Why I need th

Alan Li 23 Dec 21, 2022
Setup and customize deep learning environment in seconds.

Deepo is a series of Docker images that allows you to quickly set up your deep learning research environment supports almost all commonly used deep le

Ming 6.3k Jan 06, 2023
A very impractical 3D rendering engine that runs in the python terminal.

Terminal-3D-Render A very impractical 3D rendering engine that runs in the python terminal. do NOT try to run this program using the standard python I

23 Dec 31, 2022
Learning Skeletal Articulations with Neural Blend Shapes

This repository provides an end-to-end library for automatic character rigging and blend shapes generation as well as a visualization tool. It is based on our work Learning Skeletal Articulations wit

Peizhuo 504 Dec 30, 2022
This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Wide-Networks This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameteri

Karl Hajjar 0 Nov 02, 2021
Prevent `CUDA error: out of memory` in just 1 line of code.

🐨 Koila Koila solves CUDA error: out of memory error painlessly. Fix it with just one line of code, and forget it. 🚀 Features 🙅 Prevents CUDA error

RenChu Wang 1.7k Jan 02, 2023
Implementation of algorithms for continuous control (DDPG and NAF).

DEPRECATION This repository is deprecated and is no longer maintaned. Please see a more recent implementation of RL for continuous control at jax-sac.

Ilya Kostrikov 288 Dec 31, 2022