3D Human Pose Machines with Self-supervised Learning

Overview

3D Human Pose Machines with Self-supervised Learning

Keze Wang, Liang Lin, Chenhan Jiang, Chen Qian, and Pengxu Wei, “3D Human Pose Machines with Self-supervised Learning”. To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2019.

This repository implements a 3D human pose machine to resolve 3D pose sequence generation for monocular frames, and includes a concise self-supervised correction mechanism to enhance our model by retaining the 3D geometric consistency. The main part is written in C++ and powered by Caffe deep learning toolbox. Another is written in Python and powered by Tensorflow.

Results

We proposed results on the Human3.6M, KTH Football II and MPII dataset.

   

   

   

License

This project is Only released for Academic Research Use.

Get Started

Clone the repo:

git clone https://github.com/chanyn/3Dpose_ssl.git

or directly download from https://www.dropbox.com/s/qycpjinof2ishw9/3Dpose_ssl.tar.gz?dl=0 (including datasets and well-compiled caffe under cuda-8.0)

Our code is organized as follows:

caffe-3dssl/: support caffe
models/: pretrained models and results
prototxt/: network architecture definitions
tensorflow/: code for online refine 
test/: script that run results split by action 
tools/: python and matlab code 

Requirements

  1. NVIDIA GPU and cuDNN are required to have fast speeds. For now, CUDA 8.0 with cuDNN 5.1 has been tested. The other versions should be working.
  2. Caffe Python wrapper is required.
  3. Tensorflow 1.1.0
  4. python 2.7.13
  5. MATLAB
  6. Opencv-python

Installation

  1. Build 3Dssl Caffe

       cd $ROOT/caffe-3dssl    # Follow the Caffe installation instructions here:    #   http://caffe.berkeleyvision.org/installation.html        # If you're experienced with Caffe and have all of the requirements installed    # and your Makefile.config in place, then simply do:    make all -j 8        make pycaffe    

  1. Install Tensorflow

Datasets

  • Human3.6m

  We change annotation of Human3.6m to hold 16 points ( 'RFoot' 'RKnee' 'RHip' 'LHip' 'LKnee' 'LFoot' 'Hip' 'Spine' 'Thorax' 'Head' 'RWrist' 'RElbow'  'RShoulder' 'LShoulder' 'LElbow' 'LWrist') in keeping with MPII.

  We have provided count mean file and protocol #I & protocol #III split list of Human3.6m. Follow Human3.6m website to download videos and API. We split each video per 5 frames, you can directly download processed square data in this link.  And list format of 16skel_train/test_* is [img_path] [P12dx, P12dy, P22dx, P22dy,..., P13dx, P13dy, P13dz, P23dx, P23dy, P23dz,...] clip. Clip = 0 denote reset lstm.

  shell   # files construction   h36m   |_gt # 2d and 3d annotations splited by actions   |_hg2dh36m # 2d estimation predicted by *Hourglass*, 'square' denotes prediction of square image.   |_ours_2d # 2d prediction from our model   |_ours_3d # 3d coarse prediction of *Model Extension: mask3d*   |_16skel_train_2d3d_clip.txt # train list of *Protocol I*   |_16skel_test_2d3d_clip.txt   |_16skel_train_2d3d_p3_clip.txt # train list of *Protocol III*   |_16skel_test_2d3d_p3_clip.txt   |_16point_mean_limb_scaled_max_min.csv #16 points normalize by (x-min) / (max-min)  

  After setting up Human3.6m dataset following its illustration and download the above training/testing list. You should update “root_folder” paths in CAFFE_ROOT/examples/.../*.prototxt for images and annotation director.

  • MPII

  We crop and square single person from  all images and update 2d annotation in train_h36m.txt (resort points according to order of Human3.6m points).

    mkdir data/MPII   cd data/MPII   wget -v https://drive.google.com/open?id=16gQJvf4wHLEconStLOh5Y7EzcnBUhoM-   tar -xzvf MPII_square.tar.gz   rm -f MPII_square.tar.gz  

 

Training

Offline Phase

Our model consists of two cascade modules, so the training phase can be divided into the following steps:

cd CAFFE_ROOT
  1. Pre-train the 2D pose sub-network with MPII. You can follow CPM or Hourglass or other 2D pose estimation method. We provide pretrained CPM-caffemodel. Please put it into CAFFE_ROOT/models/.

  2. Train 2D-to-3D pose transformer module with Human3.6M. And we fix the parameters of the 2D pose sub-network. The corresponding prototxt file is in examples/2D_to_3D/bilstm.prototxt.

       sh examples/2D_to_3D/train.sh    

  1. To train 3D-to-2D pose projector module, we fix the above module weights. And we need in the wild 2D Pose dataset to help training (we choose MPII).

   sh    sh examples/3D_to_2D/train.sh    

  1. Fine-tune the whole model jointly. We provide trained model and coarse prediction of Protocol I and Protocol III.

   sh    sh examples/finetune_whole/train.sh    

  1. Model extension: Add rand mask to relieve model bias. We provide corresponding model files in examples/mask3d.

   sh    sh examples/mask3d/train.sh    

Model Inference

3D-to-2D project module is initialized from the well-trained model, and they will be updated by minimizing the difference between the predicted 2D pose and projected 2D pose.

  shell   # Step1: Download the trained model   cd PROJECT_ROOT   mkdir models   cd models   wget -v https://drive.google.com/open?id=1dMuPuD_JdHuMIMapwE2DwgJ2IGK04xhQ   unzip model_extension_mask3d.zip   rm -r model_extension_mask3d.zip   cd ../     # Step2: save coarse 3D prediction   cd test   # change 'data_root' in test_human16.sh   # change 'root_folder' in template_16_merge.prototxt   # test_human16.sh [$1 deploy.prototxt] [$2 trained model] [$3 save dir] [$4 batchsize]   sh test_human16.sh . ../models/model_extension_mask3d/mask3d_iter_400000.caffemodel mask3d 5     # Step3: online refine 3D pose prediction   # protocal: 1/3 , default is 1   # pose2d: ours/hourglass/gt, default is ours   # coarse_3d: saved results in Sept2   python pred_v2.py --trained_model ../models/model_extension_mask3d/mask3d-400000.pkl --protocol 1 --data_dir /data/h36m/ --coarse_3d ../test/mask3d --save srr_results --pose2d hourglass  

 

  shell   # Maybe you want to predict 2d.   # The model we use to predict 2d pose is similar to our 3dpredict model without ssl module.   # Or you can use Hourglass(https://github.com/princeton-vl/pose-hg-demo) to predict 2d pose     # Step1.1: Download the trained merge model   cd PROJECT_ROOT   mkdir models && cd models   wget -v https://drive.google.com/open?id=19kTyttzUnm_1_7HEwoNKCXPP2QVo_zcK   unzip our2d.zip   rm -r our2d.zip   # move 2d prototxt to PROJECT_ROOT/test/   mv our2d/2d ../test/   cd ../     # Step1.2: save 2D prediction   cd test   # change 'data_root' in test_human16.sh   # change 'root_folder' in 2d/template_16_merge.prototxt   # test_human16.sh [$1 deploy.prototxt] [$2 trained model] [$3 save dir] [$4 batchsize]   sh test_human16.sh 2d/ ../models/our2d/2d_iter_800000.caffemodel our2d 5   # replace predict 2d pose in data dir or change data_dir in tensorflow/pred_v2.py   mv our2d /data/h36m/ours_2d/bilstm2d-p1-800000       # Step2 is same as above       # Step3: online refine 3D pose prediction   # protocal: 1/3 , default is 1   # pose2d: ours/hourglass/gt, default is ours   # coarse_3d: saved results in Sept2   python pred_v2.py --trained_model ../models/model_extension_mask3d/mask3d-400000.pkl --protocol 1 --data_dir /data/h36m/ --coarse_3d ../test/mask3d --save srr_results --pose2d ours  

 

  • Inference with yourself

  The only difference is that you should transfer caffemodel of 3D-to-2D project module to pkl file. We provide gen_refinepkl.py in tools/.

  sh   # Follow above Step1~2 to produce coarse 3d prediction and 2d pose.   # transfer caffemodel of SRR module to python .pkl file   python tools/gen_refinepkl.py CAFFE_ROOT CAFFEMODEL_DIR --pkl_dir model.pkl     # online refine 3D pose prediction   python pred_v2.py --trained_model model.pkl  

 

  • Evaluation

  shell   # Print MPJP   run tools/eval_h36m.m     # Visualization of 2dpose/ 3d gt pose/ 3d coarse pose/ 3d refine pose   # Please change data_root in visualization.m before running   run visualization.m  

Citation

@article{wang20193d,
  title={3D Human Pose Machines with Self-supervised Learning},
  author={Wang, Keze and Lin, Liang and Jiang, Chenhan and Qian, Chen and Wei, Pengxu},
  journal={IEEE transactions on pattern analysis and machine intelligence},
  year={2019},
  publisher={IEEE}
}
Owner
Chenhan Jiang
Chenhan Jiang
Simple tools for logging and visualizing, loading and training

TNT TNT is a library providing powerful dataloading, logging and visualization utilities for Python. It is closely integrated with PyTorch and is desi

1.5k Jan 02, 2023
This repository contains code for the paper "Decoupling Representation and Classifier for Long-Tailed Recognition", published at ICLR 2020

Classifier-Balancing This repository contains code for the paper: Decoupling Representation and Classifier for Long-Tailed Recognition Bingyi Kang, Sa

Facebook Research 820 Dec 26, 2022
PyTorch implementation of our paper How robust are discriminatively trained zero-shot learning models?

How robust are discriminatively trained zero-shot learning models? This repository contains the PyTorch implementation of our paper How robust are dis

Mehmet Kerim Yucel 5 Feb 04, 2022
This is the official implementation of TrivialAugment and a mini-library for the application of multiple image augmentation strategies including RandAugment and TrivialAugment.

Trivial Augment This is the official implementation of TrivialAugment (https://arxiv.org/abs/2103.10158), as was used for the paper. TrivialAugment is

AutoML-Freiburg-Hannover 94 Dec 30, 2022
Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy

Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy Simplex Algorithm is a popular algorithm for linear programmi

Reda BELHAJ 2 Oct 12, 2022
Official implementation of VQ-Diffusion

Vector Quantized Diffusion Model for Text-to-Image Synthesis Overview This is the official repo for the paper: [Vector Quantized Diffusion Model for T

Microsoft 592 Jan 03, 2023
PyTorch implementation of MLP-Mixer

PyTorch implementation of MLP-Mixer MLP-Mixer: an all-MLP architecture composed of alternate token-mixing and channel-mixing operations. The token-mix

Duo Li 33 Nov 27, 2022
Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning

Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning. Circuit Training is an open-s

Google Research 479 Dec 25, 2022
Autonomous Robots Kalman Filters

Autonomous Robots Kalman Filters The Kalman Filter is an easy topic. However, ma

20 Jul 18, 2022
Stochastic Extragradient: General Analysis and Improved Rates

Stochastic Extragradient: General Analysis and Improved Rates This repository is the official implementation of the paper "Stochastic Extragradient: G

Hugo Berard 4 Nov 11, 2022
Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

Density-aware Chamfer Distance This repository contains the official PyTorch implementation of our paper: Density-aware Chamfer Distance as a Comprehe

Tong WU 93 Dec 15, 2022
Attendance Monitoring with Face Recognition using Python

Attendance Monitoring with Face Recognition using Python A python GUI integrated attendance system using face recognition to take attendance. In this

Vaibhav Rajput 2 Jun 21, 2022
Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too

Nguyễn Trường Lâu 4 Nov 24, 2022
Analyzes your GitHub Profile and presents you with a report on how likely you are to become the next MLH Fellow!

Fellowship Prediction GitHub Profile Comparative Analysis Tool Built with BentoML Table of Contents: Features Disclaimer Technologies Used Contributin

Damir Temir 51 Dec 29, 2022
Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020)

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh

Arjun Majumdar 44 Dec 14, 2022
Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".

LOREN Resources for our AAAI 2022 paper (pre-print): "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification". DEMO System Check out o

Jiangjie Chen 37 Dec 27, 2022
Confident Semantic Ranking Loss for Part Parsing

Confident Semantic Ranking Loss for Part Parsing

Jiachen Xu 5 Oct 22, 2022
[ICCV 2021 Oral] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

This repository contains the source code for the paper SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer (ICCV 2021 Oral). The project page is here.

AllenXiang 65 Dec 26, 2022
Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective

Does-MAML-Only-Work-via-Feature-Re-use-A-Data-Set-Centric-Perspective Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective Installin

2 Nov 07, 2022
Customer Segmentation using RFM

Customer-Segmentation-using-RFM İş Problemi Bir e-ticaret şirketi müşterilerini segmentlere ayırıp bu segmentlere göre pazarlama stratejileri belirlem

Nazli Sener 7 Dec 26, 2021