PyTorch implementation for 3D human pose estimation

Overview

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach

This repository is the PyTorch implementation for the network presented in:

Xingyi Zhou, Qixing Huang, Xiao Sun, Xiangyang Xue, Yichen Wei, Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach ICCV 2017 (arXiv:1704.02447)

Note: This repository has been updated and is different from the method discribed in the paper. To fully reproduce the results in the paper, please checkout the original torch implementation or our pytorch re-implementation branch (slightly worse than torch). We also provide a clean 2D hourglass network branch.

The updates include:

  • Change network backbone to ResNet50 with deconvolution layers (Xiao et al. ECCV2018). Training is now about 3x faster than the original hourglass net backbone (but no significant performance improvement).
  • Change the depth regression sub-network to a one-layer depth map (described in our StarMap project).
  • Change the Human3.6M dataset to official release in ECCV18 challenge.
  • Update from python 2.7 and pytorch 0.1.12 to python 3.6 and pytorch 0.4.1.

Contact: [email protected]

Installation

The code was tested with Anaconda Python 3.6 and PyTorch v0.4.1. After install Anaconda and Pytorch:

  1. Clone the repo:

    POSE_ROOT=/path/to/clone/pytorch-pose-hg-3d
    git clone https://github.com/xingyizhou/pytorch-pose-hg-3d POSE_ROOT
    
  2. Install dependencies (opencv, and progressbar):

    conda install --channel https://conda.anaconda.org/menpo opencv
    conda install --channel https://conda.anaconda.org/auto progress
    
  3. Disable cudnn for batch_norm (see issue):

    # PYTORCH=/path/to/pytorch
    # for pytorch v0.4.0
    sed -i "1194s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
    # for pytorch v0.4.1
    sed -i "1254s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
    
  4. Optionally, install tensorboard for visializing training.

    pip install tensorflow
    

Demo

  • Download our pre-trained model and move it to models.
  • Run python demo.py --demo /path/to/image/or/image/folder [--gpus -1] [--load_model /path/to/model].

--gpus -1 is for CPU mode. We provide example images in images/. For testing your own image, it is important that the person should be at the center of the image and most of the body parts should be within the image.

Benchmark Testing

To test our model on Human3.6 dataset run

python main.py --exp_id test --task human3d --dataset fusion_3d --load_model ../models/fusion_3d_var.pth --test --full_test

The expected results should be 64.55mm.

Training

  • Prepare the training data:

    ${POSE_ROOT}
    |-- data
    `-- |-- mpii
        `-- |-- annot
            |   |-- train.json
            |   |-- valid.json
            `-- images
                |-- 000001163.jpg
                |-- 000003072.jpg
    `-- |-- h36m
        `-- |-- ECCV18_Challenge
            |   |-- Train
            |   |-- Val
            `-- msra_cache
                `-- |-- HM36_eccv_challenge_Train_cache
                    |   |-- HM36_eccv_challenge_Train_w288xh384_keypoint_jnt_bbox_db.pkl
                    `-- HM36_eccv_challenge_Val_cache
                        |-- HM36_eccv_challenge_Val_w288xh384_keypoint_jnt_bbox_db.pkl
    
  • Stage1: Train 2D pose only. model, log

python main.py --exp_id mpii
  • Stage2: Train on 2D and 3D data without geometry loss (drop LR at 45 epochs). model, log
python main.py --exp_id fusion_3d --task human3d --dataset fusion_3d --ratio_3d 1 --weight_3d 0.1 --load_model ../exp/mpii/model_last.pth --num_epoch 60 --lr_step 45
  • Stage3: Train with geometry loss. model, log
python main.py --exp_id fusion_3d_var --task human3d --dataset fusion_3d --ratio_3d 1 --weight_3d 0.1 --weight_var 0.01 --load_model ../models/fusion_3d.pth  --num_epoch 10 --lr 1e-4

Citation

@InProceedings{Zhou_2017_ICCV,
author = {Zhou, Xingyi and Huang, Qixing and Sun, Xiao and Xue, Xiangyang and Wei, Yichen},
title = {Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}
Owner
Xingyi Zhou
CS Ph.D. student at UT Austin.
Xingyi Zhou
Organseg dags - The repository contains the codebase for multi-organ segmentation with directed acyclic graphs (DAGs) in CT.

Organseg dags - The repository contains the codebase for multi-organ segmentation with directed acyclic graphs (DAGs) in CT.

yzf 1 Jun 12, 2022
Making self-supervised learning work on molecules by using their 3D geometry to pre-train GNNs. Implemented in DGL and Pytorch Geometric.

3D Infomax improves GNNs for Molecular Property Prediction Video | Paper We pre-train GNNs to understand the geometry of molecules given only their 2D

Hannes Stärk 95 Dec 30, 2022
Text2Art is an AI art generator powered with VQGAN + CLIP and CLIPDrawer models

Text2Art is an AI art generator powered with VQGAN + CLIP and CLIPDrawer models. You can easily generate all kind of art from drawing, painting, sketch, or even a specific artist style just using a t

Muhammad Fathy Rashad 643 Dec 30, 2022
Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic

NAVER/LINE Vision 30 Dec 06, 2022
Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Multi-modal Interaction Graph Convolutioal Network for Temporal Language Localization in Videos Official implementation for Multi-Modal Interaction Gr

Zongmeng Zhang 15 Oct 18, 2022
Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Dual-task Pose Transformer Network The source code for our paper "Exploring Dual-task Correlation for Pose Guided Person Image Generation“ (CVPR2022)

63 Dec 15, 2022
PyTorch Implementation of DSB for Score Based Generative Modeling. Experiments managed using Hydra.

Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling This repository contains the implementation for the paper Diffusion

James Thornton 50 Jan 03, 2023
This is the code used in the paper "Entity Embeddings of Categorical Variables".

This is the code used in the paper "Entity Embeddings of Categorical Variables". If you want to get the original version of the code used for the Kagg

Cheng Guo 845 Nov 29, 2022
Colab notebook and additional materials for Python-driven analysis of redlining data in Philadelphia

RedliningExploration The Google Colaboratory file contained in this repository contains work inspired by a project on educational inequality in the Ph

Benjamin Warren 1 Jan 20, 2022
Library for converting from RGB / GrayScale image to base64 and back.

Library for converting RGB / Grayscale numpy images from to base64 and back. Installation pip install -U image_to_base_64 Conversion RGB to base 64 b

Vladimir Iglovikov 16 Aug 28, 2022
PyTorch Implementation for Fracture Detection in Wrist Bone X-ray Images

wrist-d PyTorch Implementation for Fracture Detection in Wrist Bone X-ray Images note: Paper: Under Review at MPDI Diagnostics Submission Date: Novemb

Fatih UYSAL 5 Oct 12, 2022
Analysis of Antarctica sequencing samples contaminated with SARS-CoV-2

Analysis of SARS-CoV-2 reads in sequencing of 2018-2019 Antarctica samples in PRJNA692319 The samples analyzed here are described in this preprint, wh

Jesse Bloom 4 Feb 09, 2022
A fast and easy to use, moddable, Python based Minecraft server!

PyMine PyMine - The fastest, easiest to use, Python-based Minecraft Server! Features Note: This list is not always up to date, and doesn't contain all

PyMine 144 Dec 30, 2022
Code for the paper "Graph Attention Tracking". (CVPR2021)

SiamGAT 1. Environment setup This code has been tested on Ubuntu 16.04, Python 3.5, Pytorch 1.2.0, CUDA 9.0. Please install related libraries before r

122 Dec 24, 2022
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Pretrained Language Model This repository provides the latest pretrained language models and its related optimization techniques developed by Huawei N

HUAWEI Noah's Ark Lab 2.6k Jan 01, 2023
Knowledge Distillation Toolbox for Semantic Segmentation

SegDistill: Toolbox for Knowledge Distillation on Semantic Segmentation Networks This repo contains the supported code and configuration files for Seg

9 Dec 12, 2022
MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images

MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images This repository contains the implementation of our paper MetaAvatar: Learni

sfwang 96 Dec 13, 2022
Graph parsing approach to structured sentiment analysis.

Fine-grained Sentiment Analysis as Dependency Graph Parsing This repository contains the code and datasets described in following paper: Fine-grained

Jeremy Barnes 36 Dec 12, 2022
A Convolutional Transformer for Keyword Spotting

☢️ Audiomer ☢️ Audiomer: A Convolutional Transformer for Keyword Spotting [ arXiv ] [ Previous SOTA ] [ Model Architecture ] Results on SpeechCommands

49 Jan 27, 2022
Official pytorch code for SSC-GAN: Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation(ICCV 2021)

SSC-GAN_repo Pytorch implementation for 'Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation'.PDF SSC-GAN:Sem

tyty 4 Aug 28, 2022