UDP++ (ECCVW 2020 Oral), (Winner of COCO 2020 Keypoint Challenge).

Overview

UDP-Pose

This is the pytorch implementation for UDP++, which won the Fisrt place in COCO Keypoint Challenge at ECCV 2020 Workshop. Illustrating the performance of the proposed UDP

Top-Down

Results on MPII val dataset

Method--- Head Sho. Elb. Wri. Hip Kne. Ank. Mean Mean 0.1
HRNet32 97.1 95.9 90.3 86.5 89.1 87.1 83.3 90.3 37.7
+Dark 97.2 95.9 91.2 86.7 89.7 86.7 84.0 90.6 42.0
+UDP 97.4 96.0 91.0 86.5 89.1 86.6 83.3 90.4 42.1

Results on COCO val2017 with detector having human AP of 65.1 on COCO val2017 dataset

Arch Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L) AR
pose_resnet_50 256x192 34.0M 8.90 71.3 89.9 78.9 68.3 77.4 76.9
+UDP 256x192 34.2M 8.96 72.9 90.0 80.2 69.7 79.3 78.2
pose_resnet_50 384x288 34.0M 20.0 73.2 90.7 79.9 69.4 80.1 78.2
+UDP 384x288 34.2M 20.1 74.0 90.3 80.0 70.2 81.0 79.0
pose_resnet_152 256x192 68.6M 15.7 72.9 90.6 80.8 69.9 79.0 78.3
+UDP 256x192 68.8M 15.8 74.3 90.9 81.6 71.2 80.6 79.6
pose_resnet_152 384x288 68.6M 35.6 75.3 91.0 82.3 71.9 82.0 80.4
+UDP 384x288 68.8M 35.7 76.2 90.8 83.0 72.8 82.9 81.2
pose_hrnet_w32 256x192 28.5M 7.10 75.6 91.9 83.0 72.2 81.6 80.5
+UDP 256x192 28.7M 7.16 76.8 91.9 83.7 73.1 83.3 81.6
+UDPv1 256x192 28.7M 7.16 77.2 91.6 84.2 73.7 83.7 82.5
+UDPv1+AID 256x192 28.7M 7.16 77.9 92.1 84.5 74.1 84.1 82.8
RSN18+UDP 256x192 - 2.5 74.7 - - - - -
pose_hrnet_w32 384x288 28.5M 16.0 76.7 91.9 83.6 73.2 83.2 81.6
+UDP 384x288 28.7M 16.1 77.8 91.7 84.5 74.2 84.3 82.4
pose_hrnet_w48 256x192 63.6M 14.6 75.9 91.9 83.5 72.6 82.1 80.9
+UDP 256x192 63.8M 14.7 77.2 91.8 83.7 73.8 83.7 82.0
pose_hrnet_w48 384x288 63.6M 32.9 77.1 91.8 83.8 73.5 83.5 81.8
+UDP 384x288 63.8M 33.0 77.8 92.0 84.3 74.2 84.5 82.5

Note:

  • Flip test is used.
  • Person detector has person AP of 65.1 on COCO val2017 dataset.
  • GFLOPs is for convolution and linear layers only.
  • UDPv1: v0:LOSS.KPD=4.0, v1:LOSS.KPD=3.5

Results on COCO test-dev with detector having human AP of 65.1 on COCO val2017 dataset

Arch Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L) AR
pose_resnet_50 256x192 34.0M 8.90 70.2 90.9 78.3 67.1 75.9 75.8
+UDP 256x192 34.2M 8.96 71.7 91.1 79.6 68.6 77.5 77.2
pose_resnet_50 384x288 34.0M 20.0 71.3 91.0 78.5 67.3 77.9 76.6
+UDP 384x288 34.2M 20.1 72.5 91.1 79.7 68.8 79.1 77.9
pose_resnet_152 256x192 68.6M 15.7 71.9 91.4 80.1 68.9 77.4 77.5
+UDP 256x192 68.8M 15.8 72.9 91.6 80.9 70.0 78.5 78.4
pose_resnet_152 384x288 68.6M 35.6 73.8 91.7 81.2 70.3 80.0 79.1
+UDP 384x288 68.8M 35.7 74.7 91.8 82.1 71.5 80.8 80.0
pose_hrnet_w32 256x192 28.5M 7.10 73.5 92.2 82.0 70.4 79.0 79.0
+UDP 256x192 28.7M 7.16 75.2 92.4 82.9 72.0 80.8 80.4
pose_hrnet_w32 384x288 28.5M 16.0 74.9 92.5 82.8 71.3 80.9 80.1
+UDP 384x288 28.7M 16.1 76.1 92.5 83.5 72.8 82.0 81.3
pose_hrnet_w48 256x192 63.6M 14.6 74.3 92.4 82.6 71.2 79.6 79.7
+UDP 256x192 63.8M 14.7 75.7 92.4 83.3 72.5 81.4 80.9
pose_hrnet_w48 384x288 63.6M 32.9 75.5 92.5 83.3 71.9 81.5 80.5
+UDP 384x288 63.8M 33.0 76.5 92.7 84.0 73.0 82.4 81.6

Note:

  • Flip test is used.
  • Person detector has person AP of 65.1 on COCO val2017 dataset.
  • GFLOPs is for convolution and linear layers only.

Bottom-Up

HRNet

Arch P2I Input size Speed(task/s) AP Ap .5 AP .75 AP (M) AP (L) AR
HRNet(ori) T 512x512 - 64.4 - - 57.1 75.6 -
HRNet(mmpose) F 512x512 39.5 65.8 86.3 71.8 59.2 76.0 70.7
HRNet(mmpose) T 512x512 6.8 65.3 86.2 71.5 58.6 75.7 70.9
HRNet+UDP T 512x512 5.8 65.9 86.2 71.8 59.4 76.0 71.4
HRNet+UDP F 512x512 37.2 67.0 86.2 72.0 60.7 76.7 71.6
HRNet+UDP+AID F 512x512 37.2 68.4 88.1 74.9 62.7 77.1 73.0

HigherHRNet

Arch P2I Input size Speed(task/s) AP Ap .5 AP .75 AP (M) AP (L) AR
HigherHRNet(ori) T 512x512 - 67.1 - - 61.5 76.1 -
HigherHRNet T 512x512 9.4 67.2 86.1 72.9 61.8 76.1 72.2
HigherHRNet+UDP T 512x512 9.0 67.6 86.1 73.7 62.2 76.2 72.4
HigherHRNet F 512x512 24.1 67.1 86.1 73.6 61.7 75.9 72.0
HigherHRNet+UDP F 512x512 23.0 67.6 86.2 73.8 62.2 76.2 72.4
HigherHRNet+UDP+AID F 512x512 23.0 69.0 88.0 74.9 64.0 76.9 73.8

Note:

  • ori : Result from original HigherHrnet
  • mmpose : Pretrained models from mmpose
  • P2I : PROJECT2IMAGE
  • we use mmpose for codebase
  • the configurations of the baseline are HRNet-W32-512x512-batch16-lr0.001
  • Speed is tested with dist_test in mmpose codebase and 8 Gpus + 16 batchsize

Quick Start

(Recommend) For mmpose, please refer to MMPose

For hrnet, please refer to Hrnet

For RSN, please refer to RSN

Data preparation For coco, we provide the human detection result and pretrained model at BaiduDisk(dsa9)

Citation

If you use our code or models in your research, please cite with:

@inproceedings{cai2020learning,
  title={Learning Delicate Local Representations for Multi-Person Pose Estimation},
  author={Yuanhao Cai and Zhicheng Wang and Zhengxiong Luo and Binyi Yin and Angang Du and Haoqian Wang and Xinyu Zhou and Erjin Zhou and Xiangyu Zhang and Jian Sun},
  booktitle={ECCV},
  year={2020}
}
@article{huang2020joint,
  title={Joint coco and lvis workshop at eccv 2020: Coco keypoint challenge track technical report: Udp+},
  author={Huang, Junjie and Shan, Zengguang and Cai, Yuanhao and Guo, Feng and Ye, Yun and Chen, Xinze and Zhu, Zheng and Huang, Guan and Lu, Jiwen and Du, Dalong},
  year={2020}
}
Owner
Tsinghua University, Megvii Inc [email protected]
Some methods for comparing network representations in deep learning and neuroscience.

Generalized Shape Metrics on Neural Representations In neuroscience and in deep learning, quantifying the (dis)similarity of neural representations ac

Alex Williams 45 Dec 27, 2022
Video Frame Interpolation without Temporal Priors (a general method for blurry video interpolation)

Video Frame Interpolation without Temporal Priors (NeurIPS2020) [Paper] [video] How to run Prerequisites NVIDIA GPU + CUDA 9.0 + CuDNN 7.6.5 Pytorch 1

YoujianZhang 31 Sep 04, 2022
Proposed n-stage Latent Dirichlet Allocation method - A Novel Approach for LDA

n-stage Latent Dirichlet Allocation (n-LDA) Proposed n-LDA & A Novel Approach for classical LDA Latent Dirichlet Allocation (LDA) is a generative prob

Anıl Güven 4 Mar 07, 2022
Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

Ramana Sundararaman 76 Dec 06, 2022
CoINN: Correlated-informed neural networks: a new machine learning framework to predict pressure drop in micro-channels

CoINN: Correlated-informed neural networks: a new machine learning framework to predict pressure drop in micro-channels Accurate pressure drop estimat

Alejandro Montanez 0 Jan 21, 2022
A research toolkit for particle swarm optimization in Python

PySwarms is an extensible research toolkit for particle swarm optimization (PSO) in Python. It is intended for swarm intelligence researchers, practit

Lj Miranda 1k Dec 30, 2022
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

Tao Ruijie 277 Dec 31, 2022
Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

WangRui 14 Sep 14, 2022
Bringing Characters to Life with Computer Brains in Unity

AI4Animation: Deep Learning for Character Control This project explores the opportunities of deep learning for character animation and control as part

Sebastian Starke 5.5k Jan 04, 2023
DeepRec is a recommendation engine based on TensorFlow.

DeepRec Introduction DeepRec is a recommendation engine based on TensorFlow 1.15, Intel-TensorFlow and NVIDIA-TensorFlow. Background Sparse model is a

Alibaba 676 Jan 03, 2023
A library for Deep Learning Implementations and utils

deeply A Deep Learning library Table of Contents Features Quick Start Usage License Features Python 2.7+ and Python 3.4+ compatible. Quick Start $ pip

Achilles Rasquinha 1 Dec 12, 2022
Age and Gender prediction using Keras

cnn_age_gender Age and Gender prediction using Keras Dataset example : Description : UTKFace dataset is a large-scale face dataset with long age span

XN3UR0N 58 May 03, 2022
Codes for NAACL 2021 Paper "Unsupervised Multi-hop Question Answering by Question Generation"

Unsupervised-Multi-hop-QA This repository contains code and models for the paper: Unsupervised Multi-hop Question Answering by Question Generation (NA

Liangming Pan 70 Nov 27, 2022
Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".

:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

Amirsina Torfi 114 Dec 18, 2022
Implementation for Homogeneous Unbalanced Regularized Optimal Transport

HUROT: An Homogeneous formulation of Unbalanced Regularized Optimal Transport. This repository provides code related to this preprint. This is an alph

Théo Lacombe 1 Feb 17, 2022
Rank 3 : Source code for OPPO 6G Data Generation Challenge

OPPO 6G Data Generation with an E2E Framework Homepage of OPPO 6G Data Generation Challenge Datasets H1_32T4R.mat H2_32T4R.mat Please put the original

Sen Pei 97 Jan 07, 2023
A solution to the 2D Ising model of ferromagnetism, implemented using the Metropolis algorithm

Solving the Ising model on a 2D lattice using the Metropolis Algorithm Introduction The Ising model is a simplified model of ferromagnetism, the pheno

Rohit Prabhu 5 Nov 13, 2022
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation This the repository for this paper. Find extensions of this w

Zhuoyuan Mao 14 Oct 26, 2022
Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, Daniel Silva, Andrew McCallum, Amr Ahmed. KDD 2019.

gHHC Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, D

Nicholas Monath 35 Nov 16, 2022
Optimized Gillespie algorithm for simulating Stochastic sPAtial models of Cancer Evolution (OG-SPACE)

OG-SPACE Introduction Optimized Gillespie algorithm for simulating Stochastic sPAtial models of Cancer Evolution (OG-SPACE) is a computational framewo

Data and Computational Biology Group UNIMIB (was BI*oinformatics MI*lan B*icocca) 0 Nov 17, 2021