KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

Overview

KAPAO (Keypoints and Poses as Objects)

KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as objects within a dense anchor-based detection framework. When not using test-time augmentation (TTA), KAPAO is much faster and more accurate than previous single-stage methods like DEKR and HigherHRNet:

alt text

This repository contains the official PyTorch implementation for the paper:
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation.

Our code was forked from ultralytics/yolov5 at commit 5487451.

Setup

  1. If you haven't already, install Anaconda or Miniconda.
  2. Create a new conda environment with Python 3.6: $ conda create -n kapao python=3.6.
  3. Activate the environment: $ conda activate kapao
  4. Clone this repo: $ git clone https://github.com/wmcnally/kapao.git
  5. Install the dependencies: $ cd kapao && pip install -r requirements.txt
  6. Download the trained models: $ sh data/scripts/download_models.sh

Inference Demos

Note: FPS calculations includes all processing, including inference, plotting / tracking, image resizing, etc. See demo script arguments for inference options.

Flash Mob Demo

This demo runs inference on a 720p dance video (native frame-rate of 25 FPS).

alt text

To display the inference results in real-time:
$ python demos/flash_mob.py --weights kapao_s_coco.pt --display --fps

To create the GIF above:
$ python demos/flash_mob.py --weights kapao_s_coco.pt --start 188 --end 196 --gif --fps

Squash Demo

This demo runs inference on a 1080p slow motion squash video (native frame-rate of 25 FPS). It uses a simple player tracking algorithm based on the frame-to-frame pose differences.

alt text

To display the inference results in real-time:
$ python demos/squash.py --weights kapao_s_coco.pt --display --fps

To create the GIF above:
$ python demos/squash.py --weights kapao_s_coco.pt --start 42 --end 50 --gif --fps

COCO Experiments

Download the COCO dataset: $ sh data/scripts/get_coco_kp.sh

Validation (without TTA)

  • KAPAO-S (63.0 AP): $ python val.py --rect
  • KAPAO-M (68.5 AP): $ python val.py --rect --weights kapao_m_coco.pt
  • KAPAO-L (70.6 AP): $ python val.py --rect --weights kapao_l_coco.pt

Validation (with TTA)

  • KAPAO-S (64.3 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (69.6 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (71.6 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1

Testing

  • KAPAO-S (63.8 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-M (68.8 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-L (70.3 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/s_e500 \
--name train \
--workers 128

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/m_e500 \
--name train \
--workers 128

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/l_e500 \
--name train \
--workers 128

Note: DDP is usually recommended but we found training was less stable for KAPAO-M/L using DDP. We are investigating this issue.

CrowdPose Experiments

  • Install the CrowdPose API to your conda environment:
    $ cd .. && git clone https://github.com/Jeff-sjtu/CrowdPose.git
    $ cd CrowdPose/crowdpose-api/PythonAPI && sh install.sh && cd ../../../kapao
  • Download the CrowdPose dataset: $ sh data/scripts/get_crowdpose.sh

Testing

  • KAPAO-S (63.8 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_s_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (67.1 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_m_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (68.9 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_l_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each. Training was performed on the trainval split with no validation. The test results above were generated using the last model checkpoint.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/cp_s_e300 \
--name train \
--workers 128 \
--noval

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 300 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/cp_m_e300 \
--name train \
--workers 128 \
--noval

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/cp_l_e300 \
--name train \
--workers 128 \
--noval

Acknowledgements

This work was supported in part by Compute Canada, the Canada Research Chairs Program, the Natural Sciences and Engineering Research Council of Canada, a Microsoft Azure Grant, and an NVIDIA Hardware Grant.

If you find this repo is helpful in your research, please cite our paper:

@article{mcnally2021kapao,
  title={Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={arXiv preprint arXiv:2111.08557},
  year={2021}
}

Please also consider citing our previous works:

@inproceedings{mcnally2021deepdarts,
  title={DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera},
  author={McNally, William and Walters, Pascale and Vats, Kanav and Wong, Alexander and McPhee, John},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4547--4556},
  year={2021}
}

@article{mcnally2021evopose2d,
  title={EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation Using Accelerated Neuroevolution With Weight Transfer},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={IEEE Access},
  volume={9},
  pages={139403--139414},
  year={2021},
  publisher={IEEE}
}
Owner
Will McNally
PhD Candidate
Will McNally
Like Dirt-Samples, but cleaned up

Clean-Samples Like Dirt-Samples, but cleaned up, with clear provenance and license info (generally a permissive creative commons licence but check the

TidalCycles 39 Nov 30, 2022
An Intelligent Self-driving Truck System For Highway Transportation

Inceptio Intelligent Truck System An Intelligent Self-driving Truck System For Highway Transportation Note The code is still in development. OS requir

InceptioResearch 11 Jul 13, 2022
Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Dominic Rampas 247 Dec 16, 2022
[CVPR 2021] Region-aware Adaptive Instance Normalization for Image Harmonization

RainNet — Official Pytorch Implementation Region-aware Adaptive Instance Normalization for Image Harmonization Jun Ling, Han Xue, Li Song*, Rong Xie,

130 Dec 11, 2022
Unofficial PyTorch implementation of Fastformer based on paper "Fastformer: Additive Attention Can Be All You Need"."

Fastformer-PyTorch Unofficial PyTorch implementation of Fastformer based on paper Fastformer: Additive Attention Can Be All You Need. Usage : import t

Hong-Jia Chen 126 Dec 06, 2022
Hyperbolic Image Segmentation, CVPR 2022

Hyperbolic Image Segmentation, CVPR 2022 This is the implementation of paper Hyperbolic Image Segmentation (CVPR 2022). Repository structure assets :

Mina Ghadimi Atigh 46 Dec 29, 2022
Code for the ICCV'21 paper "Context-aware Scene Graph Generation with Seq2Seq Transformers"

ICCV'21 Context-aware Scene Graph Generation with Seq2Seq Transformers Authors: Yichao Lu*, Himanshu Rai*, Cheng Chang*, Boris Knyazev†, Guangwei Yu,

Layer6 Labs 37 Dec 18, 2022
This is the official implementation for "Do Transformers Really Perform Bad for Graph Representation?".

Graphormer By Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng*, Guolin Ke, Di He*, Yanming Shen and Tie-Yan Liu. This repo is the official impl

Microsoft 1.3k Dec 26, 2022
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

GCNet for Object Detection By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu. This repo is a official implementation of "GCNet: Non-local Networ

Jerry Jiarui XU 1.1k Dec 29, 2022
Simple-Image-Classification - Simple Image Classification Code (PyTorch)

Simple-Image-Classification Simple Image Classification Code (PyTorch) Yechan Kim This repository contains: Python3 / Pytorch code for multi-class ima

Yechan Kim 8 Oct 29, 2022
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation Ported from https://github.com/hzwer/arXiv2020-RIFE Dependencies NumPy

49 Jan 07, 2023
CN24 is a complete semantic segmentation framework using fully convolutional networks

Build status: master (production branch): develop (development branch): Welcome to the CN24 GitHub repository! CN24 is a complete semantic segmentatio

Computer Vision Group Jena 123 Jul 14, 2022
An official implementation of "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation" (CVPR 2021) in PyTorch.

BANA This is the implementation of the paper "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation". For more inf

CV Lab @ Yonsei University 59 Dec 12, 2022
[CVPR 2022 Oral] Rethinking Minimal Sufficient Representation in Contrastive Learning

Rethinking Minimal Sufficient Representation in Contrastive Learning PyTorch implementation of Rethinking Minimal Sufficient Representation in Contras

36 Nov 23, 2022
BboxToolkit is a tiny library of special bounding boxes.

BboxToolkit is a light codebase collecting some practical functions for the special-shape detection, such as oriented detection

jbwang1997 73 Jan 01, 2023
Convert ONNX model graph to Keras model format.

Convert ONNX model graph to Keras model format.

Grigory Malivenko 175 Dec 28, 2022
This repository contains the source code of our work on designing efficient CNNs for computer vision

Efficient networks for Computer Vision This repo contains source code of our work on designing efficient networks for different computer vision tasks:

Sachin Mehta 386 Nov 26, 2022
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
PyTorch implementation of neural style transfer algorithm

neural-style-pt This is a PyTorch implementation of the paper A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias

770 Jan 02, 2023
Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Vision Transformer with Progressive Sampling This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

yuexy 123 Jan 01, 2023