This repository contains the official code for our paper: Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity. [YouTube Video] [Website]
- Sep 2023: We released the code :)
- July 2023: This work is accepted to ICCV 2023 🎉
- June 2023: BUCTD was also presented at the 2023 CV4Animals workshop at CVPR
- June 2023: An earlier version can be found on arXiv
- This code will also be integrated into DeepLabCut!
We developed and tested our models with python=3.8.10, pytorch=1.8.0, and cuda=11.1. Other versions may also be suitable.
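For convenience, a matching environment could be set up with conda (the environment name buctd is just an illustration):

# create and activate a fresh environment with the tested Python version
conda create -n buctd python=3.8.10
conda activate buctd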
Installation
- Clone this repo; in the following, we will refer to the cloned directory as ${BUCTD_ROOT}.
git clone https://github.com/amathislab/BUCTD.git
cd ${BUCTD_ROOT}
- Install PyTorch and torchvision
Follow the instructions on https://pytorch.org/get-started/locally/.
# an example:
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge
- Install additional dependencies
pip install -r requirements.txt
- Install COCOAPI
# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python setup.py install --user
- Install CrowdPoseAPI exactly in the same way as COCOAPI (see the sketch below).
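A minimal sketch of that step, assuming you clone the upstream CrowdPose repository (the clone path and directory layout are illustrative):

# CROWDPOSEAPI=/path/to/clone/CrowdPose
git clone https://github.com/Jeff-sjtu/CrowdPose.git $CROWDPOSEAPI
cd $CROWDPOSEAPI/crowdpose-api/PythonAPI
# Install into global site-packages
make install
# Alternatively, without global permissions:
python setup.py install --user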
- Install NMS
cd ${BUCTD_ROOT}/lib
make
Training
Generative sampling
You can use the script train_BUCTD_synthesis_noise.sh.
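For example, assuming the script is launched from the repository root (adjust the path if it lives in a subfolder):

# launch training with generative (synthetic) condition sampling
cd ${BUCTD_ROOT}
bash train_BUCTD_synthesis_noise.sh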
Empirical sampling
You can match your own bottom-up (BU) models by updating the scripts in ./data_preprocessing/.
If you do not want to match your own BU models for training, we provide the training annotations. You can download the annotations here.
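As a sketch of that matching workflow (the script name and flags below are hypothetical; use the actual scripts in ./data_preprocessing/):

# hypothetical example: build matched training annotations from your BU model's predictions
python data_preprocessing/match_bu_predictions.py \
    --bu-predictions /path/to/bu_results.json \
    --gt-annotations /path/to/train_annotations.json \
    --output /path/to/matched_train_annotations.json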
During inference, we use the predictions of different BU/one-stage models (e.g., PETR, CID) as conditions. The result files can be downloaded from the link above.
We also provide the best model per human dataset along with the testing scripts; see the example invocation after the tables below.
| Model | Sampling strategy | Image Size | Condition | AP | Weights | Script |
|---|---|---|---|---|---|---|
| BUCTD-preNet-W48 | Generative sampling | 384x288 | PETR | 77.8 | download | script |

| Model | Sampling strategy | Image Size | Condition | AP_val | AP_test | Weights | Script |
|---|---|---|---|---|---|---|---|
| BUCTD-CoAM-W48 | Generative sampling (3x iterative refinement) | 384x288 | CID-W32 | 49.0 | 48.5 | download | script |

| Model | Sampling strategy | Image Size | Condition | AP | Weights | Script |
|---|---|---|---|---|---|---|
| BUCTD-CoAM-W48 | Generative sampling | 384x288 | PETR | 78.5 | download | script |
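As a usage sketch, a downloaded checkpoint would typically be evaluated with its matching test script (the script name and argument below are assumptions; use the scripts linked in the tables):

# hypothetical example: run evaluation with a downloaded checkpoint
cd ${BUCTD_ROOT}
bash test_BUCTD_CoAM_W48.sh /path/to/downloaded_weights.pth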
We are grateful to the authors of HRNet, MIPNet, and TransPose as our code builds on their excellent work.
If you find this code or ideas presented in our work useful, please cite:
Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity (ICCV 2023) by Mu Zhou*, Lucas Stoffl*, Mackenzie W. Mathis and Alexander Mathis (arXiv)
@InProceedings{Zhou_2023_ICCV,
author = {Zhou, Mu and Stoffl, Lucas and Mathis, Mackenzie Weygandt and Mathis, Alexander},
title = {Rethinking Pose Estimation in Crowds: Overcoming the Detection Information Bottleneck and Ambiguity},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {14689-14699}
}
BUCTD is released under the Apache 2.0 license. Please see the LICENSE file for more information.