
POV-Surgery

A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities

26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023), Oral

This is the official code release for POV-Surgery at MICCAI 2023. Check out the POV-Surgery YouTube videos below for more details.

Video Description (with audio): LongVideo
Overview Video: ShortVideo

Components

  • Synthetic data generation pipeline
  • POV-Surgery dataset utilities
  • Fine-tuning demo code

Dataset Usage

Please download the dataset POV_Surgery_data.zip at POV-Surgery, unzip it, and put it in a location of your choice. The depth scale conversion factor to meters is 5000. Please remember that if you wish to download and use our dataset, compliance with the licensing conditions is mandatory. The proposed dataset contains 53 egocentric RGB-D sequences with 88k frames and accurate 2D/3D hand-object pose annotations. Here is a teaser of our dataset:

RGB-D and Annotation: LongVideo
Dataset Overview: ShortVideo
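
As noted above, the depth maps use a scale factor of 5000, i.e., dividing the stored value by 5000 gives depth in meters. Below is a minimal sketch of loading one depth frame; the file path and the use of OpenCV are illustrative assumptions, not part of the official tooling:

```python
import cv2
import numpy as np

DEPTH_SCALE = 5000.0  # stored depth value / 5000 = depth in meters (per this README)

# Hypothetical path to a single depth frame inside the unzipped POV_Surgery_data
depth_raw = cv2.imread("POV_Surgery_data/sequence_name/depth/00001.png", cv2.IMREAD_UNCHANGED)
depth_m = depth_raw.astype(np.float32) / DEPTH_SCALE  # per-pixel depth in meters

print("depth range [m]:", depth_m.min(), "-", depth_m.max())
```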

Project structure

Please register at SMPL-X and MANO to use their dependencies, and read and accept their licenses before using the SMPL-X and MANO models. There are different versions of manopth; we have already included a MANO implementation in this repo. Then download data.zip from POV-Surgery, unzip it, and put it in the POV_Surgery folder. We have prepared all the required dependencies, and the final structure should look like this:

    POV_Surgery
    ├── data
    │    │
    │    ├── sim_room
    │          └── room_sim.obj
    │          └── room_sim.obj.mtl
    │          └── textured_output.jpg
    │    │
    │    └── bodymodel
    │          │
    │          └── smplx_to_smpl.pkl
    │          └── ...
    │          └── mano
    │                └── MANO_RIGHT.pkl
    │          └── body_models
    │                └── smpl
    │                └── smplx
    ├── grasp_generation
    ├── grasp_refinement
    ├── pose_fusion
    ├── pre_rendering
    ├── blender_rendering
    ├── HandOccNet_ft
    └── vis_data
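
With the data folder in place, MANO_RIGHT.pkl can be loaded through the MANO implementation bundled with this repo. The snippet below is only a sketch assuming a manopth-style ManoLayer interface; the exact import path in this repo may differ:

```python
import torch
from manopth.manolayer import ManoLayer  # assumption: the bundled MANO code exposes a manopth-style layer

# Point the layer at the MANO model placed under data/bodymodel/mano
mano_layer = ManoLayer(mano_root="data/bodymodel/mano", side="right",
                       use_pca=True, ncomps=45, flat_hand_mean=False)

pose = torch.zeros(1, 48)   # 3 global rotation + 45 PCA pose coefficients
shape = torch.zeros(1, 10)  # MANO shape parameters

verts, joints = mano_layer(pose, shape)  # 778 mesh vertices and 21 joints (millimeters in manopth)
print(verts.shape, joints.shape)
```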

Environment

We recommend creating a Python 3.8 environment with conda. Install the PyTorch and torchvision builds that suit your operating system. For example, if you are using CUDA 11.8, you could use:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

Then install a pytorch3d build that matches your Python, PyTorch, and CUDA versions. An example could be:

pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html

Then install the remaining dependencies to finish the environment setup by running requirements.sh:

sh requirements.sh

You can also refer to the Colab demo for hints on setting up the environment.
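
Once requirements.sh has finished, a quick optional check such as the following confirms that the core packages are importable and that CUDA is visible:

```python
# Minimal environment sanity check (optional)
import torch
import torchvision
import pytorch3d

print("torch:", torch.__version__, "| torchvision:", torchvision.__version__)
print("pytorch3d:", pytorch3d.__version__)
print("CUDA available:", torch.cuda.is_available())
```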

Contact Information

If you have questions, feel free to contact:

Rui Wang: ruiwang46@ethz.ch

Acknowledgement

  • This work is part of a research project that has been financially supported by Accenture LLP. Siwei Zhang is funded by Microsoft Mixed Reality & AI Zurich Lab PhD scholarship. The authors would like to thank PD Dr. Michaela Kolbe for providing the simulation facilities.
  • The authors would like to thank David Botta, Dr. Kerrin Weiss, Isabelle Hofmann, Manuel Koch, and Marc Wittwer for their participation in data capture, and Dr. Julian Wolf, Tobias Stauffer, and Prof. Dr. Siyu Tang for the enlightening discussions.

License

Software Copyright License for non-commercial scientific research purposes. Please read carefully the terms and conditions and any accompanying documentation before you download and/or use the MANO model, data and software, (the "Model & Software"), including 3D meshes, blend weights, blend shapes, software, scripts, and animations. By downloading and/or using the Model & Software (including downloading, cloning, installing, and any other use of this github repository), you acknowledge that you have read these terms and conditions, understand them, and agree to be bound by them. If you do not agree with these terms and conditions, you must not download and/or use the Model & Software. Any infringement of the terms of this agreement will automatically terminate your rights under this License.

Citation

Wang, R., Ktistakis, S., Zhang, S., Meboldt, M., Lohmeyer, Q. (2023). POV-Surgery: A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_42

BibTeX

@inproceedings{wang2023pov,
  title={POV-Surgery: A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities},
  author={Wang, Rui and Ktistakis, Sophokles and Zhang, Siwei and Meboldt, Mirko and Lohmeyer, Quentin},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={440--450},
  year={2023}
}