
Understanding 3D Object Interaction from a Single Image

Code release for our paper

Understanding 3D Object Interaction from a Single Image
Shengyi Qian, David F. Fouhey
ICCV 2023

[Project Page] [arXiv] [demo]

teaser

Please check the project page for more details and consider citing our paper if it is helpful:

@inproceedings{qian2023understanding,
    title={Understanding 3D Object Interaction from a Single Image},
    author={Qian, Shengyi and Fouhey, David F},
    booktitle={ICCV},
    year={2023}
}

If you are interested in inference only, you can also try our demo on Hugging Face.

Setup

We use Anaconda to set up the Python environment. It has been tested with Python 3.9 and PyTorch 2.0.1. PyTorch3D is only required for 3D visualization.

# python
conda create -n monoarti python=3.9
conda activate monoarti

# pytorch
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

# other packages
pip install accelerate
pip install submitit
pip install hydra-core --upgrade --pre
pip install hydra-submitit-launcher --upgrade
pip install pycocotools
pip install packaging plotly imageio imageio-ffmpeg matplotlib h5py opencv-python
pip install tqdm wandb visdom

# (optional, for 3D visualization) pytorch3d
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
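After installation, a quick sanity check can confirm that the key packages are importable. This is a minimal sketch (not part of the release); the package list mirrors the installs above, and `pytorch3d` is listed as optional:

```python
import importlib.util

def is_importable(name: str) -> bool:
    """Return True if the package can be located without importing it."""
    return importlib.util.find_spec(name) is not None

# Core packages from the setup steps above; pytorch3d is optional.
required = ["torch", "torchvision", "accelerate", "hydra", "pycocotools"]
optional = ["pytorch3d"]

for pkg in required:
    print(f"{pkg}: {'ok' if is_importable(pkg) else 'MISSING'}")
for pkg in optional:
    print(f"{pkg} (optional): {'ok' if is_importable(pkg) else 'not installed'}")
```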

Create a `checkpoints` directory to store pretrained checkpoints.

mkdir checkpoints

If necessary, download our pretrained SAM model and put it at `checkpoints/checkpoint_20230515.pth`.

Dataset

The dataset is released on the project page. Please download it and set the dataset root accordingly.

The dataset should be organized as follows:

- `3doi_data`
    - `3doi_v1`
    - `images`
    - `omnidata_filtered`
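The layout above can be verified programmatically. The helper below is a hypothetical sketch (not part of the release) that reports any expected subdirectories missing under the dataset root:

```python
from pathlib import Path

# Subdirectories expected under the dataset root, per the layout above.
EXPECTED = ["3doi_v1", "images", "omnidata_filtered"]

def missing_subdirs(root):
    """Return the expected subdirectories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]

if __name__ == "__main__":
    missing = missing_subdirs("3doi_data")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("dataset layout looks complete")
```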

Inference

To test the model on the 3DOI dataset or other datasets, run

python test.py --config-name sam_inference checkpoint_path=checkpoints/checkpoint_20230515.pth output_dir=vis

To create a video animation, run

python test.py --config-name sam_inference checkpoint_path=checkpoints/checkpoint_20230515.pth output_dir=vis test.mode='export_video'
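The commands above use Hydra's `key=value` override syntax, where a dotted key such as `test.mode` updates a nested entry of the composed config. As a rough illustration only (real Hydra also handles type coercion, config groups, and interpolation), dotted overrides behave like this:

```python
def apply_overrides(config, overrides):
    """Apply Hydra-style dotted key=value overrides to a nested dict.

    Illustrative sketch only; this is not Hydra's actual implementation.
    """
    for item in overrides:
        key, value = item.split("=", 1)
        node = config
        parts = key.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value.strip("'")
    return config

# Hypothetical config shape mirroring the command-line overrides above.
cfg = {"checkpoint_path": None, "output_dir": None, "test": {"mode": "eval"}}
apply_overrides(cfg, [
    "checkpoint_path=checkpoints/checkpoint_20230515.pth",
    "output_dir=vis",
    "test.mode='export_video'",
])
print(cfg["test"]["mode"])  # export_video
```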

Training

To train our model with the Segment Anything backbone, run

python train.py --config-name sam

To train our model with the DETR backbone, run

python train.py --config-name detr

Acknowledgment

We reuse code from ViewSeg, DETR, and Segment Anything.