SuperCLEVR Physics

A dynamical 3D scene understanding dataset for Video Question Answering. Each scene is annotated with the objects' (1) static properties (shape, color), (2) 3D dynamical properties (3D position, velocity, external forces), and (3) physical properties (mass, friction, restitution), as well as (4) collision events (objects involved, frame).

(Note: the color space is compressed for visualization)

Related works

SuperCLEVR. A visual question answering (VQA) dataset for studying domain robustness along four factors: visual complexity, question redundancy, concept distribution, and concept compositionality.
SuperCLEVR-3D. A VQA dataset for 3D-aware scene understanding, with questions about the 3D poses, parts, and occlusions of objects in images.

Video Question Answering

We design questions about the dynamical properties of objects in 4D space and about their collision events.

There are three types of questions, all derived from the generated scenes: factual, predictive, and counterfactual.

How to generate your own data

1. Environment

Setup Environment

Python version

We use Python 3.10. The Python version affects compatibility with the bpy package.

Install Dependencies

Please install the packages with the following steps. Our project is built on Kubric; we modified the original package to control more dynamical properties.

pip install -r requirements.txt

Install bpy

This is the Python package for Blender, which can now be installed with pip. (PyPI, official site)

pip install bpy==3.5

If 3.5 is not available, 3.4 should also be compatible with this repo.
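
A quick sanity check before rendering can catch a mismatched interpreter or bpy build. The sketch below is not part of the repository: the helper names (python_ok, bpy_ok, installed_bpy_version, report) are made up for illustration, and it only encodes the versions stated above (Python 3.10, bpy 3.5 or 3.4).

```python
import sys
from importlib import metadata

# Versions taken from the text above; adjust if the repository changes them.
EXPECTED_PYTHON = (3, 10)
SUPPORTED_BPY = ("3.5", "3.4")


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.10.x."""
    return tuple(version_info[:2]) == EXPECTED_PYTHON


def bpy_ok(version):
    """Return True when a bpy version string has a supported major.minor."""
    major_minor = ".".join(version.split(".")[:2])
    return major_minor in SUPPORTED_BPY


def installed_bpy_version():
    """Return the installed bpy version string, or None if bpy is missing."""
    try:
        return metadata.version("bpy")
    except metadata.PackageNotFoundError:
        return None


def report():
    """Print a short summary of the environment check."""
    print("Python 3.10:", python_ok())
    version = installed_bpy_version()
    if version is None:
        print("bpy: not installed")
    else:
        print(f"bpy {version} supported:", bpy_ok(version))
```

Calling report() after the pip steps prints one line per check; it reads package metadata only, so it does not start Blender.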

2. Video rendering

Run bash run.sh directly to create new scenes and render the videos.

Example of generating 100 videos.

time="$(date +%Y-%m-%d_%H-%M-%S)"
# {0..99} yields 100 iterations, one video each
for num in {0..99}
do
    # Replace xx with the ID of the GPU to use
    CUDA_VISIBLE_DEVICES=xx python sim_render_color_defined_load_scene.py \
        --data_dir=assets \
        --job-dir=output/superclevr-physics \
        --scratch_dir="output/tmp/tmp-$time" \
        --camera=fixed \
        --height=realistic \
        --iteration="$num" \
        --scene_size 5
done
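
If you would rather drive the renderer from Python, for example to re-run only a few iterations, the same invocation can be assembled with subprocess. This is a sketch under assumptions: build_command and render_range are hypothetical helpers, not functions from this repository, and the flags are copied unchanged from the shell example above. GPU selection via CUDA_VISIBLE_DEVICES is left to the calling environment.

```python
import subprocess
from datetime import datetime


def build_command(iteration, timestamp, scene_size=5):
    """Build the argument list for one rendering run, mirroring the shell flags."""
    return [
        "python", "sim_render_color_defined_load_scene.py",
        "--data_dir=assets",
        "--job-dir=output/superclevr-physics",
        f"--scratch_dir=output/tmp/tmp-{timestamp}",
        "--camera=fixed",
        "--height=realistic",
        f"--iteration={iteration}",
        "--scene_size", str(scene_size),
    ]


def render_range(start, stop, dry_run=True):
    """Handle iterations start..stop-1.

    With dry_run=True the commands are only returned, not executed.
    """
    # One shared timestamp, like the $time variable in the shell loop.
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    commands = [build_command(i, timestamp) for i in range(start, stop)]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failed render instead of continuing.
            subprocess.run(cmd, check=True)
    return commands
```

For instance, render_range(0, 3) returns the three command lists for inspection, and render_range(0, 3, dry_run=False) actually launches them from the repository root.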

The output folder will look like this:

output/superclevr-physics
├───super_clevr_0
│   ├───events.json
│   ├───metadata.json
│   ├───rgba_00000.png
│   ├───rgba_00001.png
│   ├───...
│   └───rgba_00120.png
└───super_clevr_1
    ├───events.json
    ├───metadata.json
    ├───rgba_00000.png
    ├───rgba_00001.png
    ├───...
    └───rgba_00120.png
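
To confirm that a run produced complete scenes, the layout above can be walked with a few lines of Python. The sketch below is illustrative only: summarize_scene and summarize_output are hypothetical helpers, and because the schema of events.json and metadata.json is not described here, the code only reports their top-level keys (or length) rather than assuming any field names.

```python
import json
from pathlib import Path


def summarize_scene(scene_dir):
    """Count rendered frames and describe the two JSON files of one scene."""
    scene_dir = Path(scene_dir)
    summary = {
        "name": scene_dir.name,
        "num_frames": len(list(scene_dir.glob("rgba_*.png"))),
    }
    for name in ("events.json", "metadata.json"):
        path = scene_dir / name
        if not path.exists():
            summary[name] = None  # file missing, e.g. an interrupted render
            continue
        with path.open() as f:
            data = json.load(f)
        # No schema is assumed: keys for objects, length for arrays.
        if isinstance(data, dict):
            summary[name] = sorted(data)
        elif isinstance(data, list):
            summary[name] = len(data)
        else:
            summary[name] = type(data).__name__
    return summary


def summarize_output(root="output/superclevr-physics"):
    """Summarize every super_clevr_* folder under the output root."""
    root = Path(root)
    # Folders are sorted by name, so super_clevr_10 comes before super_clevr_2.
    scenes = sorted(d for d in root.glob("super_clevr_*") if d.is_dir())
    return [summarize_scene(d) for d in scenes]
```

Scenes whose num_frames differs from the others, or whose JSON entries are None, are likely candidates for re-rendering.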

Citation

@article{wang2024compositional,
  title={Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering},
  author={Wang, Xingrui and Ma, Wufei and Wang, Angtian and Chen, Shuo and Kortylewski, Adam and Yuille, Alan},
  journal={arXiv preprint arXiv:2406.00622},
  year={2024}
}

