A two-stage text-to-3D generation model. The first stage uses a diffusion model to quickly generate candidates. The second stage refines the assets selected from the first stage.
Demo video: demo.mp4
[2024/03/10] Our captions for Objaverse are released here.
[2024/03/04] Our technical report is released here.
[2024/01/18] We release 3DTopia, a text-to-3D generation model!
```bibtex
@article{hong20243dtopia,
  title={3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors},
  author={Hong, Fangzhou and Tang, Jiaxiang and Cao, Ziang and Shi, Min and Wu, Tong and Chen, Zhaoxi and Wang, Tengfei and Pan, Liang and Lin, Dahua and Liu, Ziwei},
  journal={arXiv preprint arXiv:2403.02234},
  year={2024}
}
```
We recommend using Anaconda to manage the environment.
```bash
conda env create -f environment.yml
```
Please refer to threefiner to install our second-stage mesh refiner. We have tested installing both environments together with PyTorch 1.12.0 and CUDA 11.3.
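As an optional sanity check, you can confirm that the tested PyTorch and CUDA versions are active. This is a minimal sketch, assuming PyTorch is installed in the current environment:

```python
# Optional sanity check: confirm the tested PyTorch / CUDA versions.
import torch

print(torch.__version__)          # tested with 1.12.0
print(torch.version.cuda)         # tested with 11.3
print(torch.cuda.is_available())  # should be True on a GPU machine
```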
We have implemented automatic checkpoint download for both `gradio_demo.py` and `sample_stage1.py`. If you prefer to download manually, you may download the checkpoint `3dtopia_diffusion_state_dict.ckpt` or `model.safetensors` from Hugging Face.
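If you would rather script the manual download, a sketch with `huggingface_hub` could look like the following. Note that the repo id below is a placeholder, not the actual repository name:

```python
# Manual download sketch via huggingface_hub (pip install huggingface_hub).
# NOTE: repo_id is a placeholder -- substitute the actual 3DTopia
# repository on Hugging Face.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="<3dtopia-repo-id>",  # placeholder
    filename="3dtopia_diffusion_state_dict.ckpt",
)
print(ckpt_path)  # local path to the cached checkpoint
```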
- If you encounter the error `ImportError: /lib64/libc.so.6: version 'GLIBC_2.25' not found` in the second stage, try installing a lower version of pymeshlab with `pip install pymeshlab==0.2`.
Run the following command to sample "a robot" in the first stage. Results will be located under the folder `results`.
```bash
python -u sample_stage1.py --text "a robot" --samples 1 --sampler ddim --steps 200 --cfg_scale 7.5 --seed 0
```
Arguments:
- `--ckpt` specifies the checkpoint file path;
- `--test_folder` controls which subfolder all results are written to;
- `--seed` fixes the random seeds;
- `--sampler` can be set to `ddim` for DDIM sampling (by default, we use 1000-step DDPM sampling);
- `--steps` controls the number of sampling steps (DDIM only);
- `--samples` controls the number of samples;
- `--text` is the input text;
- `--no_video` and `--no_mcubes` suppress rendering multi-view videos and running marching cubes, both of which are enabled by default;
- `--mcubes_res` controls the resolution of the 3D volume sampled for marching cubes; lower this resolution to save graphics memory;
- `--render_res` controls the resolution of the rendered video.
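To sample several prompts in one go, a small driver script can wrap the command above. This is a minimal sketch, assuming `sample_stage1.py` is run from the repository root inside the activated environment:

```python
# Batch stage-1 sampling sketch: invokes sample_stage1.py once per prompt.
import subprocess

prompts = ["a robot", "a wooden chair", "a red sports car"]
for seed, prompt in enumerate(prompts):
    subprocess.run(
        [
            "python", "-u", "sample_stage1.py",
            "--text", prompt,
            "--samples", "1",
            "--sampler", "ddim",
            "--steps", "200",
            "--cfg_scale", "7.5",
            "--seed", str(seed),
        ],
        check=True,  # raise if a run fails
    )
```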
The second-stage refinement consists of two steps. Here is a simple example; please refer to threefiner for more detailed usage.
```bash
# step 1: refine the stage-1 mesh with Stable Diffusion guidance
threefiner sd --mesh results/default/stage1/a_robot_0_0.ply --prompt "a robot" --text_dir --front_dir='-y' --outdir results/default/stage2/ --save a_robot_0_0_sd.glb
# step 2: further refine the texture with DeepFloyd IF-II guidance
threefiner if2 --mesh results/default/stage2/a_robot_0_0_sd.glb --prompt "a robot" --outdir results/default/stage2/ --save a_robot_0_0_if2.glb
```
The resulting mesh can be found at `results/default/stage2/a_robot_0_0_if2.glb`.
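To quickly inspect the refined asset in Python, one option is `trimesh` (an assumption on our part; any glTF-capable viewer works just as well):

```python
# Load and summarize the refined .glb, assuming trimesh is installed
# (pip install trimesh). A .glb file loads as a trimesh Scene.
import trimesh

scene = trimesh.load("results/default/stage2/a_robot_0_0_if2.glb")
print(scene)  # scene summary
for name, geom in scene.geometry.items():
    print(name, len(geom.vertices), "vertices,", len(geom.faces), "faces")
```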
We thank the community for building and open-sourcing the foundation of this work. Specifically, we want to thank EG3D and Stable Diffusion for their code. We also want to thank Objaverse for the wonderful dataset.