Reference-Based 3D-Aware Image Editing with Triplanes

Bahri Batuhan Bilecen, Yigit Yalin, Ning Yu, and Aysegul Dundar

Generative Adversarial Networks (GANs) have emerged as powerful tools for high-quality image generation and real image editing by manipulating their latent spaces. Recent advancements in GANs include 3D-aware models such as EG3D, which feature efficient triplane-based architectures capable of reconstructing 3D geometry from single images. However, limited attention has been given to providing an integrated framework for 3D-aware, high-quality, reference-based image editing. This study addresses this gap by exploring and demonstrating the effectiveness of the triplane space for advanced reference-based edits. Our novel approach integrates encoding, automatic localization, spatial disentanglement of triplane features, and fusion learning to achieve the desired edits. Additionally, our framework demonstrates versatility and robustness across various domains, extending its effectiveness to animal face edits, partially stylized edits like cartoon faces, full-body clothing edits, and 360-degree head edits. Our method shows state-of-the-art performance over relevant latent direction, text, and image-guided 2D and 3D-aware diffusion and GAN methods, both qualitatively and quantitatively.

🛠️ Requirements and installation

Make sure you have 64-bit Python 3.8, PyTorch 1.11 (or above), and CUDA 11.3 (or above).
Preferably, create a new environment via conda or venv and activate it.
Install the pip dependencies: `pip install -r requirements.txt`
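
Before installing the remaining dependencies, the version requirements above can be checked with a short script. This is a minimal sketch and not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, the PyTorch minimum is assumed to be 1.11, and Python 3.8 is treated as a lower bound rather than an exact pin.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.11.0+cu113' into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    a, b = parse_version(found), parse_version(minimum)
    # Pad with zeros so that '1.11' and '1.11.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(min_python="3.8", min_torch="1.11", min_cuda="11.3"):
    """Collect human-readable problems; an empty list means all checks passed.

    The default minimums are assumptions based on the requirements text.
    """
    problems = []
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    if not meets_minimum(python_version, min_python):
        problems.append(f"Python {python_version} is older than {min_python}")
    if sys.maxsize <= 2**32:
        problems.append("Python is not a 64-bit build")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if not meets_minimum(torch.__version__, min_torch):
        problems.append(f"PyTorch {torch.__version__} is older than {min_torch}")
    cuda_version = torch.version.cuda  # None for CPU-only builds
    if cuda_version is None:
        problems.append("PyTorch was built without CUDA support")
    elif not meets_minimum(cuda_version, min_cuda):
        problems.append(f"CUDA {cuda_version} is older than {min_cuda}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script inside the activated environment prints one warning per unmet requirement and nothing if everything checks out.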

✂️ Dataset preparation

We follow EG3D's dataset preparation for pose extraction and face alignment. Do not skip the Deep3DFaceRecon_pytorch setup. Then run the in-the-wild preprocessing code:

cd ./dataset_preprocessing/ffhq
python preprocess_in_the_wild.py --indir=YOUR_INPUT_IMAGE_FOLDER

This will generate aligned images and a dataset.json containing camera matrices in YOUR_INPUT_IMAGE_FOLDER/preprocessed/.
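
The generated dataset.json can be inspected with a few lines of Python. The sketch below assumes the file follows EG3D's label convention, in which each entry under a `labels` key pairs an image filename with 25 floats: a flattened 4x4 camera-to-world matrix followed by a flattened 3x3 intrinsics matrix. That layout and the helper names `reshape`, `split_camera_label`, and `load_cameras` are assumptions for illustration, not something this README specifies.

```python
import json


def reshape(values, rows, cols):
    """Reshape a flat sequence into a row-major nested list."""
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]


def split_camera_label(label):
    """Split a 25-value camera label into (extrinsics 4x4, intrinsics 3x3).

    Assumes the EG3D convention: 16 extrinsic values, then 9 intrinsic values.
    """
    if len(label) != 25:
        raise ValueError(f"expected 25 camera values, got {len(label)}")
    extrinsics = reshape(label[:16], 4, 4)
    intrinsics = reshape(label[16:], 3, 3)
    return extrinsics, intrinsics


def load_cameras(path):
    """Map each image filename in dataset.json to its camera matrices."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return {name: split_camera_label(label) for name, label in data["labels"]}
```

For example, `load_cameras("YOUR_INPUT_IMAGE_FOLDER/preprocessed/dataset.json")` would return one `(extrinsics, intrinsics)` pair per aligned image, provided the file matches the assumed layout.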

We have included example images and poses in ./example/.

🏁 Checkpoints

Put all downloaded files in ./checkpoints/.

| Network | Filename |
| --- | --- |
| EG3D rebalanced generator | `ffhqrebalanced512-128.pkl` |
| EG3D-GOAE encoders | `encoder_FFHQ.pt` & `afa_FFHQ.pt` |
| Finetuned fusion encoder | `encoder_FFHQ_finetuned.pt` |
| BiSeNet segmentation | `79999_iter.pth` |
| IR-SE50 for ID loss | `model_ir_se50.pth` |
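
To confirm that every file listed above is in place before opening the demo, a small check like the following can help. It is a sketch rather than a script shipped with the repository: the filenames come from the table above, while `EXPECTED_CHECKPOINTS` and `missing_checkpoints` are hypothetical names introduced here.

```python
from pathlib import Path

# Filenames as listed in the checkpoint table, grouped by network.
EXPECTED_CHECKPOINTS = {
    "EG3D rebalanced generator": ["ffhqrebalanced512-128.pkl"],
    "EG3D-GOAE encoders": ["encoder_FFHQ.pt", "afa_FFHQ.pt"],
    "Finetuned fusion encoder": ["encoder_FFHQ_finetuned.pt"],
    "BiSeNet segmentation": ["79999_iter.pth"],
    "IR-SE50 for ID loss": ["model_ir_se50.pth"],
}


def missing_checkpoints(checkpoint_dir="./checkpoints"):
    """Return (network, filename) pairs for files not found in checkpoint_dir."""
    root = Path(checkpoint_dir)
    missing = []
    for network, filenames in EXPECTED_CHECKPOINTS.items():
        for filename in filenames:
            if not (root / filename).is_file():
                missing.append((network, filename))
    return missing


if __name__ == "__main__":
    for network, filename in missing_checkpoints():
        print(f"missing {filename} ({network})")
```

Run from the repository root, the script prints one line per missing file and nothing when all six files are present.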

🚀 Quickstart

Run demo.ipynb for various editing examples.

👇 Citation

Our codebase utilizes the following great works: EG3D, EG3D-GOAE, TriPlaneNetv2, BiSeNet, and Deep3DFaceRecon_pytorch. We thank the authors for providing them.

@misc{bilecen2024referencebased,
      title={Reference-Based 3D-Aware Image Editing with Triplanes}, 
      author={Bahri Batuhan Bilecen and Yigit Yalin and Ning Yu and Aysegul Dundar},
      year={2024},
      eprint={2404.03632},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
