Author: Amarpreet Kaur | 📧 Email | 🏫 Toronto Metropolitan University
This project introduces a novel framework for photorealistic style transfer, designed to enhance image styling and transformation across multiple domains, including interior design, fashion, and digital marketing. By combining an autoencoder architecture with blockwise training, high-frequency residual skip connections, and bottleneck feature aggregation, the framework separates and recombines the content and style of arbitrary images to produce high-quality stylized results.
- Efficient Photorealistic Style Transfer: Achieves high-quality image transformations.
- Real-time Application: Supports dynamic adjustments suitable for virtual environments and interactive workflows.
- Advanced Neural Architecture: Utilizes a novel combination of neural network strategies for enhanced style transfer capabilities.
Here's a breakdown of the main directories and files in this repository:
- `.ipynb_checkpoints`: Stores notebook checkpoint files.
- `banner`: Contains banner images and graphics used in the documentation.
- `ckpts`: Includes model checkpoints from training sessions, allowing model restoration or reuse.
- `dataset`: Dataset directory, with links to both the MSCOCO and ADE20K datasets, subdivided into training, validation, and test sets.
- `figures`: Holds example images for testing and demonstration, including the content and style images used in the style transfer demos.
- `final_add_3_decoder`: Contains scripts and models specific to the `add_3_decoder` architecture variant used for advanced decoding tasks.
- `results`: Output directory where stylized images are saved after processing.
- `resultsWCT(baseline)`: Stores baseline results produced with the Whitening and Coloring Transform (WCT) for comparison with the new method.
- `utils`: Utility scripts, including image processing support, model definitions, and other helper functions.
- `README.md`: The main documentation file, providing an overview of the repository and instructions for using it.
- `relu_demo.py`: A demonstration script showing how to apply style transfer with the ReLU model configuration.
- `test.py`: Script for testing the models with different configurations and datasets.
- `train.py`: Training code for the autoencoder, detailing setup, execution, and options for various training regimes.
To train the autoencoder, I use the MSCOCO dataset, which has been widely used in style transfer studies, including several WCT (Whitening and Coloring Transform) papers. The dataset consists of:
- Training set: 118,288 images
- Validation set: 5,000 images
- Test set: 40,670 images
For the initial semantic segmentation approach, I use the ADE20K dataset, which is instrumental in these experiments and consists of:
- Training set: 25,574 images
- Validation set: 2,000 images
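For context, here is a minimal sketch of the kind of `tf.data` input pipeline that can feed such training images to the autoencoder. The directory pattern, image size, and batch size are illustrative assumptions, not the repository's exact configuration.

```python
# Illustrative tf.data pipeline for the training images; the path pattern,
# image size, and batch size below are assumptions, not the repo's settings.
import tensorflow as tf

def make_dataset(pattern="dataset/train/*.jpg", size=256, batch=8):
    def load(path):
        img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        img = tf.image.resize(img, (size, size)) / 255.0  # floats in [0, 1]
        return img
    return (tf.data.Dataset.list_files(pattern, shuffle=True)
            .map(load, num_parallel_calls=tf.data.experimental.AUTOTUNE)
            .batch(batch)
            .prefetch(tf.data.experimental.AUTOTUNE))
```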
I applied bottleneck feature aggregation (BFA) to the following model: a pre-trained VGG-19 encoder (from the input layer to the relu_4_1 layer; fixed during training) and a blockwise-trained decoder that reproduces the relu_3_1, relu_2_1, and relu_1_1 features as well as the input image. The ZCA transformations are embedded at the bottleneck and at the reproduced reluN_1 layers in the decoder.
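For reference, below is a minimal NumPy sketch of the whitening-and-coloring (ZCA) step as it is commonly formulated in the WCT literature; the function names and shapes are illustrative, not the repository's API.

```python
# Minimal NumPy sketch of the ZCA whitening-and-coloring transform; the
# names and shapes are illustrative, not the repository's actual API.
import numpy as np

def cov_power(feat, power, eps=1e-5):
    """Return Cov(feat)^power via eigendecomposition; feat is (C, N), centered."""
    cov = feat @ feat.T / (feat.shape[1] - 1) + eps * np.eye(feat.shape[0])
    w, V = np.linalg.eigh(cov)
    return V @ np.diag(w ** power) @ V.T

def zca_wct(content, style):
    """content, style: (C, H*W) feature matrices from one reluN_1 layer."""
    c_mean = content.mean(axis=1, keepdims=True)
    s_mean = style.mean(axis=1, keepdims=True)
    c, s = content - c_mean, style - s_mean
    whitened = cov_power(c, -0.5) @ c        # strip the content covariance
    colored = cov_power(s, 0.5) @ whitened   # impose the style covariance
    return colored + s_mean                  # shift to the style mean
```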
- The model is defined in `utils/model_relu.py`, and the associated checkpoint is in `ckpts/ckpts-relu` (a minimal encoder sketch follows this list).
- A demo that uses this model to stylize the example images in `figures/` is provided in `relu_demo.py`. The resulting stylized images are saved in `results/`.
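As noted above, here is a minimal sketch of how a fixed VGG-19 encoder truncated at relu_4_1 can be built with Keras; the repository's actual model is in `utils/model_relu.py`, so treat this only as an orientation aid. Note that relu_4_1 corresponds to Keras' `block4_conv1` layer, whose ReLU is fused into the convolution.

```python
# Sketch of a fixed VGG-19 encoder truncated at relu_4_1; this is NOT the
# repository's model code (see utils/model_relu.py for the real thing).
import tensorflow as tf

def build_encoder():
    vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
    vgg.trainable = False  # the encoder stays fixed during training
    # relu_4_1 == Keras' block4_conv1 output; the intermediate reluN_1
    # outputs feed the blockwise-trained decoder and the ZCA transforms.
    names = ("block1_conv1", "block2_conv1", "block3_conv1", "block4_conv1")
    outputs = [vgg.get_layer(n).output for n in names]
    return tf.keras.Model(vgg.input, outputs, name="vgg19_encoder")
```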
Stylization with both models requires guided filtering (`utils/photo_gif.py`) as a post-processing step.
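The actual guided filter lives in `utils/photo_gif.py`; the sketch below only illustrates the standard single-channel guided filter (He et al.) that such post-processing is based on, with assumed parameter values.

```python
# Single-channel guided-filter sketch (He et al.); illustrative only --
# the repository's post-processing implementation is utils/photo_gif.py.
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius=8, eps=1e-4):
    """Smooth `src` (stylized) while keeping the edges of `guide` (content).
    Both arguments are 2-D float arrays scaled to [0, 1]."""
    mean = lambda x: uniform_filter(x, 2 * radius + 1)  # box-filter average
    mean_g, mean_s = mean(guide), mean(src)
    cov_gs = mean(guide * src) - mean_g * mean_s
    var_g = mean(guide * guide) - mean_g * mean_g
    a = cov_gs / (var_g + eps)        # per-window linear coefficient
    b = mean_s - a * mean_g
    return mean(a) * guide + mean(b)  # recombine with averaged coefficients
```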
To stylize an image:
- Add content images to `figures/content`.
- Add style images to `figures/style`.
- Run `python relu_demo.py`.
`train.py` contains the training code for the model; detailed usage instructions are provided inside the script.
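To make the blockwise training idea concrete, here is a hedged sketch of the kind of reconstruction objective it implies; `encoder` and `decoder` are hypothetical callables, and the real setup is in `train.py`.

```python
# Hedged sketch of a blockwise reconstruction objective; `encoder` and
# `decoder` are hypothetical callables, not the repository's actual API.
import tensorflow as tf

def blockwise_loss(images, encoder, decoder):
    feats = encoder(images)                      # [relu1_1, ..., relu4_1]
    recon_img, recon_feats = decoder(feats[-1])  # decode from the bottleneck
    loss = tf.reduce_mean(tf.square(recon_img - images))  # pixel loss
    for f, r in zip(feats[:-1], recon_feats):    # assumes matching order
        loss += tf.reduce_mean(tf.square(f - r)) # feature reconstruction
    return loss
```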
- TensorFlow 2.0.0 or above (I developed the models with tf 2.4.1 and also tested them on tf 2.0.0)
- Python 3.x
- Keras 2.0.x
- scikit-image
Please check out the FastPhotoStyle Tutorial.
- Ultrafast photorealistic style transfer via neural architecture search : Link
- PhotoWCT2: Compact autoencoder for photorealistic style transfer resulting from blockwise training and skip connections of high-frequency residuals : Link
- A neural algorithm of artistic style : Link
- Universal style transfer via feature transformations : Link
If you find this code useful for your research, please cite:
```bibtex
@misc{kaur2024photorealism,
  author       = {Kaur, Amarpreet},
  title        = {Image Styling and Transformation},
  publisher    = {GitHub},
  organization = {Toronto Metropolitan University, Canada},
  year         = {2024},
  howpublished = {\url{https://github.com/Amarpreet3/deep-learning-image-styling-and-transformation}}
}
```
Feel free to contact me with any questions (Amarpreet Kaur, amarpreet.kaur@torontomu.ca).
This README is designed to be straightforward, informative, and easy to navigate, providing all necessary details to understand, use, and contribute to the project effectively.