- 1. Introduction
- 2. Repo structure
- 3. Notions on generative models
- 4. How to use your own dataset
- 5. Monitoring integration
- 6. Quickstart code
- 7. License
This repository provides an automated pipeline to train generative image models on "real-size" images, since most of the utilities available elsewhere are demonstrated only on toy datasets. With these tools, you will be able to work with standard-sized images (224×224) using custom encoders and decoders.
The main goals of this repository are:
- Design an efficient pipeline that can handle custom datasets and configurations.
- Provide a baseline for generative image models.
- Build component-based code where each component's dependencies, requirements and setup are self-contained and can be used as a template.
The repository has been divided into minimal components, for several reasons:
- Each service can be optimised independently.
- You do not need to go through all of them if they are out of the scope of your project. In particular, you can skip step I if you already provide your own dataset following the code specifications.
- They are easy to extend to Cloud pipelines (k8s, Vertex AI in GCP,...).
Each component has a structure like the following one:
├── src                         # Source code of the component
│ ├── dataset.py # Method that structures and transforms data
│ ├── loss.py # Custom function to meet our needs during training
│ ├── layers.py # Auxiliary modules to build encoder and decoder
│ ├── model.py # Core script containing the architecture of the model
│ ├── fitter.py # Training wrapper
│ └── ...
├── input # Configuration files, datasets,...
│ ├── info.json # Configuration file for datasets information
│ ├── model_config.json # Configuration file for model architecture (depth, style,...)
│ ├── training_config.json # Configuration file for model training (batch_size, learning_rate,...)
│   ├── wandb_config.json       # Credentials for Weights and Biases API usage
│ ├── kaggle_config.json # Configuration file for Kaggle API (KAGGLE_USERNAME and KAGGLE_KEY)
│ └── DATASET_NAME # Image dataset containing a folder structure
│ ├── train
│ │ ├── class_1
│ │ │ ├── image_1_class_1_train.png
│ │ │ ├── image_2_class_1_train.png
│ │ │ └── ...
│ │ ├── class_2
│ │ │ ├── image_1_class_2_train.png
│ │ │ ├── image_2_class_2_train.png
│ │ │ └── ...
│ │ └── ...
│ └── val
│ ├── class_1
│ │ ├── image_1_class_1_val.png
│ │ ├── image_2_class_1_val.png
│ │ └── ...
│ ├── class_2
│ │ ├── image_1_class_2_val.png
│ │ ├── image_2_class_2_val.png
│ │ └── ...
│ └── ...
├── main.py # Main script to run the code of the component
├── Dockerfile # Docker code to build an image encapsulating the main.py code
└── requirements.txt            # Python dependencies required to run the main.py code
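The dataset layout above can be checked programmatically before training. Below is a minimal sketch (not part of the repository; the function name is hypothetical) that validates a `DATASET_NAME` folder against the `train`/`val` class-subfolder convention:

```python
from pathlib import Path

def validate_dataset(root: str) -> dict:
    """Check that `root` contains train/ and val/ splits with matching
    class subfolders, and count the .png images in each class."""
    root = Path(root)
    counts = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"Missing split folder: {split_dir}")
        classes = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
        counts[split] = {
            c: sum(1 for _ in (split_dir / c).glob("*.png")) for c in classes
        }
    # Both splits should expose the same set of classes
    if set(counts["train"]) != set(counts["val"]):
        raise ValueError("train/ and val/ class folders do not match")
    return counts
```

Running it on `input/DATASET_NAME` returns the per-class image counts, which you can compare against the `n_samples` entry described below.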
Currently, VAEs and GANs are supported. For a quick overview of what these models look like and how they are trained, check the following images:
- VAE: Currently available versions are the original VAE (paper), beta-VAE (paper) and disentangled beta-VAE (paper). Also, pixelwise and perceptual (inspired by this paper) losses can be chosen.
- GAN: The current implementation follows the approach introduced in Improved Training of Wasserstein GANs (paper).
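As a quick reference for the VAE variants above, the beta-VAE objective is the pixelwise reconstruction term plus a β-weighted KL divergence between the approximate posterior N(μ, σ²) and the unit Gaussian prior. The NumPy sketch below is only an illustration of that formula, not the repository's `loss.py`:

```python
import numpy as np

def beta_vae_loss(x, x_hat, mu, logvar, beta=1.0):
    """Pixelwise (MSE) reconstruction + beta * KL(N(mu, sigma^2) || N(0, I)).

    The KL term for a diagonal Gaussian has the closed form
    -0.5 * sum(1 + logvar - mu^2 - exp(logvar)).
    """
    recon = np.mean((x - x_hat) ** 2)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + beta * kl
```

Setting `beta=1.0` recovers the original VAE objective; larger values encourage disentangled latent factors at the cost of reconstruction quality.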
All the components make use of a specific dictionary that can be found in the `input` folder, named `info.json`. Its keys are the names of the datasets that are to be specified in each `main.py` script as the `data_flag` parameter, and the values are dictionaries containing relevant information. The necessary keys are:
├── img_size # Size to which images will be resized.
├── python_class # Name of the folder in which resized images will be stored.
├── url               # Link from which to download a `.zip` file containing a folder named as the previous parameter, structured as specified above.
├── label             # Dictionary containing numeric labels as keys and the class names from the folder structure as respective values.
├── n_channels # 1 if images are grayscale or 3 if images are RGB.
└── n_samples # Dictionary with train-val folders as keys, and number of images in each as values.
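Putting the keys above together, a hypothetical `info.json` entry might look like the dictionary below (the dataset name and all values are invented for illustration); the snippet also checks that every required key is present:

```python
import json

REQUIRED_KEYS = {"img_size", "python_class", "url",
                 "label", "n_channels", "n_samples"}

# Hypothetical entry; the top-level key acts as the `data_flag` parameter.
info = {
    "my_dataset": {
        "img_size": 224,
        "python_class": "MY_DATASET",
        "url": "https://example.com/MY_DATASET.zip",
        "label": {"0": "class_1", "1": "class_2"},
        "n_channels": 3,
        "n_samples": {"train": 1000, "val": 200},
    }
}

def check_entry(entry: dict) -> bool:
    """Verify that an info.json entry exposes every required key."""
    return REQUIRED_KEYS.issubset(entry)

print(json.dumps(info, indent=2))
```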
Plus, you have the option to use the Kaggle API to store your datasets. To do so, you should:
- Update the `kaggle_config.json` file with your own `KAGGLE_USERNAME` and `KAGGLE_KEY` parameters.
- Put your dataset, with the specified structure, in a zip file named as the `data_flag` parameter.
- Update the `info.json` parameters with all the information, assigning to `python_class` the name of the folder inside the zip (`DATASET_NAME`).
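One common pattern (a sketch assuming the `kaggle_config.json` layout above; the helper name is hypothetical) is to export the credentials as environment variables before invoking the Kaggle API, which reads `KAGGLE_USERNAME` and `KAGGLE_KEY` from the environment:

```python
import json
import os

def load_kaggle_credentials(path="input/kaggle_config.json"):
    """Read KAGGLE_USERNAME / KAGGLE_KEY from the config file and
    export them so the Kaggle API can authenticate."""
    with open(path) as f:
        creds = json.load(f)
    os.environ["KAGGLE_USERNAME"] = creds["KAGGLE_USERNAME"]
    os.environ["KAGGLE_KEY"] = creds["KAGGLE_KEY"]
    return creds
```

After exporting the credentials, the dataset zip can be fetched with the Kaggle CLI, e.g. `kaggle datasets download -d <your_username>/<data_flag>`.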
By specifying those parameters, you should be ready to follow the steps contained in the next section, where we will make use of the already prepared birds400 dataset (make sure you clone it to your Kaggle account before using it). If the data has already been downloaded, unzipped and stored in the `input` folder, you can ignore the Kaggle credentials.
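In practice, each component reads its settings from the `input` folder described earlier. A minimal sketch of such a loader (the function itself is hypothetical; only the file names come from the structure above) could be:

```python
import json
from pathlib import Path

def load_configs(input_dir="input"):
    """Load whichever component configuration files exist in input_dir."""
    input_dir = Path(input_dir)
    configs = {}
    for name in ("info", "model_config", "training_config"):
        path = input_dir / f"{name}.json"
        if path.exists():
            with open(path) as f:
                configs[name] = json.load(f)
    return configs
```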
These components have been integrated with Weights and Biases to track all metrics, hyperparameters, callbacks and GPU performance. You can check an example of `train_VAE` component monitoring here:
You can get started with this notebook, which lets you quickly get up to speed with your own data and customise parameters.
Released under MIT by @hedrergudene.