Unofficial implementation of SNGAN in PyTorch
Spectral Normalization for Generative Adversarial Networks
https://arxiv.org/abs/1802.05957
The experiments are mostly based on the paper.
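For context, spectral normalization constrains the Lipschitz constant of D by dividing each layer's weight by an estimate of its largest singular value, computed with one power-iteration step per forward pass. A minimal sketch using PyTorch's built-in `spectral_norm` utility (not this repo's own implementation):

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# spectral_norm wraps a layer and renormalizes its weight by the largest
# singular value, estimated with one power-iteration step per forward pass.
conv = spectral_norm(nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1))
fc = spectral_norm(nn.Linear(64 * 16 * 16, 1))
```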
- Generator: Standard CNN
- Discriminator: Standard CNN
- n_epochs: 321 (approx. 50k G updates)
- n_dis = 5 (D updates per G update; see the training-loop sketch below)
- Adam parameters: lr=0.0002, beta1=0.0, beta2=0.9
- Using the CIFAR-10 training set (10 classes, 50k images)
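As a rough sketch of how n_dis and the Adam settings fit together, here is a minimal training loop; the tiny linear models and random tensors are placeholders for the real Standard CNNs and the CIFAR-10 loader:

```python
import torch
import torch.nn as nn

# Placeholder G and D just to make the loop runnable; the real models
# are the Standard CNNs described above, with spectral norm in D.
netG = nn.Sequential(nn.Linear(128, 3 * 32 * 32), nn.Tanh())
netD = nn.Sequential(nn.utils.spectral_norm(nn.Linear(3 * 32 * 32, 1)))

# Adam settings from the list above: lr=0.0002, beta1=0.0, beta2=0.9.
optG = torch.optim.Adam(netG.parameters(), lr=2e-4, betas=(0.0, 0.9))
optD = torch.optim.Adam(netD.parameters(), lr=2e-4, betas=(0.0, 0.9))

n_dis, batch_size, z_dim = 5, 64, 128
bce = nn.BCEWithLogitsLoss()  # the "Cross Entropy" loss in the tables

for step in range(10):  # each outer step is one G update
    for _ in range(n_dis):  # D is updated n_dis times per G update
        real = torch.rand(batch_size, 3 * 32 * 32) * 2 - 1  # stand-in batch
        fake = netG(torch.randn(batch_size, z_dim)).detach()
        d_loss = (bce(netD(real), torch.ones(batch_size, 1))
                  + bce(netD(fake), torch.zeros(batch_size, 1)))
        optD.zero_grad()
        d_loss.backward()
        optD.step()
    g_loss = bce(netD(netG(torch.randn(batch_size, z_dim))),
                 torch.ones(batch_size, 1))
    optG.zero_grad()
    g_loss.backward()
    optG.step()
```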
Case | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Loss | Cross Entropy | Hinge | Cross Entropy | Hinge |
Conditional | FALSE | FALSE | TRUE | TRUE |
Inception Score | 5.844 | 6.077 | 6.094 | 5.821 |
Generated samples: Case 0 (IS = 5.844), Case 1 (IS = 6.077), Case 2 (IS = 6.094, best), Case 3 (IS = 5.821).
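For reference, the two losses compared above can be written as follows. This is a minimal sketch where `d_real` and `d_fake` are D's raw logits on real and generated batches:

```python
import torch
import torch.nn.functional as F

def d_loss_hinge(d_real, d_fake):
    # Hinge loss for D: E[max(0, 1 - D(x))] + E[max(0, 1 + D(G(z)))]
    return F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()

def g_loss_hinge(d_fake):
    # Hinge loss for G: -E[D(G(z))]
    return -d_fake.mean()

def d_loss_ce(d_real, d_fake):
    # Cross-entropy (standard GAN) loss for D, on raw logits
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
            + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def g_loss_ce(d_fake):
    # Non-saturating cross-entropy loss for G
    return F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
```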
- Generator: ResNet (32x32)
- Discriminator: ResNet (32x32)
- n_epochs: 321 (approx. 50k G updates)
- n_dis = 5
- Adam parameters: lr=0.0002, beta1=0.5, beta2=0.9
- Using the CIFAR-10 training set (10 classes, 50k images)
Note: beta1=0.0 failed to converge (training seems to get stuck at saddle points).
Case | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Loss | Cross Entropy | Hinge | Cross Entropy | Hinge |
Conditional | FALSE | FALSE | TRUE | TRUE |
Inception Score | 3.916 | 5.962 | 3.908 | 5.900 |
With the ResNet architecture, hinge loss performs far better than cross entropy.
Generated samples: Case 0 (IS = 3.916), Case 1 (IS = 5.962, best), Case 2 (IS = 3.908), Case 3 (IS = 5.900).
- Generator: Standard CNN (48x48)
- Discriminator: Standard CNN (48x48)
- n_epochs: 1301 (approx. 53k G updates)
- n_dis = 5
- Hinge loss
- Adam parameters: lr=0.0002, beta2=0.9 (beta1 varies by case; see table)
- Using the STL-10 training + test sets (10 classes, 13k images: 5k train + 8k test)
Case | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Conditional | FALSE | FALSE | TRUE | TRUE |
Beta1 | 0.5 | 0 | 0.5 | 0 |
IS | 5.932 | 6.058 | 6.157 | 5.813 |
Training GANs on STL-10 is a bit harder than on CIFAR-10 (it is too sensitive to hyperparameters).
Generated samples: Case 0 (IS = 5.932), Case 1 (IS = 6.058), Case 2 (IS = 6.157, best), Case 3 (IS = 5.813).
- Generator: ResNet (48x48)
- Discriminator: ResNet (48x48)
- n_epochs: 1301 if n_dis=5, 261 if n_dis=1 (approx. 53k G updates)
- Hinge loss
- Adam parameters: lr=0.0002, beta1=0.5, beta2=0.9
- Using the STL-10 training + test sets (10 classes, 13k images: 5k train + 8k test)
Case | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
Conditional | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE |
N dis update | 5 | 5 | 1 | 1 | 5 | 5 | 1 | 1 |
D last ch | 512 | 512 | 512 | 512 | 1024 | 1024 | 1024 | 1024 |
IS | 5.795 | 7.136 | 3.421 | 3.883 | 6.538 | 7.314 | 3.444 | 3.653 |
Training with n_dis = 1 is very difficult; some additional experiments are described in the appendix.
Generated samples: Case 0 (IS = 5.795), Case 1 (IS = 7.136, 2nd best), Case 2 (IS = 3.421), Case 3 (IS = 3.883), Case 4 (IS = 6.538), Case 5 (IS = 7.314, best), Case 6 (IS = 3.444), Case 7 (IS = 3.653).
Experiments with datasets that are not covered in the paper.
http://www.nurs.or.jp/~nagadomi/animeface-character-dataset/
14,490 images in 176 classes, i.e. far fewer images per class than CIFAR-10 or STL-10.
- Generator: ResNet (96x96)
- Discriminator: ResNet (96x96)
- n_epochs: 1301 (approx. 53k G updates)
- Hinge loss, n_dis = 5
- Adam parameters: lr=0.0002, beta1=0.5, beta2=0.9
The network architecture is based on the paper's 128x128 model, with the resolution changed to 96x96 (initial dense layer: 4x4x1024 -> 3x3x1024).
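A sketch of what that dense-layer change amounts to (variable names hypothetical): projecting z to a 3x3x1024 map instead of 4x4x1024 means the same five 2x up-sampling blocks end at 96x96 instead of 128x128.

```python
import torch
import torch.nn as nn

z_dim, ch = 128, 1024
# 128x128 model: project z to a 4x4x1024 map (4 -> 8 -> 16 -> 32 -> 64 -> 128)
dense_128 = nn.Linear(z_dim, 4 * 4 * ch)
# 96x96 variant used here: project to 3x3x1024 (3 -> 6 -> 12 -> 24 -> 48 -> 96)
dense_96 = nn.Linear(z_dim, 3 * 3 * ch)

z = torch.randn(8, z_dim)
h = dense_96(z).view(-1, ch, 3, 3)  # NCHW map fed to the up-sampling blocks
```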
- Case 0 = unconditional
- Case 1 = conditional (one common conditioning mechanism is sketched below)
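The README does not spell out how conditioning is implemented; one common choice for SNGAN-style conditional generators is class-conditional batch normalization, sketched hypothetically below (this is not necessarily this repo's mechanism):

```python
import torch
import torch.nn as nn

class ConditionalBatchNorm2d(nn.Module):
    """BatchNorm2d whose per-channel gain and bias are looked up from the
    class label, so each class gets its own affine transform."""
    def __init__(self, num_features, num_classes):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.gain = nn.Embedding(num_classes, num_features)
        self.bias = nn.Embedding(num_classes, num_features)
        nn.init.ones_(self.gain.weight)
        nn.init.zeros_(self.bias.weight)

    def forward(self, x, y):
        h = self.bn(x)
        g = self.gain(y).view(-1, h.size(1), 1, 1)
        b = self.bias(y).view(-1, h.size(1), 1, 1)
        return g * h + b

# Example: 176 animeface classes, a 256-channel feature map.
cbn = ConditionalBatchNorm2d(256, num_classes=176)
out = cbn(torch.randn(8, 256, 12, 12), torch.randint(0, 176, (8,)))
```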
Inception score was not measured, because the domain is so different that the pre-trained Inception network is not meaningful here.
Sampling and interpolation use the final-epoch model.
http://www.robots.ox.ac.uk/~vgg/data/flowers/
8,189 images in 102 classes, i.e. far fewer images per class than CIFAR-10 or STL-10.
- Case 0 = unconditional
- Case 1 = conditional
Sampling and interpolation use the final-epoch model.
Generating flowers still seems difficult (the outlines are too complex).
The goal here is to train with n_dis = 1, but there are many negative results.
- Generator: Post-act ResNet (48x48)
- Discriminator: Standard CNN (48x48)
- Generator learning rate: 0.0002
For the generator, the ResNet implementation in the paper uses pre-activation blocks (BN/SN -> ReLU -> Conv); here this is changed to post-activation (Conv -> BN/SN -> ReLU), as sketched below.
For the discriminator, a Standard CNN is used simply to reduce computation.
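A sketch of the two orderings for a single generator conv (residual connections and up-sampling omitted; G uses BN, while SN takes BN's place in D):

```python
import torch.nn as nn

ch = 256

# Pre-activation ordering, as in the paper's G blocks: BN -> ReLU -> Conv
preact = nn.Sequential(
    nn.BatchNorm2d(ch),
    nn.ReLU(),
    nn.Conv2d(ch, ch, kernel_size=3, padding=1),
)

# Post-activation ordering used in these appendix experiments:
# Conv -> BN -> ReLU
postact = nn.Sequential(
    nn.Conv2d(ch, ch, kernel_size=3, padding=1),
    nn.BatchNorm2d(ch),
    nn.ReLU(),
)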
Case | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
N dis | 5 | 5 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Beta2 in Adam | 0.9 | 0.9 | 0.9 | 0.9 | 0.999 | 0.999 | 0.999 | 0.999 | 0.999 | 0.999 |
Leaky ReLU slope in D | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.2 | 0.2 | 0.2 | 0.2 |
D learning rate | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.001 | 0.001 |
Conditional | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE |
Inception Score | 6.419 | 5.663 | 1.285 | 2.634 | 2.342 | 2.447 | 2.499 | 2.722 | 2.355 | 2.544 |
- Cases 8, 9: imbalanced learning rates (G: 0.0002, D: 0.001)
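A minimal sketch of that imbalanced-rate setup (placeholder models; beta1=0.5 is an assumption here, since the table above only fixes beta2):

```python
import torch
import torch.nn as nn

# Placeholder models; the real ones are the post-act ResNet G and the
# Standard CNN D from this appendix section.
netG = nn.Linear(128, 3 * 48 * 48)
netD = nn.Linear(3 * 48 * 48, 1)

# Cases 8, 9: G keeps lr=0.0002 while D's lr is raised to 0.001
# (beta2=0.999 per the table; beta1=0.5 is assumed).
optG = torch.optim.Adam(netG.parameters(), lr=0.0002, betas=(0.5, 0.999))
optD = torch.optim.Adam(netD.parameters(), lr=0.001, betas=(0.5, 0.999))
```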
Case | Condition | IS |
---|---|---|
0 | unconditional, n_dis = 5 | 6.419 |
2 | unconditional, n_dis = 1 | 1.285 |
4 | + beta2 = 0.999 | 2.342 |
6 | + lrelu slope = 0.2 | 2.499 |
8 | + lr_d = 0.001 | 2.355 |
Case | Condition | IS |
---|---|---|
1 | conditional, n_dis = 5 | 5.663 |
3 | conditional, n_dis = 1 | 2.634 |
5 | + beta2 = 0.999 | 2.447 |
7 | + lrelu slope = 0.2 | 2.722 |
9 | + lr_d = 0.001 | 2.544 |
Training with n_dis = 1 does not work well, and the other tweaks make only marginal differences.
Only a subset of the results is shown here; for the rest, please refer to the repository folder.
Generated samples: Case 0 (IS = 6.419), Case 1 (IS = 5.663), Case 6 (IS = 2.499), Case 7 (IS = 2.722).
These are the loss plots for cases 0 and 6 (both unconditional). Case 0 (n_dis=5) works well, but case 6 (n_dis=1) does not.
As the plots show, the D loss in case 6 hardly decreases: when training fails like this, G overpowers D and nothing useful is learned. This is a consequence of applying spectral norm to D, which constrains it.
Therefore n_dis=5, which gives D more updates per G update, works much more reliably.
- Generator: Post-act ResNet (48x48)
- Discriminator: Post-act ResNet (48x48, initial ch changeable)
- Generator learning rate: 0.0002
- n_epochs: 1301 if n_dis=5, 261 if n_dis=1 (approx. 53k G updates)
- Adam parameters: beta1=0.5, beta2=0.999
Case | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
D initial ch | 16 | 16 | 32 | 32 | 64 | 64 |
N dis | 5 | 1 | 5 | 1 | 5 | 1 |
Inception Score | 4.621 | 3.004 | 4.789 | 4.365 | 4.990 | 4.341 |
Generated samples: Case 0 (IS = 4.621), Case 2 (IS = 4.789), Case 4 (IS = 4.990).
- Generator: ResNet from the paper (48x48)
- Generator learning rate: 0.0002
- n_dis: 5
- n_epochs: 1301
- All cases conditional
- Adam parameters: lr=0.0002, beta1=0.5 (beta2 varies by case; see table)
Case | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Beta2 in Adam | 0.9 | 0.999 | 0.9 | 0.999 |
D architecture | postact | postact | strided resnet | strided resnet |
Inception Score | 5.955 | 5.708 | 6.906 | 7.222 |
The strided ResNet is the paper's original ResNet D with a strided conv at the beginning, which reduces computation. A rough sketch follows.
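A hypothetical sketch of such an input block under that description (the exact block layout is an assumption; spectral norm is applied to every D layer as in the paper):

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class StridedDInput(nn.Module):
    """Hypothetical first D block: downsample immediately with a strided
    conv so all later layers run at half resolution, cutting computation."""
    def __init__(self, in_ch=3, out_ch=64):
        super().__init__()
        self.conv1 = spectral_norm(nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1))
        self.conv2 = spectral_norm(nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1))
        self.shortcut = spectral_norm(nn.Conv2d(in_ch, out_ch, 1, stride=2))
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.conv2(self.act(self.conv1(x)))
        return h + self.shortcut(x)  # residual connection, also downsampled

x = torch.randn(8, 3, 48, 48)
print(StridedDInput()(x).shape)  # torch.Size([8, 64, 24, 24])
```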
Generated samples: Case 0 (IS = 5.955), Case 1 (IS = 5.708), Case 2 (IS = 6.906), Case 3 (IS = 7.222, best).
A larger D model improves IS slightly, but in any case I could not find a way to generate clean samples with n_dis = 1.
In image classification the difference between pre-act and post-act blocks is slight, but in GANs it seems to be quite large.