
fully connected initialization #28

Open
Andreyisakov opened this issue Dec 7, 2016 · 5 comments

Comments

@Andreyisakov

Can you please explain why the fully connected layers' weights are not initialized with MSRinit? How are they initialized?

@Cadene

Cadene commented Dec 7, 2016

which file? which line? :)

@Andreyisakov
Author

Andreyisakov commented Dec 7, 2016 via email

@Cadene

Cadene commented Dec 7, 2016

FCinit and MSRinit applied to WideResNet
FCinit and MSRinit code

I guess it is just a matter of hyperparameter tuning. Maybe the author could illuminate our thinking :p
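
For context, here is a minimal sketch of what the linked MSRinit and FCinit functions do (reconstructed from the repo's helper code from memory, so treat it as an approximation rather than the exact source): MSRinit applies the He/Kaiming Gaussian rule to convolutional weights, while FCinit only zeroes the biases of nn.Linear layers and leaves their weights at the Torch default.

```lua
require 'nn'

-- MSRinit: He/Kaiming initialization for conv layers,
-- weight ~ N(0, sqrt(2/n)) with n = kW * kH * nOutputPlane.
local function MSRinit(model)
   for _, v in pairs(model:findModules('nn.SpatialConvolution')) do
      local n = v.kW * v.kH * v.nOutputPlane
      v.weight:normal(0, math.sqrt(2 / n))
      if v.bias then v.bias:zero() end
   end
   return model
end

-- FCinit: only zeroes the biases of fully connected layers;
-- the weights keep nn.Linear's default initialization.
local function FCinit(model)
   for _, v in pairs(model:findModules('nn.Linear')) do
      v.bias:zero()
   end
   return model
end
```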

@szagoruyko
Owner

@Andreyisakov FC layers are initialized with Xavier; it doesn't affect the final accuracy. https://github.com/torch/nn/blob/master/Linear.lua#L25
@Cadene thanks, Remi
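
For reference, the default nn.Linear reset at the line linked above is, roughly, the uniform fan-in rule sketched below (simplified from memory; the real code also handles an explicit stdv argument and a legacy seeding path):

```lua
-- Simplified sketch of nn.Linear's default reset in torch/nn:
-- weights and biases drawn uniformly with a 1/sqrt(fan_in) scale.
function Linear:reset(stdv)
   stdv = stdv and stdv * math.sqrt(3)
               or 1 / math.sqrt(self.weight:size(2))  -- fan_in = inputSize
   self.weight:uniform(-stdv, stdv)
   if self.bias then self.bias:uniform(-stdv, stdv) end
   return self
end
```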

@dlmacedo
Copy link

dlmacedo commented Jan 3, 2017

Today, I think we are using Xavier initialization with a uniform distribution (the Torch default) for fully connected layers and Kaiming initialization with a Gaussian distribution (the MSRinit function) for convolutional layers.

I don't see why we shouldn't use the same Kaiming initialization for both convolutional and fully connected layers, if only for uniformity of treatment.

The following paper shows that Kaiming initialization is supposed to be better than Xavier initialization, at least for convolutional layers.

https://arxiv.org/pdf/1502.01852v1.pdf
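
A sketch of what that suggestion could look like, extending the same Kaiming Gaussian rule to nn.Linear (the function name KaimingFCinit is hypothetical, not from the repo):

```lua
-- Hypothetical KaimingFCinit: apply the He/Kaiming Gaussian rule to
-- fully connected layers, with n = fan_in (nn.Linear weights are
-- nOutput x nInput in torch/nn, so fan_in is the second dimension).
local function KaimingFCinit(model)
   for _, v in pairs(model:findModules('nn.Linear')) do
      local n = v.weight:size(2)  -- fan_in
      v.weight:normal(0, math.sqrt(2 / n))
      if v.bias then v.bias:zero() end
   end
   return model
end
```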
