Training SegFormer model not working (goes through notebook, but model loss becomes nan) on dataset I created (stuck for a week or so) #459
Comments
Try lowering the learning rate.
I think I figured it out. The labels file I used had pixel value 255 for contrails, pixel value 1 for another ("filler") class, and pixel value 0 for unlabeled. I think contrails needed to be pixel value 2 instead, so that the label values follow the contiguous pattern "0 1 2 3 ...". So this is sort of "closed," but it is a very dumb issue. Is there any way to guard against it in the future? It shouldn't take long to change some bits of code, especially since I was stuck on this for a week and a half.
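For anyone hitting the same thing, here is a minimal sketch of how the masks could be checked and remapped before training. It assumes the masks can be loaded as NumPy arrays; the helper names `remap_labels` and `check_contiguous` are made up for illustration and are not part of the notebook or of any library. Note also that 255 is commonly used as an "ignore" value in segmentation losses (I believe the SegFormer config in transformers defaults to it, but check your version), so a class stored as 255 may be skipped silently.

```python
import numpy as np

def remap_labels(mask, mapping):
    """Return a copy of `mask` with pixel values replaced per `mapping` (old -> new)."""
    mask = np.asarray(mask)
    out = mask.copy()
    for old, new in mapping.items():
        # Compare against the original mask so remaps cannot chain into each other.
        out[mask == old] = new
    return out

def check_contiguous(mask, num_labels):
    """Raise ValueError if `mask` contains ids outside 0..num_labels-1."""
    values = np.unique(mask)
    bad = values[(values < 0) | (values >= num_labels)]
    if bad.size:
        raise ValueError(f"unexpected label ids {bad.tolist()} for {num_labels} classes")

# Toy mask using the layout described above: 0 = unlabeled, 1 = filler, 255 = contrails.
mask = np.array([[0, 1, 255], [255, 1, 0]], dtype=np.uint8)

fixed = remap_labels(mask, {255: 2})  # move contrails to id 2
check_contiguous(fixed, num_labels=3)  # passes; the original mask would raise
```

Running a check like `check_contiguous` over every mask before training would turn this kind of label mismatch into an immediate error instead of a nan loss.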
When I train a SegFormer model with this notebook and change the variable ds to one of the contrails datasets I have been uploading to Hugging Face, such as this one, the model's loss turns to nan (and it sometimes seems to crash after the first epoch of training).
This does not occur when training on segments.ai's sidewalk dataset. It may have something to do with differences in my segmentation bitmaps, or with the duckdb files (they seem to be formatted differently for the sidewalk dataset than for my contrails dataset).
Why does this occur?
(I obtained the contrails images from this competition's dataset.)