Issue with GPU training (Windows) #84

Open
gagan144 opened this issue Aug 6, 2018 · 0 comments

gagan144 commented Aug 6, 2018

Hi,
When I train on GPU, the loss suddenly shoots up at a certain step and never decreases afterwards. If I train on CPU only, training is perfectly fine and the loss decreases steadily. I have repeated this several times on both GPU and CPU, and surprisingly the loss on GPU always shoots up at the same point, around step 800.
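
To pin down the exact step where the loss jumps, a small helper like the one below could be run over the logged loss values. This is only a minimal sketch: the function name `first_spike_step`, the window size, and the ratio threshold are assumptions chosen for illustration and are not part of attention-ocr, and the loss list is synthetic data rather than the real training log.

```python
from statistics import mean


def first_spike_step(losses, window=20, ratio=2.0):
    """Return the index of the first loss that exceeds `ratio` times the
    mean of the preceding `window` losses, or None if no such spike exists.

    `window` and `ratio` are illustrative defaults, not tuned values.
    """
    for i in range(window, len(losses)):
        baseline = mean(losses[i - window:i])
        if baseline > 0 and losses[i] > ratio * baseline:
            return i
    return None


# Synthetic example only: a smoothly decaying loss that jumps to 3.4 at step 800.
losses = [1.0 / (1 + 0.01 * s) for s in range(800)] + [3.4] * 200
print(first_spike_step(losses))  # prints 800
```

Running the same helper on the CPU and GPU logs would show whether the jump really lands on the same step in every GPU run.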

My hardware and software specs are:
Windows 10 (64-bit)
GPU: Nvidia GTX 1050 - 4GB; also tried on Nvidia GTX 1060 with Max-Q design - 6GB
RAM: 16 GB
Processor: i7 7th Generation
TensorFlow: tensorflow-gpu==1.8.0

The loss graphs are as follows:
With GPU
[image: aocr_gpu]

With CPU
[image: acor_cpu]

What I have tried and observed:

  • Tried 5 times. Let it train for 24 hours on GPU; the loss never decreases after step 800 and keeps oscillating around 3.4.
  • I have trained the same setup on CPU for 3 days, around 64k steps, and I am able to see good results. Somehow it is just the GPU that doesn't work.
  • Normalized the training batches by reshuffling the data.
  • Reinstalled TensorFlow.
  • All other models, such as object detection (rcnn resnet, mobilenet, etc.), classification (inception-v3, etc.) and many more, run perfectly on GPU. I am facing this problem with attention-ocr only.
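
For a like-for-like comparison using the same `tensorflow-gpu` install, one option is to hide the GPU from the process through the `CUDA_VISIBLE_DEVICES` environment variable instead of switching packages. The sketch below only builds the environment; the helper name `env_for_device` is hypothetical, and the training command it would be passed to is left out because it depends on the local setup.

```python
import os


def env_for_device(use_gpu, base_env=None):
    """Return a copy of the environment that hides all CUDA devices
    when `use_gpu` is False. The helper name is hypothetical."""
    env = dict(os.environ if base_env is None else base_env)
    if not use_gpu:
        # "-1" matches no device, so CUDA (and therefore TensorFlow) sees no GPU.
        env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


cpu_env = env_for_device(use_gpu=False)
gpu_env = env_for_device(use_gpu=True)
# These dicts can then be passed as `env=` to subprocess.run(...) together
# with whatever training command is used locally.
```

The variable has to be set before TensorFlow is imported in the training process, which is why it is passed through the environment rather than set midway through a script.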

I suspect it has something to do with the OS. Is this code compatible only with Linux as far as GPU training is concerned?

Thanks!
