I use an RTX 3090, so the GPU memory is 24 GB. Training speed on GPUs comes from parallelism across the batch, so training with a very small batch size is likely to be extremely slow. If you are training on a low-memory GPU, try gradient checkpointing. For more information on what checkpointing does (not to be confused with snapshots, which store training progress), see for example https://pytorch.org/docs/stable/checkpoint.html.
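Since the reply points at PyTorch's activation checkpointing docs, here is a minimal sketch of how `torch.utils.checkpoint` can wrap blocks of a model so their activations are recomputed during the backward pass instead of being stored; the `CheckpointedMLP` name, layer sizes, and depth are illustrative placeholders, not the model from this discussion.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedMLP(nn.Module):
    """Illustrative model: each block is checkpointed, trading extra
    compute in backward for lower activation memory in forward."""

    def __init__(self, dim=1024, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
            for _ in range(depth)
        )

    def forward(self, x):
        for block in self.blocks:
            # Activations inside `block` are not kept; they are
            # recomputed when backward reaches this block.
            x = checkpoint(block, x, use_reentrant=False)
        return x


model = CheckpointedMLP().cuda()
x = torch.randn(4, 1024, device="cuda", requires_grad=True)  # small batch
loss = model(x).sum()
loss.backward()  # checkpointed blocks re-run their forward here
```

The trade-off is roughly one extra forward pass of compute per checkpointed block, in exchange for not holding that block's intermediate activations in GPU memory.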
This model is huge and takes a lot of GPU memory, so the batch size can only be set to a small number. May I ask which GPU you use?