This happens in the multi-box loss: the sort method fails because its input contains NaN values. This may be a bug in log_softmax: pytorch/pytorch#14335. Three ways to solve it:
- Use a smaller warmup factor, such as 0.1 (append
SOLVER.WARMUP_FACTOR 0.1
to the end of your training command).
- Use more warmup iterations, such as 1000 (append
SOLVER.WARMUP_ITERS 1000
to the end of your training command).
- Described in the forums by Jinserk Baik
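
To see why the first two options help, here is a rough sketch of how a warmup factor and warmup length shape the learning rate early in training. It assumes a linear warmup schedule, which may not match the scheduler this project actually uses; the function name `warmup_lr` and its parameters are made up for illustration.

```python
def warmup_lr(base_lr, iteration, warmup_factor, warmup_iters):
    """Learning rate under an assumed linear warmup (illustrative only).

    The rate starts at base_lr * warmup_factor and rises linearly to
    base_lr over warmup_iters iterations.
    """
    if iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters  # progress through warmup, 0.0 to 1.0
    return base_lr * (warmup_factor * (1 - alpha) + alpha)


if __name__ == "__main__":
    base_lr = 1e-3  # hypothetical base learning rate
    for it in (0, 250, 500, 1000):
        lr = warmup_lr(base_lr, it, warmup_factor=0.1, warmup_iters=1000)
        print(f"iter {it:4d}: lr = {lr:.6f}")
```

Under this assumption, a smaller factor lowers the starting rate and a larger iteration count makes the ramp more gradual. Both keep the early updates small, which is presumably why they avoid the NaN values in the loss.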