There seems to be an error when I call backward on a loss produced by the SPOPlus module when the loss was computed from a batched input. Take, for instance, the test case defined below, where I call the SPOPlus module with a batched input. The initial forward pass works correctly, but calling backward raises the following error:
ERROR: test_batched_spoplus (__main__.TestLoss)
Traceback (most recent call last):
File "<REST_OF_PATH>/tests/test_epo_spo_pluss.py", line 36, in test_batched_spoplus
loss.backward() # Here it errors
File "<REST_OF_PATH>/venv/lib/python3.10/site-packages/torch/_tensor.py", line 525, in backward
torch.autograd.backward(
File "<REST_OF_PATH>/venv/lib/python3.10/site-packages/torch/autograd/__init__.py", line 267, in backward
_engine_run_backward(
File "<REST_OF_PATH>/venv/lib/python3.10/site-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "<REST_OF_PATH>/venv/lib/python3.10/site-packages/torch/autograd/function.py", line 301, in apply
return user_fn(self, *args)
File "<REST_OF_PATH>/venv/lib/python3.10/site-packages/pyepo/func/spoplus.py", line 120, in backward
return grad_output * grad, None, None, None, None
RuntimeError: The size of tensor a (2) must match the size of tensor b (4) at non-singleton dimension 1
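For context, the shape mismatch in the traceback can be reproduced with plain PyTorch broadcasting, independent of PyEPO. This is a minimal sketch; the shapes (2,) and (2, 4) are assumptions reconstructed from the error message, not taken from PyEPO's code:

```python
import torch

# Assumed shapes, reconstructed from the error message:
# grad_output would come from a per-sample loss of shape (batch_size,) = (2,)
# grad would be the saved cost gradient of shape (batch_size, num_costs) = (2, 4)
grad_output = torch.ones(2)
grad = torch.ones(2, 4)

try:
    _ = grad_output * grad  # (2,) vs (2, 4): mismatch at dimension 1 (2 vs 4)
except RuntimeError as e:
    print(e)  # same message as in the traceback above

# With a trailing singleton dimension, broadcasting succeeds:
out = grad_output.unsqueeze(1) * grad  # (2, 1) * (2, 4) -> (2, 4)
print(out.shape)  # torch.Size([2, 4])
```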
Thank you so much for your detailed feedback and for providing the potential fix. I’m currently looking into the issue and will continue testing it to ensure everything works correctly. In our tests, such as the one in this notebook, we haven't encountered this error. You might want to take a look at it as well to see if there are any differences in the setup that could help.
However, I’ll make sure to review your case more thoroughly and update accordingly.
Sorry for the late reply, and thank you for the response.
After looking at the notebook you shared and digging further into my own code, I now see why it goes wrong.
It seems the SPO+ function takes a differently sized input tensor than I assumed.
Namely, I assumed the true_obj tensor should be 1-D with shape (batch_size), which is also what I did in the test above. However, the SPO+ implementation requires the true_obj tensor to have shape (batch_size x 1).
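If that shape requirement is indeed the expected behavior, the caller-side fix is just a reshape before the loss call. A minimal sketch, where `true_obj` is a stand-in tensor rather than the test's actual data:

```python
import torch

batch_size = 2
true_obj = torch.randn(batch_size)  # 1-D, shape (batch_size,) -- what the test passed
true_obj = true_obj.reshape(-1, 1)  # 2-D, shape (batch_size, 1) -- what SPO+ expects
print(true_obj.shape)  # torch.Size([2, 1])
```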
If you feel that this is expected behavior feel free to close this issue and the pull request.
Hi PyEPO team,
I was able to fix this bug myself, and I think it is related to an incorrect shape in the backward pass for SPO+.
To fix this bug you could change the backward function (PyEPO/pkg/pyepo/func/spoplus.py, lines 110 to 120 at 8cf389e) to the following:
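One hypothetical way such a fix could look (a sketch, not PyEPO's actual code: the toy Function below only mirrors the backward signature seen in the traceback, with a placeholder fifth argument and dummy forward logic; the core idea is unsqueezing a 1-D incoming gradient before the elementwise product):

```python
import torch

class SPOPlusLike(torch.autograd.Function):
    """Toy stand-in mirroring the backward signature from the traceback."""

    @staticmethod
    def forward(ctx, pred_cost, true_cost, true_sol, true_obj, module):
        # Dummy loss; the saved tensor mimics a gradient of shape (batch, num_costs).
        grad = torch.ones_like(pred_cost)
        ctx.save_for_backward(grad)
        return pred_cost.sum(dim=1)  # per-sample loss, shape (batch,)

    @staticmethod
    def backward(ctx, grad_output):
        grad, = ctx.saved_tensors
        if grad_output.dim() == 1:            # (batch,) -> (batch, 1)
            grad_output = grad_output.unsqueeze(1)
        # (batch, 1) * (batch, num_costs) now broadcasts cleanly
        return grad_output * grad, None, None, None, None

pred = torch.randn(2, 4, requires_grad=True)
loss = SPOPlusLike.apply(pred, None, None, None, None)
loss.sum().backward()        # no shape error with the 1-D grad_output path
print(pred.grad.shape)       # torch.Size([2, 4])
```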
Kind regards,
Jop