GPU resources not being freed during long pytest suite #10296
-
Yeah, the objects returned by fixtures are cached by pytest for as long as they are needed.
pytest does not interact explicitly with PyTorch, so my guess is that the objects are being kept alive by long-lived fixtures (session/module/class scope) or by your own code.
-
Strange, because the fixture is cleaned up when the last test in a module executes (regardless of whether that test uses the fixture). Example:

    # content of test_1.py
    import pytest

    @pytest.fixture(scope="module")
    def fix():
        print("fix setup")
        yield
        print("fix teardown")

    def test_a(fix):
        pass

    def test_b():
        pass

    # content of test_2.py
    def test_c():
        pass

    def test_d():
        pass
I'm not familiar with your code, but don't you have to explicitly call a
As far as I can tell, if pytest is correctly calling the "teardown" portion of the fixture, then pytest is doing the right thing; the fixture should clean up after itself at that point. Sorry I can't be of more help than that. 😕
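As a rough illustration of a fixture cleaning up after itself, the helper below could be called from the teardown part of a fixture (the code after `yield`). It is a sketch under assumptions, not a confirmed fix for this suite: `release_cuda_memory` is a made-up name, and it relies on `torch.nn.Module.cpu()` moving parameters off the GPU in place and on `torch.cuda.empty_cache()` returning unused cached blocks. The torch import is guarded so the helper is harmless on machines without PyTorch or CUDA.

```python
import gc
import importlib.util


def release_cuda_memory(model=None):
    """Best-effort GPU cleanup; returns True only if the CUDA cache was emptied.

    Hypothetical helper: the name and behaviour are illustrative.
    """
    if importlib.util.find_spec("torch") is None:
        gc.collect()
        return False

    import torch

    # Moving the module to the CPU swaps its parameters in place, so the GPU
    # copies can be released even if something still references the module.
    if isinstance(model, torch.nn.Module):
        model.cpu()

    # Collect unreachable Python objects before asking the allocator to shrink.
    gc.collect()

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False


# Hypothetical use in a module-scoped fixture (build_model is a placeholder):
#
# @pytest.fixture(scope="module")
# def model():
#     m = build_model()
#     yield m
#     release_cuda_memory(m)
```

Moving the model to the CPU rather than only deleting the local name is a deliberate choice here: a `del` inside the fixture removes just one reference, and anything else still pointing at the model would keep its GPU tensors allocated.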
-
@AngledLuffa I think this discussion might answer your question of why the models persist on the GPU despite the scope and garbage collection: #10387
-
Thanks for following up! I will give that a try next time we run out of GPU memory. We upgraded our test machine for unrelated reasons, and that was after I had condensed the tests enough to fit on the previous GPU, so this might not be an issue until the tests get bigger again.
-
Again, thanks. I ran into this when adding a new feature & a test for that feature, and
-
I have a pytest suite which takes roughly 10 minutes and allocates many objects on the GPU using PyTorch. In some cases they are apparently freed after a test finishes, but in others the objects persist. The result is that by the end of the test suite, so many PyTorch objects are still on the GPU that even simple operations run out of memory.
Is there a good solution for this? I was able to reuse some objects by turning them into fixtures, saving enough memory to get the test suite to run, but even when the objects are module- or class-scoped fixtures they quite often seem to stick around on the GPU.
I also tried
but that didn't free anything as far as I could tell.
Is this an interaction between pytorch and pytest, or are the objects themselves not being deleted?
For reference, the test suite is here
Thanks in advance
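To tell those two cases apart, it may help to record GPU memory before and after a test or module. The sketch below is an assumption-laden helper, not part of the suite: `cuda_memory_snapshot` and `describe` are made-up names, and the only PyTorch calls used are `torch.cuda.is_available()`, `torch.cuda.memory_allocated()` and `torch.cuda.memory_reserved()`. Roughly, "allocated" counts memory held by live tensors, while "reserved" also includes blocks PyTorch's caching allocator keeps for reuse.

```python
import importlib.util


def cuda_memory_snapshot():
    """Return allocated/reserved CUDA bytes, or None when CUDA is unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None

    import torch

    if not torch.cuda.is_available():
        return None
    return {
        "allocated": torch.cuda.memory_allocated(),
        "reserved": torch.cuda.memory_reserved(),
    }


def describe(before, after):
    """Summarise how allocated memory changed between two snapshots."""
    if before is None or after is None:
        return "CUDA not available"
    delta = after["allocated"] - before["allocated"]
    return f"allocated changed by {delta} bytes"
```

If "allocated" stays high after a fixture is torn down, some tensors are probably still referenced; if only "reserved" stays high, the memory is likely just cached by the allocator.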