Fix issue with CodeCarbon lock #265
Conversation
@IlyasMoutawwakil We need to merge this fix in Optimum to make the CLI CUDA Torch-ORT Multi- and Single-GPU tests pass: huggingface/optimum#2028
For the CLI ROCm PyTorch Multi- and Single-GPU tests, the error is:
Not sure why the Intel Extension for PyTorch is installed for this test, as I don't think we need it. Should we explicitly uninstall it in the workflow @IlyasMoutawwakil?
The failing tests are from the ongoing changes in #263
Acquiring the lock and releasing it between processes will probably not work (or cause issues) in a multi-GPU/multi-process setting.
True, for multi-process this is needed. I wouldn't set
Tbh it would be much better if we had one tracker in distributed settings
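
A minimal sketch of what a single shared tracker could look like in a distributed run, assuming `torch.distributed` is used for the process group and codecarbon's public `EmissionsTracker` API; the helper functions here (`is_main_process`, `start_tracker`, `stop_tracker`) are hypothetical and not part of the repository:

```python
import torch.distributed as dist
from codecarbon import EmissionsTracker  # codecarbon's public tracker API


def is_main_process() -> bool:
    # Only rank 0 tracks energy, so a single codecarbon lock file is ever created.
    return not dist.is_available() or not dist.is_initialized() or dist.get_rank() == 0


def start_tracker(output_dir: str):
    if not is_main_process():
        return None
    tracker = EmissionsTracker(output_dir=output_dir, log_level="error")
    tracker.start()
    return tracker


def stop_tracker(tracker) -> None:
    # Stopping the tracker also releases the lock file introduced in codecarbon v2.7.
    if tracker is not None:
        tracker.stop()
```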
From CodeCarbon v2.7, a lock file is introduced (see here) to check whether another instance of codecarbon is running. If yes, an error is raised. One has to call `my_energy_tracker.stop()` to release the lock file.

This PR adds a `stop` method in the `EnergyTracker` class that should be called once the energy tracker is not needed anymore.

It also updates an import from `huggingface_hub`, since v0.25 introduced a change (the new import is backward-compatible).

Fixes #260.
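
For illustration, a minimal sketch of what such a `stop` method could look like, assuming `EnergyTracker` wraps codecarbon's `EmissionsTracker` internally; the attribute and parameter names here are hypothetical and not necessarily those used in the actual PR:

```python
from codecarbon import EmissionsTracker


class EnergyTracker:
    def __init__(self, output_dir: str = "."):
        # codecarbon >= 2.7 creates a lock file when the tracker starts;
        # a second concurrent instance raises an error until the lock is released.
        self._tracker = EmissionsTracker(output_dir=output_dir, log_level="error")

    def start(self) -> None:
        self._tracker.start()

    def stop(self) -> None:
        # Stopping the underlying tracker releases the codecarbon lock file,
        # so another benchmark (or another test) can start its own tracker.
        self._tracker.stop()
```

In practice, callers would pair `start()` with `stop()` (for example in a try/finally block) so the lock file is released even if the benchmark run fails.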