-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor ArtifactManager and restart TaskManager #529
Conversation
6fd26a1
to
0fc4e4a
Compare
@@ -297,6 +296,32 @@ def create_app() -> FastAPI: | |||
return app | |||
|
|||
|
|||
def load_dataset_split_managers_from_config( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed to move this function because of circular import issues.
One big change that I made here is to use an ArtifactManager
in the main thread. I was afraid that multiple dms were created otherwise.
dm: DatasetSplitManager, | ||
results: List[List[PerturbedUtteranceResult]], | ||
pipeline_index: int, | ||
config: AzimuthConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was causing an error in a test. We shouldn't rely on the config of the dataset split manager, because it only depends on the project_hash
.
@@ -63,29 +63,6 @@ def get_module_data(simple_text_config): | |||
) | |||
|
|||
|
|||
def test_clearing_cache(tiny_text_config): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we need a new test to replace this one - but it didn't look relevant anymore. Replacing task_manager.clear_worker_cache()
by task_manager.restart()
would work, but then the test takes a lot of time (~50 seconds), and I'm not sure it is testing something relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand the assertions any([m and d for m, d in cached.values()])
. I leave it up to you! Saving 50 seconds does sound tempting if it's useless.
4402ef6
to
5f81471
Compare
0fc4e4a
to
a3f8412
Compare
a3f8412
to
228f0ac
Compare
a93803f
to
4d33219
Compare
228f0ac
to
4095a73
Compare
4d33219
to
69b7949
Compare
4095a73
to
3ecfa72
Compare
69b7949
to
99178a8
Compare
3ecfa72
to
c5fbc3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I don't understand all the details, but I guess we'll confirm that everything works by using it!
* Add logging to help with config update debugging * Restart client to free memory * Refactor artifact manager to real singleton and remove clear_cache * Adapt based on comments
Fixes #521
Description:
ArtifactManager
is now a real Singleton (1 per process). 3 are created -> 1 per worker and 1 in the main thread.Checklist:
You should check all boxes before the PR is ready. If a box does not apply, check it to acknowledge it.
ran
pre-commit run --all-files
at the end.our users.
README
files and our wiki for any big design decisions, if relevant.