Passthrough Pipeline cache_dir param to underlying pipelines and get_model() #1722

Open: benniekiss (Contributor) wants to merge 17 commits into base: develop

The Pipeline class takes a cache_dir parameter that sets where the pipeline's config.yaml is downloaded. The models used by the pipeline, however, are downloaded elsewhere: either to the default huggingface cache location or to the location set in the PYANNOTE_CACHE environment variable (by default, ~/.cache/torch/pyannote).

This PR passes the cache_dir set in Pipeline through to the underlying pipelines, get_model(), and hf_hub_download(), so that when cache_dir is set, all models are downloaded to that directory. When it is not set, all models are now downloaded to the PYANNOTE_CACHE location, which defaults to ~/.cache/torch/pyannote.
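The fallback order described above can be sketched as follows. This is a minimal illustration, not the code in this PR: resolve_cache_dir and download_kwargs are hypothetical helper names, and the treatment of empty strings is an assumption based on the commit message "empty strings should trigger default cache to be set".

```python
import os
from pathlib import Path
from typing import Optional, Union

# Default location named in the PR description; PYANNOTE_CACHE overrides it.
DEFAULT_CACHE = "~/.cache/torch/pyannote"


def resolve_cache_dir(cache_dir: Optional[Union[str, Path]] = None) -> Path:
    """Hypothetical helper: pick the one directory every download should use."""
    # None and "" both fall back to the default (assumed from the commit
    # "empty strings should trigger default cache to be set").
    if not cache_dir:
        cache_dir = os.environ.get("PYANNOTE_CACHE") or DEFAULT_CACHE
    return Path(cache_dir).expanduser()


def download_kwargs(cache_dir: Optional[Union[str, Path]] = None) -> dict:
    """Hypothetical helper: keyword arguments to forward to each downloader."""
    return {"cache_dir": str(resolve_cache_dir(cache_dir))}
```

The resulting dictionary could then be forwarded to any downloader that accepts a cache_dir keyword, such as hf_hub_download(), so that the config and every model land in the same directory.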

Of note, the NeMo framework does not allow setting the cache directory dynamically; it can only be set via an environment variable.

This PR is a breaking change for some users, specifically for models downloaded with hf_hub_download. Previously, these were downloaded to the default huggingface cache location; now they are downloaded to ~/.cache/torch/pyannote by default, so users may need to re-download models.
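For users who would rather keep using their existing huggingface cache, one possible workaround is to locate that cache and pass it explicitly as cache_dir. The sketch below is an assumption-laden illustration: default_hf_hub_cache is a hypothetical helper, it only checks the HF_HUB_CACHE, HF_HOME, and XDG_CACHE_HOME environment variables, and whether already-downloaded models are actually reused has not been verified against this PR.

```python
import os
from pathlib import Path


def default_hf_hub_cache() -> Path:
    """Hypothetical helper: best-effort guess at the default huggingface cache."""
    # An explicit HF_HUB_CACHE takes precedence over everything else.
    explicit = os.environ.get("HF_HUB_CACHE")
    if explicit:
        return Path(explicit).expanduser()
    # Otherwise the cache is the "hub" folder under HF_HOME, which itself
    # defaults to "huggingface" under XDG_CACHE_HOME (or ~/.cache).
    xdg = os.environ.get("XDG_CACHE_HOME") or "~/.cache"
    hf_home = os.environ.get("HF_HOME") or os.path.join(xdg, "huggingface")
    return Path(hf_home).expanduser() / "hub"
```

Passing the returned path as cache_dir when loading a pipeline would point downloads back at the previous location instead of ~/.cache/torch/pyannote.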

uses default huggingface cache location
previously, cache_dir was not passed through, so models would not download to the expected location passed in Pipeline
allow passing None as an optional parameter to cache_dir
fix docstring
Nemo api does not seem to take a save location parameter, so the cache cannot be changed

passthrough cache_dir

passthrough cache_dir param

passthrough cache_dir param

passthrough cache_dir param

fix type annotation

fix type annotation

fix type annotation

fix type annotation

fix type annotation

fix type annotation

fix type annotation
fix import for Path typing

fix import for Path typing

fix import for Path typing

fix import for Path typing

fix import for Path typing
empty strings should trigger default cache to be set

remove whitespace
formatting
formatting