chore: add Moshi #914

Merged: 3 commits into main from vb/add-moshi on Sep 20, 2024

Conversation

Vaibhavs10 (Member)

```ts
repoName: "Moshi",
repoUrl: "https://github.com/kyutai-labs/moshi",
filter: false,
countDownloads: `path:"tokenizer-e351c8d8-checkpoint125.safetensors"`,
```
Member

this looks super specific, no?

Vaibhavs10 (Member, Author)

Yes! Since there's a mix of safetensors and GGUFs, this points to the tokenizer for now. It's the same across all the repos, irrespective of the backend (candle, mlx, PyTorch), so it's the safest way I found to count without double counting.
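For context, a minimal sketch of how the full registry entry might look with that filter; only repoName, repoUrl, filter, and countDownloads come from this PR, while the wrapper shape and `prettyLabel` are assumptions:

```ts
// Hypothetical sketch of the surrounding registry entry (assumed shape);
// only repoName, repoUrl, filter and countDownloads come from this PR.
const moshi = {
	prettyLabel: "Moshi", // assumed display label
	repoName: "Moshi",
	repoUrl: "https://github.com/kyutai-labs/moshi",
	filter: false,
	// Count one file that every Moshi repo ships (the tokenizer), so the
	// mixed safetensors/GGUF weights are not double-counted.
	countDownloads: `path:"tokenizer-e351c8d8-checkpoint125.safetensors"`,
};
```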

Contributor

Any thoughts about having

Suggested change:

```diff
-countDownloads: `path:"tokenizer-e351c8d8-checkpoint125.safetensors"`,
+countDownloads: `path:"model.safetensors" OR path:"model.q4.safetensors" OR path:"model.q8.safetensors" OR path:"model.gguf" OR path:"model.q4.gguf" OR path:"model.q8.gguf"`,
```

?

It's long and ugly but at least if a new tokenizer is pushed, it won't break the download count. And it wouldn't duplicate counts if I checked correctly.

Otherwise I'm fine with the current solution, there is no perfect one anyway.

julien-c (Member) left a comment

should we add snippets?

Vaibhavs10 (Member, Author)

> should we add snippets?

Yes! I was thinking about it, but wasn't sure how best to support all three backends, since each has a different setup and installation 😅

julien-c (Member)

> but wasn't sure how best to support all three backends

If I'm not mistaken, there's a way to specify multiple snippets now, even with titles/collapsible sections, etc.
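To picture it, here is a minimal sketch of per-backend snippets with titles, assuming the registry accepts a function returning an array of titled snippets; the `TitledSnippet` shape and the placeholder commands are illustrations, not the actual huggingface.js snippets API:

```ts
// Hypothetical sketch: one titled snippet per backend.
// Shape and contents are placeholders, not Moshi's real install steps.
interface ModelData {
	id: string;
}

interface TitledSnippet {
	title: string;
	content: string;
}

const moshiSnippets = (model: ModelData): TitledSnippet[] => [
	{ title: "PyTorch", content: `# placeholder: PyTorch setup for ${model.id}` },
	{ title: "MLX", content: `# placeholder: MLX setup for ${model.id}` },
	{ title: "Candle", content: `# placeholder: Candle setup for ${model.id}` },
];
```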

julien-c (Member)

can be in a separate PR though, if you prefer

julien-c (Member) left a comment

pre-approving just in case

Wauplin (Contributor) commented on Sep 20, 2024

> Yes! I was thinking about it, but wasn't sure how best to support all three backends, since each has a different setup and installation 😅

Is it possible to add mlx or candle tags to the repos themselves and generate snippets based on that?

(agree can be done in a follow-up PR)
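A rough sketch of that idea, selecting the snippet from tags set on the repo; the backend tag names and snippet bodies are illustrative assumptions:

```ts
// Hypothetical sketch: choose a backend snippet based on repo tags.
// Tag names and snippet contents are illustrative only.
interface ModelData {
	id: string;
	tags: string[];
}

function moshiSnippet(model: ModelData): string {
	if (model.tags.includes("mlx")) {
		return `# placeholder: MLX instructions for ${model.id}`;
	}
	if (model.tags.includes("candle")) {
		return `# placeholder: Candle instructions for ${model.id}`;
	}
	// Default to PyTorch when no backend tag is present.
	return `# placeholder: PyTorch instructions for ${model.id}`;
}
```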

Wauplin (Contributor) left a comment

Thanks for handling it @Vaibhavs10 :)

Approving "as is" (comments above are optional)

Vaibhavs10 (Member, Author)

Cool! Ah yeah, good shout re: MLX + Candle tags. I'll add them tomorrow/later today and make a follow-up PR 🤗

Thank you for the review and suggestions! ❤️

Vaibhavs10 merged commit 0c33e20 into main on Sep 20, 2024 (5 checks passed).
Vaibhavs10 deleted the vb/add-moshi branch on September 20, 2024 at 14:55.