
[BUG] No support for mistral, gemma, etc.; generates an error #27

Closed
NickyDark1 opened this issue Mar 1, 2024 · 5 comments
Labels: bug (Something isn't working), no-issue-activity

Comments

NickyDark1 commented Mar 1, 2024

model_id = "h2oai/h2o-danube-1.8b-chat"

(screenshot of the error attached)

NickyDark1 added the bug (Something isn't working) label on Mar 1, 2024
NickyDark1 (Author) commented:

transformers version 4.36.2; upgrading to the new transformers==4.38.0 still gives no support.

NickyDark1 (Author) commented:

Does it only support this model?

# Load a model from Hugging Face's Transformers
model_name = "bert-base-uncased"

NickyDark1 (Author) commented:

Not supported:

  • cuda()
  • to("cuda:0")
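
For reference, the standard pattern for moving a model onto a GPU (with a CPU fallback when CUDA is unavailable) is the one below; the tiny `nn.Linear` here is a stand-in for a real Hugging Face model, which is moved the same way:

```python
import torch
from torch import nn

# Stand-in module; a real model loaded with from_pretrained() works identically.
model = nn.Linear(4, 2)

# model.to("cuda:0") and model.cuda() are equivalent when a GPU is present;
# guarding on availability avoids a RuntimeError on CPU-only machines.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model.to(device)
print(next(model.parameters()).device.type)
```

This is the pattern the commenter reports failing after the bitnet layer swap.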

@sanjeev-bhandari commented:

@NickyDark1, I ran that model in Colab and it works.

Without quantizing:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("h2oai/h2o-danube-1.8b-chat")
model = AutoModelForCausalLM.from_pretrained("h2oai/h2o-danube-1.8b-chat")

# "text-generation" is the pipeline task for causal LMs like h2o-danube
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe("Hello, How")

Output:

[{'generated_text': 'Hello, How are you?\n\n"I\'m doing well, thank you. How about'}]
After replacing the Linear layers with bitnet:

from bitnet import replace_linears_in_hf

replace_linears_in_hf(model)
# move the model back onto the GPU
model.to("cuda")
pipe_1_bit = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe_1_bit("Hello, How")

Output is:

[{'generated_text': 'Hello, How島 waters everyoneürgen Mess till revel馬 Vitt officials ambos">< czł plusieurs ap riv居'}]

But it takes ages to produce this answer (8 minutes in my case on free Colab).
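
The garbled output above is expected when a layer swap discards trained weights without retraining or calibration. For context, a `replace_linears_in_hf`-style helper typically walks the module tree recursively and swaps each `nn.Linear` for a quantized replacement. The sketch below uses a hypothetical sign-quantizing layer (not bitnet's actual `BitLinear`) purely to illustrate the traversal-and-replace pattern:

```python
import torch
import torch.nn as nn


class SignQuantLinear(nn.Module):
    """Illustrative 1-bit stand-in: keeps the bias, replaces the weight
    with its sign. Real BitLinear layers also scale activations/weights."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.register_buffer("weight", torch.sign(linear.weight.detach()))
        self.bias = linear.bias

    def forward(self, x):
        return nn.functional.linear(x, self.weight, self.bias)


def replace_linears(module: nn.Module) -> None:
    """Recursively swap every nn.Linear child for a SignQuantLinear."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, SignQuantLinear(child))
        else:
            replace_linears(child)


model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
replace_linears(model)
out = model(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```

Because the replacement keeps only the sign of each trained weight, the network's function changes drastically, which is consistent with the nonsense generations and the need for quantization-aware training.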

github-actions (bot) commented:

Stale issue message

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 1, 2024