Running into an issue: The exllama kernel for GPTQ requires a float16 input activation, while torch.float32 was passed #77

Answered by samvanity
samvanity asked this question in Q&A

Changing line 59 in prompt_compressor.py to

torch_dtype=torch.float16 if device_map == "cuda" else torch.float32,

solves the problem.
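
For context, here is a minimal sketch of what that fix looks like in the model-loading code, assuming prompt_compressor.py loads its model through Hugging Face transformers (the function name and example model id below are illustrative, not taken from the repo):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model(model_name: str, device_map: str = "cuda"):
    # The exllama GPTQ kernel only accepts float16 input activations, so a
    # model loaded in float32 on CUDA raises the error from the title.
    # Choosing the dtype from the target device avoids this while keeping
    # float32 on CPU, where float16 ops are poorly supported.
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16 if device_map == "cuda" else torch.float32,
        device_map=device_map,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer

# Example usage with a hypothetical GPTQ checkpoint:
# model, tokenizer = load_model("TheBloke/Llama-2-7B-Chat-GPTQ", device_map="cuda")
```

The conditional matters because hardcoding float16 would break CPU inference; the device check gives each backend the dtype it can actually execute.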
