Incorrect results of pallas.ops.gpu.attention.mha when seq_len is not divisible by block_q. Is this expected behavior, or is it a bug? #23818

Unanswered
lkwq007 asked this question in Q&A
Replies: 1 comment

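The body of the question is not preserved in this extract, so as a hedged sketch, a reproducer for the reported configuration might look like the following: a plain O(n²) softmax attention serves as the ground truth, with `seq_len` deliberately chosen so it is not a multiple of `block_q`. The shapes, the `block_q=64` value, and the exact `mha` call shown in comments are assumptions, not taken from the original discussion; the Pallas kernel itself requires a GPU, so only the reference path is executed here.

```python
import numpy as np

def reference_mha(q, k, v):
    # Plain softmax attention over the full sequence, used as the
    # ground truth that the Pallas kernel would be compared against.
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# seq_len = 100 is deliberately NOT divisible by block_q = 64,
# the configuration the question reports as producing wrong results.
batch, num_heads, seq_len, head_dim = 1, 2, 100, 32
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((batch, num_heads, seq_len, head_dim),
                               dtype=np.float32) for _ in range(3))
out_ref = reference_mha(q, k, v)

# On a GPU one would then run the Pallas kernel with the same inputs
# (hypothetical call; argument names and layout are assumptions):
#   from jax.experimental.pallas.ops.gpu.attention import mha
#   out = mha(q_jax, k_jax, v_jax, segment_ids=None, block_q=64)
#   np.testing.assert_allclose(out, out_ref, atol=1e-2)
print(out_ref.shape)
```

If the kernel pads the final partial query block without masking the padded rows or columns, the mismatch would show up only when `seq_len % block_q != 0`, which matches the behavior the title describes.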