[BUG] bitlinear fix #42

Closed
jayUyang opened this issue Mar 10, 2024 · 5 comments
Labels: bug, no-issue-activity

Comments

@jayUyang
jayUyang commented Mar 10, 2024

Shouldn't beta and gamma have shape (1, weight.shape[0]) rather than (weight.shape[0], 1)?

@kyegomez (Owner)

Can you elaborate please? Can you go deeper?

@Vipiao
Vipiao commented Mar 14, 2024

I encountered the same problem. When passing an int tensor of shape (4, 2) to a BitLinear(2, 8), I get an error at the line
return x * self.gamma * self.beta / self.Q_b
Saying
"
Exception has occurred: RuntimeError
The size of tensor a (4) must match the size of tensor b (8) at non-singleton dimension 0
File "C:\Users\Markus\OneDrive\phd\NYCU\research\bit_net\bitlinear.py", line 112, in dequantize_activations_groupwise
return x * self.gamma * self.beta / self.Q_b
File "C:\Users\Markus\OneDrive\phd\NYCU\research\bit_net\bitlinear.py", line 137, in forward
output = self.dequantize_activations_groupwise(output)
File "C:\Users\Markus\OneDrive\phd\NYCU\research\bit_net\xor_test_bitlinear.py", line 20, in forward
x = self.layer1(x)
File "C:\Users\Markus\OneDrive\phd\NYCU\research\bit_net\xor_test_bitlinear.py", line 39, in <module>
outputs = model(inputs) # Forward pass
RuntimeError: The size of tensor a (4) must match the size of tensor b (8) at non-singleton dimension 0
"
I think the shapes of self.gamma and self.beta are wrong. Gamma is initialized based on the number of output neurons, but it is later set based on the batch size.
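For reference, here is a minimal sketch of the broadcasting behavior behind this error. It does not use the repository's code: `output`, `gamma_col`, `gamma_row`, and `try_scale` are hypothetical names, and the sizes (batch of 4, 8 output features) are taken from the report above. It only assumes standard PyTorch broadcasting.

```python
import torch

batch, out_features = 4, 8

# Stand-in for the linear layer's output: shape (4, 8).
output = torch.ones(batch, out_features)

# Scale stored as a column, shape (8, 1), as questioned in this issue.
gamma_col = torch.ones(out_features, 1)
# Scale stored as a row, shape (1, 8), the alternative proposed above.
gamma_row = torch.ones(1, out_features)


def try_scale(x, scale):
    """Return the result shape, or the error message if broadcasting fails."""
    try:
        return tuple((x * scale).shape)
    except RuntimeError as err:
        return str(err)


# (4, 8) * (8, 1): dim 1 broadcasts (8 vs 1), but dim 0 is 4 vs 8, so it fails.
col_result = try_scale(output, gamma_col)
# (4, 8) * (1, 8): dim 0 broadcasts (4 vs 1), dim 1 matches, so it succeeds.
row_result = try_scale(output, gamma_row)

print(col_result)
print(row_result)
```

With a batch size equal to the number of output features the column layout would not raise at all, which may be why the mismatch only shows up for some inputs.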

@zouyingcao
I think so, but I am confused: since self.gamma relates to the activations while self.beta relates to the weights, should we explicitly broadcast these two tensors so that 'x * self.gamma * self.beta' in the dequantization step becomes a Hadamard product? Also, should activation quantization ('group_size = x.shape[0] // self.num_groups') group along dim=1 (x.shape[1]) instead, since dim 0 is the batch size? If I am wrong, please point it out. Thanks.
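One way to make the shapes line up, as a rough sketch only and not the repository's implementation: keep one activation scale per sample, shape (batch, 1), reduced along dim=1, and one weight scale per output neuron, shape (1, out_features). The helper name `dequantize`, the per-row choice of scales, and `Q_b = 2 ** 7` are all assumptions made for illustration. Rounding and group handling are left out.

```python
import torch


def dequantize(y_q, gamma, beta, Q_b):
    """Rescale the quantized output; gamma is (batch, 1), beta is (1, out)."""
    return y_q * gamma * beta / Q_b


torch.manual_seed(0)
batch, in_features, out_features = 4, 2, 8
Q_b = 2 ** 7  # assumed 8-bit activation range

x = torch.randn(batch, in_features)
weight = torch.randn(out_features, in_features)

# Activation scale: absmax over the feature dim, one value per sample -> (4, 1).
gamma = x.abs().amax(dim=1, keepdim=True).clamp(min=1e-5)
# Weight scale: mean absolute value per output neuron -> (1, 8).
beta = weight.abs().mean(dim=1).view(1, -1)

# Scale and clip the activations, binarize the weights.
x_q = torch.clamp(x * Q_b / gamma, -Q_b, Q_b)
w_q = torch.sign(weight)

y_q = x_q @ w_q.t()                    # (4, 8)
y = dequantize(y_q, gamma, beta, Q_b)  # (4, 8) * (4, 1) * (1, 8) -> (4, 8)

print(gamma.shape, beta.shape, y.shape)
```

With these shapes the product broadcasts elementwise over the (batch, out_features) output, so no dimension depends on the batch size matching the number of output neurons.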

@zouyingcao
Hmm, I see the owner has updated the code (now without group quantization).

Stale issue message

@github-actions github-actions bot closed this as not planned on Jun 1, 2024
4 participants