The output of BitLinear is quite abnormal #35
The implementation of this BitLinear is completely wrong: not only does it not follow the process outlined in the BitNet paper, it also misunderstands the underlying computational principles. I don't understand why it still receives so many stars.
Gamma, beta, and alpha are calculated from the full-precision weights and input, before quantization. These parameters are then used for weight binarization and input quantization. The binarized weights and quantized input go through the linear operation to produce the output, which is then dequantized using the previously calculated gamma and beta. It is not meaningful to calculate gamma and beta separately for the quantization and dequantization stages, and the grouping implementation here is entirely nonsensical.
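The pipeline described above can be sketched roughly as follows. This is an illustrative reconstruction of the BitNet-style forward pass from the comment, not the repository's actual code; the function name `bitlinear_forward` and the per-tensor (ungrouped) scaling are assumptions.

```python
import torch

def bitlinear_forward(x, W, b=8, eps=1e-5):
    """Hedged sketch of a BitNet-style BitLinear forward pass:
    compute scales from full-precision tensors, binarize/quantize,
    run the linear op, then dequantize with the SAME scales."""
    Qb = 2 ** (b - 1)
    # Scaling factors computed from the full-precision weights and input.
    alpha = W.mean()          # zero-point used for weight binarization
    beta = W.abs().mean()     # per-tensor weight scale
    gamma = x.abs().max()     # per-tensor activation scale
    # Binarize weights to {-1, +1} and quantize activations to b bits.
    W_bin = torch.sign(W - alpha)
    x_q = torch.clamp(x * Qb / gamma, -Qb + eps, Qb - eps)
    # Linear op on the quantized operands, then dequantize the output
    # with the previously computed beta and gamma.
    y = torch.nn.functional.linear(x_q, W_bin)
    return y * (beta * gamma / Qb)
```

The key point of the comment is that `beta` and `gamma` are computed once, before quantization, and reused for dequantization, rather than being recomputed separately at each stage.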
Hi, I don't understand what you're saying. Could you elaborate?
Another implementation is BIT-Transformers. I don't know how its BitLinear works, especially the forward function: there is no obvious beta/gamma and no dequantization of the output. Could you help explain that code? Thanks.
The issues I mentioned have been addressed in commit 6cdb2ea.
Yes, most of the problems have been addressed. There is still a bug in the grouping implementation; I am working on that.
Describe the bug
I print the mean and variance of the tensor y in example.py.
Its mean and variance are abnormal, as follows:
To double-check, I print the mean and variance of the outputs from Linear and BitLinear simultaneously.
I believe there are mistakes in the implementation of BitLinear in bitnet/bitlinear.py.
To Reproduce
Steps to reproduce the behavior:
1. Insert `output_linear = torch.nn.functional.linear(x, self.weight, self.bias)` at bitnet/bitlinear.py line 129.
2. Print the mean and variance of `output_linear`.
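The kind of statistics check described above can be sketched as follows. The shapes and the plain `nn.Linear` layer here are illustrative stand-ins (the repository's `BitLinear` is assumed to take the same input), just to show what "printing mean and variance" of a layer's output looks like.

```python
import torch

torch.manual_seed(0)
x = torch.randn(32, 512)  # illustrative batch and feature size

# Baseline: a plain full-precision Linear layer. In the actual report,
# the same statistics are printed for BitLinear and found to diverge.
linear = torch.nn.Linear(512, 512)
out = linear(x)

print(f"Linear mean={out.mean().item():.4f} var={out.var().item():.4f}")
```

A well-behaved layer keeps these statistics in a sane range relative to its input; the bug report observes that BitLinear's output mean and variance do not.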