While trying to use the QR trick, I noticed that the implementation differs from what the original paper describes. In the paper, a token's embedding is obtained by applying an operation such as 'add', 'mult', or 'concat' to the rows of two separate embedding tables, and sum or mean pooling is applied afterwards. The implementation here, however, first pools the embeddings from each table separately and then applies 'add', 'mult', or 'concat' to produce the embedding feature. Is this difference by design, or are the two methods equivalent?
Hi @fangleigit, good question! Since Criteo's categorical features are one-hot (a single index per lookup), we implemented QR this way here. In that case, the two approaches are equivalent.
However, you are correct that ideally the operation should be applied before the sum or mean pooling. That would require a more involved change to the underlying embedding bag implementation, which we have not done here.
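To illustrate the equivalence claim, here is a minimal NumPy sketch (not the repo's code; the table shapes, names, and the choice of 'mult' with sum pooling are illustrative assumptions). With a one-hot bag (a single index), pooling is a no-op, so the order of pooling and combining cannot matter; with a multi-hot bag under 'mult', the two orders generally differ.

```python
import numpy as np

rng = np.random.default_rng(0)
num_collisions, dim = 4, 8  # hypothetical QR collision factor and embedding dim

# Two hypothetical QR tables: the quotient table indexed by i // num_collisions,
# the remainder table indexed by i % num_collisions.
q_table = rng.normal(size=(32, dim))
r_table = rng.normal(size=(num_collisions, dim))

def qr_embed(indices, pool_first):
    """Embed one bag of indices with the 'mult' operation and sum pooling."""
    q = q_table[indices // num_collisions]  # shape (bag_size, dim)
    r = r_table[indices % num_collisions]   # shape (bag_size, dim)
    if pool_first:
        return q.sum(axis=0) * r.sum(axis=0)  # pool each table, then combine
    return (q * r).sum(axis=0)                # combine per token, then pool

one_hot = np.array([42])            # bag of size 1 (one-hot feature)
multi_hot = np.array([42, 7, 13])   # bag of size 3 (multi-hot feature)

# Equivalent for one-hot bags:
print(np.allclose(qr_embed(one_hot, True), qr_embed(one_hot, False)))      # True
# Generally NOT equivalent for multi-hot bags under 'mult':
print(np.allclose(qr_embed(multi_hot, True), qr_embed(multi_hot, False)))  # False
```

Note that 'add' with sum pooling commutes even for multi-hot bags (a sum of sums), so the ordering question mainly affects 'mult' and 'concat' when bags contain more than one index.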
dlrm/tricks/qr_embedding_bag.py
Line 189 in c848e83