This repository has been archived by the owner on Aug 1, 2024. It is now read-only.
Hi!
I have a problem when I use ESM-2 to embed long protein sequences. A long protein sequence needs to be cropped to a length of less than 1024, and BOS and EOS tokens are used to signal the beginning and end of a real protein.
My question is: how do I input a cropped sequence that contains only a BOS token, only an EOS token, or neither?
Thanks in advance.
You do not always need BOS and EOS tokens, even if you don't have a transformer decoder. However, if you are fine-tuning ESM-2 for a specific downstream task in which you intend to use BOS and EOS tokens, include them as special tokens.
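To make the cropping case from the question concrete, here is a minimal sketch in plain Python. It is not part of the ESM API: the helper name `crop_with_boundary_tokens`, the default window size, and the token strings `<cls>` (used as BOS) and `<eos>` are assumptions for illustration. The idea is that only the first crop gets a BOS token and only the last crop gets an EOS token, so interior crops carry neither.

```python
from typing import List

# Assumed token strings, for illustration only; check your model's alphabet.
BOS = "<cls>"
EOS = "<eos>"


def crop_with_boundary_tokens(seq: str, max_residues: int = 1022) -> List[List[str]]:
    """Split `seq` into consecutive crops of at most `max_residues` residues.

    BOS is prepended only to the first crop and EOS is appended only to the
    last crop, so the special tokens still mark the real ends of the protein.
    """
    if max_residues < 1:
        raise ValueError("max_residues must be at least 1")

    # Consecutive, non-overlapping crops; an empty sequence yields one empty crop.
    chunks = [seq[i:i + max_residues] for i in range(0, len(seq), max_residues)] or [""]

    windows = [list(chunk) for chunk in chunks]
    windows[0].insert(0, BOS)   # real start of the protein
    windows[-1].append(EOS)     # real end of the protein
    return windows


if __name__ == "__main__":
    for window in crop_with_boundary_tokens("ACDEFGHIKL", max_residues=4):
        print(window)
```

How these token lists are then turned into model inputs depends on the tokenizer you use. Many tokenizers add BOS and EOS automatically, so that behaviour would need to be switched off or undone for the interior crops.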