fix DecoderOnlyEmbedderICLSameDatasetTrainDataset category index error #1232

Open
wants to merge 1 commit into
base: master


@billvsme billvsme commented Nov 15, 2024

While fine-tuning bge-en-icl with the code from the documentation, I hit an error.

The error looks like this:

[rank2]:     batch_raw_data['category'][i],  # use category as example
[rank2]: IndexError: list index out of range

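Looking at the source code, I found the cause: in DecoderOnlyEmbedderICLSameDatasetTrainDataset._create_batch_data, the outer for loop's variable i is overwritten by an inner for loop that also uses i. Once the inner loop has run, i holds the last index of tmp_passages, so batch_raw_data['category'][i] can index past the end of the list.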

for i in range(len(batch_raw_data['query'])):
    if data_type is not None:
        ...
    for i in range(len(tmp_passages)):
        tmp_passages[i] += self.args.icl_suffix_str

I renamed the inner loop variable from i to j, which fixes the bug.
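For anyone who wants to reproduce the pattern outside the trainer, here is a minimal standalone sketch of the same shadowing problem and the rename. It is not the FlagEmbedding code: the function names, arguments, and sample data are hypothetical and only mimic the shape of _create_batch_data as far as the snippet above shows.

```python
def collect_examples_buggy(queries, categories, passages_per_query, suffix):
    """Hypothetical reduction of the bug: the inner loop reuses `i`."""
    examples = []
    for i in range(len(queries)):
        tmp_passages = list(passages_per_query[i])
        # The inner loop rebinds `i`; after it ends, `i` is
        # len(tmp_passages) - 1, not the query index.
        for i in range(len(tmp_passages)):
            tmp_passages[i] += suffix
        # With more passages than queries, this lookup goes out of range.
        examples.append((queries[i], categories[i], tmp_passages))
    return examples


def collect_examples_fixed(queries, categories, passages_per_query, suffix):
    """Same logic with the inner loop variable renamed to `j`."""
    examples = []
    for i in range(len(queries)):
        tmp_passages = list(passages_per_query[i])
        for j in range(len(tmp_passages)):
            tmp_passages[j] += suffix
        # `i` still refers to the current query here.
        examples.append((queries[i], categories[i], tmp_passages))
    return examples


if __name__ == "__main__":
    queries = ["q0", "q1"]
    categories = ["c0", "c1"]
    passages = [["a", "b", "c"], ["d", "e", "f"]]

    try:
        collect_examples_buggy(queries, categories, passages, "!")
    except IndexError as exc:
        print("buggy version raised:", exc)

    print(collect_examples_fixed(queries, categories, passages, "!"))
```

Note that the outer loop still visits every query index, because Python's for loop takes the next value from range() regardless of what i was rebound to. The damage is limited to code that reads i between the end of the inner loop and the next outer iteration, which is why the failure only appears when a batch has more passages than queries.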
