Topic key error when I try to train with the train_multihead.py script #3

Open
Nkonstan opened this issue Oct 1, 2021 · 3 comments

Comments

@Nkonstan

Nkonstan commented Oct 1, 2021

  File "train_multihead.py", line 202, in train
    train_loader, val_loader, train_sampler, valid_sampler = get_data_loaders(config, tokenizer)
  File "train_multihead.py", line 118, in get_data_loaders
    topic = dialog["topic"]
KeyError: 'topic'

From what I understand, there is no 'topic' key in the dataset format you propose?

@roholazandie
Owner

Make sure you are using the changed format of the dataset and not the original one. The changed one is here:
https://drive.google.com/open?id=1T4AdY7wku8srL_xWSxgt-OHqdLFVo3s3

If you are using this one, you will have the 'topic' key.

@Nkonstan
Author

Nkonstan commented Oct 3, 2021

Thanks for your answer. I confirm that I am using the format you sent me, but it has no "topic" key. You can print the JSON file to check it too.
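
For anyone who wants to run the same check without reading the whole file by eye, here is a minimal sketch. It makes no assumption about the dataset layout: it walks the parsed JSON and counts every dict key at any nesting depth. The helper names `count_keys` and `has_key` and the path `dataset.json` are hypothetical and are not part of the repository.

```python
import json
from collections import Counter


def count_keys(node, counts=None):
    """Recursively count every dict key that appears anywhere in parsed JSON."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        for key, value in node.items():
            counts[key] += 1
            count_keys(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_keys(item, counts)
    return counts


def has_key(path, key="topic"):
    """Return True if `key` occurs anywhere in the JSON file at `path`."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return count_keys(data)[key] > 0


# Usage (hypothetical path, point it at the downloaded dataset file):
#   print(has_key("dataset.json"))
```

If `has_key` returns False, `dialog["topic"]` in `get_data_loaders` will raise the KeyError shown in the traceback, regardless of how the dialogs are nested.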

@roholazandie
Owner

Yes, you are right, I didn't upload that dataset. Here it is:
https://drive.google.com/file/d/17nL6q3eiG4IKAZe-CregN5db4eGkgz4G/view?usp=sharing

Let me know if this one works for you. I also updated the README file accordingly.
