How could I reproduce the result for SQuAD 1.1? #10
Comments
Hey @alphaf52, did you find a solution for this? We are still facing the same issue.
Hi, I think the problem is you forgot to give
Our group can't reproduce the SQuAD 1.1 result (Table 1 in the paper) from scratch either! The README does not explain how to do it.
@yucoian at what point are you facing the problem? We could do it by following the steps given in the README.
@mittalpatel Thank you very much! In the "SQuAD v1.1 Experiments (Section 6.1)", we cannot reproduce the "DENSPI (dense only, with Coherency scalar)" model. Could you please tell us how to adapt your released code to reproduce the result of "DENSPI (dense only, with Coherency scalar)"? To be specific, after adding the coherency scalar to DENSPI, we cannot reproduce the result.
Hi,
Thanks for your good work. I would like to reproduce the result for SQuAD 1.1 (as shown in Table 1 in the paper), but I am having some trouble.
First, I downloaded the pretrained model from "gs://denspi/v1-0/model" and then tried to evaluate on dev-v1.1 using: "python run_piqa.py --do_predict --output_dir tmp --do_load --load_dir model --predict_file dev-v1.1.json --do_eval --gt_file dev-v1.1.json --metadata_dir bert"
The predicted answers seem to be random spans, giving metrics like: {"exact_match": 0.47303689687795647, "f1": 4.43806570152543}. An EM of 0.47% means something is badly wrong.
I wonder whether I ran it correctly.
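To rule out the scoring step itself, the predictions can be re-scored independently. What follows is only a minimal sketch of SQuAD v1.1-style exact match and F1, not the repository's evaluator. It assumes predictions are a dict mapping question id to answer string and references map question id to a list of gold answers; the function names (normalize_answer, exact_match, f1, score) are illustrative.

```python
import collections
import re
import string

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1(prediction, truth):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = collections.Counter(pred_tokens) & collections.Counter(truth_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def score(predictions, references):
    """Assumed inputs: predictions {id: str}, references {id: [str, ...]}.

    Returns percentages, taking the best score over the gold answers
    for each question. A missing prediction counts as an empty answer.
    """
    em_total = 0.0
    f1_total = 0.0
    for qid, truths in references.items():
        pred = predictions.get(qid, "")
        em_total += max(exact_match(pred, t) for t in truths)
        f1_total += max(f1(pred, t) for t in truths)
    n = len(references)
    return {"exact_match": 100.0 * em_total / n, "f1": 100.0 * f1_total / n}
```

If a check like this also reports an EM below 1%, the predictions themselves are off, which points at how the model was loaded or run rather than at the metric.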
Also, since I cannot get the pretrained model to work, if I want to train a model myself to reproduce the result, is it enough to run just the first step in the training section (i.e. "python run_piqa.py --train_batch_size 12 --do_train --freeze_word_emb --save_dir $SAVE1_DIR")?
Thanks, and I hope to get your advice.