An implementation of the BiDAF paper (with character-level embeddings) for reading comprehension on the SQuAD v2 dataset. Also implements an Answer Pointer network, as in Match-LSTM and Answer Pointer, to predict the final answer span. Achieves a 78.83 F1 score, 69.45 EM score, and 68% AvgNA score on the dev dataset.
- Reading comprehension is a task in which a text passage (also called the context) is used to answer questions.
- The answers are a span (continuous subsequence) of the context itself, so the model predicts the start and end indices of the answer within the context.
- New to SQuAD v2, some questions are unanswerable, i.e., the context is insufficient to answer the question. The AvgNA score is the model's classification accuracy in identifying such questions.
- This implementation uses the BiDAF model, with an Answer Pointer network as the predictor head instead of the Output layer (as shown below).
- The answer pointer network predicts the end index conditioned on the start index of the answer.
- This is achieved by using the predicted start index probabilities as attention weights over the context.
- The aggregated features are then used to predict the end position.
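The conditioning step described above can be sketched as follows. This is a minimal NumPy illustration, not the repo's actual model code: the names `context_feats`, `start_logits`, and `w_end` are hypothetical, and the real head uses learned LSTM/linear layers rather than a single projection.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def answer_pointer_step(context_feats, start_logits, w_end):
    """Predict end-index logits conditioned on the start distribution.

    context_feats: (T, d) per-token context representations
    start_logits:  (T,)   unnormalized start-position scores
    w_end:         (2d,)  illustrative projection for scoring end positions
    """
    # Use the predicted start probabilities as attention weights
    # over the context tokens.
    p_start = softmax(start_logits)                  # (T,)
    summary = p_start @ context_feats                # (d,) aggregated features
    # Pair the aggregated start summary with each token representation
    # and score every position as a candidate end index.
    T = context_feats.shape[0]
    tiled = np.concatenate(
        [context_feats, np.repeat(summary[None, :], T, axis=0)], axis=1
    )                                                # (T, 2d)
    end_logits = tiled @ w_end                       # (T,)
    return p_start, end_logits
```

At training time both `start_logits` and `end_logits` would feed a cross-entropy loss against the gold span indices.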
- Only 325 of the total 442 articles in the SQuAD training dataset are used due to memory constraints.
- 64-D character embeddings are aggregated using a 1-D convolutional filter of width 5 and stride 1.
- The answer pointer head uses dropout on the projected start pointer representation.
- A cosine learning rate scheduler is used.
- Hidden vectors in the BiDAF model are of 100 dimensions.
- Comparison of dev dataset negative log likelihood loss for this implementation (green) vs baseline model (blue)
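The character-level aggregation noted above (64-D embeddings, width-5, stride-1 convolution) can be sketched as below. This is a hedged NumPy illustration assuming the common char-CNN recipe of max-over-time pooling; the filter count, padding, and pooling details of the actual implementation may differ.

```python
import numpy as np

def char_cnn(char_embs, filters, width=5):
    """Aggregate per-character embeddings into one word vector.

    char_embs: (L, 64)        one 64-D embedding per character of a word
    filters:   (F, width*64)  flattened 1-D conv filters
    Returns:   (F,)           max-over-time pooled features
    """
    L, d = char_embs.shape
    # Zero-pad short words so every word yields at least one window.
    if L < width:
        char_embs = np.vstack([char_embs, np.zeros((width - L, d))])
        L = width
    # Slide a width-5, stride-1 window over the character sequence.
    windows = np.stack(
        [char_embs[i:i + width].ravel() for i in range(L - width + 1)]
    )                                    # (L - width + 1, width*64)
    conv = windows @ filters.T           # (L - width + 1, F)
    return conv.max(axis=0)              # max-over-time pooling
```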
- `setup.py` downloads and preprocesses the data.
- `training_run.ipynb` contains the training and logging code (trained using Kaggle GPU resources).
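The cosine learning-rate schedule listed in the implementation details follows the standard annealing formula, sketched below. The values of `base_lr`, `min_lr`, and `total_steps` here are illustrative, not the repo's actual hyperparameters.

```python
import math

def cosine_lr(step, total_steps, base_lr=5e-4, min_lr=0.0):
    """Cosine annealing from base_lr down to min_lr over total_steps."""
    t = min(step, total_steps) / total_steps   # progress in [0, 1]
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))
```

In PyTorch this schedule is typically provided by `torch.optim.lr_scheduler.CosineAnnealingLR` rather than written by hand.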
- This repository uses the boilerplate code provided as part of the final-project handout of the Stanford CS224n (NLP with Deep Learning) course.