We explore one application of the transformer architecture: question answering.
Question answering (QA) is a natural language processing task that aims to answer questions automatically. The goal of extractive QA is to identify the span of a given text that contains the answer to a question. For example, when asked 'When will Jane go to Africa?' given the text 'Jane visits Africa in September', the question answering model will highlight 'September'.
- We will fine-tune a pre-trained transformer model on a custom dataset to answer questions about stories.
- We will implement an extractive QA model in TensorFlow and in PyTorch.
I completed this project for the Sequence Models course, which is part of the Deep Learning Specialization.
We use the QA bAbI dataset, which is one of the bAbI datasets generated by Facebook AI Research to advance natural language processing. After some data pre-processing to make it easier to see what the question and the answer are, one example of a story looks like:
{'answer': 'bedroom',
'question': 'What is north of the garden?',
'sentences': 'The garden is north of the office. The bedroom is north of the garden.',
'story.answer': ['', '', 'bedroom'],
'story.id': ['1', '2', '3'],
'story.supporting_ids': [[], [], ['2']],
'story.text': ['The garden is north of the office.',
'The bedroom is north of the garden.',
'What is north of the garden?'],
'story.type': [0, 0, 1]}
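Because extractive QA predicts a span of the context, each example needs the position of its answer inside the 'sentences' string. The following is a minimal sketch of that step, not the project's actual pre-processing code: the helper name `find_answer_span` is hypothetical, and it assumes the answer appears verbatim in the context and that the first occurrence is the right one.

```python
def find_answer_span(context, answer):
    """Return (start, end) character indices of the first occurrence of answer.

    Hypothetical helper for illustration; end is exclusive.
    """
    start = context.find(answer)
    if start == -1:
        raise ValueError("answer not found in context")
    return start, start + len(answer)


# Fields copied from the example story shown above.
example = {
    "sentences": "The garden is north of the office. The bedroom is north of the garden.",
    "question": "What is north of the garden?",
    "answer": "bedroom",
}

start, end = find_answer_span(example["sentences"], example["answer"])

# Slicing the context with the span recovers the answer text.
print(example["sentences"][start:end])
```

If an answer word also occurs earlier in the story in a different role, taking the first match would label the wrong span, so a real pipeline may need a stricter rule.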
We tokenize the input using the 🤗 DistilBERT fast tokenizer to match the pre-trained DistilBERT transformer model we are using.
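The model is trained on token positions, so a character-level answer span has to be mapped onto token indices; fast tokenizers make this possible because they keep character offsets for every token. The sketch below illustrates the idea with a simple regex tokenizer as a stand-in, so that it runs without downloading anything. It is an assumption for illustration only: it is not DistilBERT's WordPiece tokenizer, it adds no special tokens, and the function names are hypothetical.

```python
import re


def tokenize_with_offsets(text):
    """Stand-in tokenizer (NOT WordPiece): lowercased words and punctuation,
    each paired with its (start, end) character offsets in text."""
    return [
        (m.group().lower(), m.start(), m.end())
        for m in re.finditer(r"\w+|[^\w\s]", text)
    ]


def char_span_to_token_span(tokens, char_start, char_end):
    """Return the indices of the first and last tokens that overlap
    the character range [char_start, char_end)."""
    overlapping = [
        i for i, (_, s, e) in enumerate(tokens)
        if s < char_end and e > char_start
    ]
    if not overlapping:
        raise ValueError("no token overlaps the given character span")
    return overlapping[0], overlapping[-1]


context = "The garden is north of the office. The bedroom is north of the garden."
answer = "bedroom"

tokens = tokenize_with_offsets(context)
char_start = context.find(answer)
char_end = char_start + len(answer)

token_start, token_end = char_span_to_token_span(tokens, char_start, char_end)

# The token at the mapped position is the answer word.
print(tokens[token_start][0])
```

With the real tokenizer the indices would differ, since it inserts special tokens and may split words into sub-word pieces, but the mapping from character offsets to token positions follows the same logic.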
For training with the QA bAbI dataset, we use two different implementations: one in TensorFlow and one in PyTorch.