Transformer models have grown in popularity in recent years, largely because the self-attention mechanism in their architecture assigns greater weight to the most relevant portions of the input text. Self-attention also permits far more parallelization than traditional recurrent methods, which reduces training time. Although transformer-based approaches perform strongly, it remains difficult to predict which transformer model will work best on a new dataset. In this paper, we present a set of text-summarization experiments on the WikiHow dataset, pitting two models against each other: BERT-large (an extractive summarizer) and T5-small (an abstractive summarizer). We run experiments across a range of text lengths and compare the outputs using ROUGE scores, and then conduct a further experiment to determine which of the two models performs better in terms of information retrieval.
We use the WikiHow Text Summarization dataset:
- Dataset: https://github.com/mahnazkoupaee/WikiHow-Dataset
- Paper: https://arxiv.org/abs/1810.09305
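The exact experimental code lives in the project notebooks; below is a minimal sketch of the abstractive side of the comparison, assuming the Hugging Face `transformers` and `rouge-score` packages are installed. The `article` and `reference` strings are placeholders for a single WikiHow article and its reference summary.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
from rouge_score import rouge_scorer

# Placeholder inputs: in the experiments these come from the WikiHow dataset.
article = "..."    # full WikiHow article text
reference = "..."  # corresponding reference summary (headline)

# Load the pretrained T5-small checkpoint used for abstractive summarization.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 is a text-to-text model, so summarization is triggered by the "summarize:" prefix.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   max_length=512, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=60,
                             num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Score the generated summary against the reference with ROUGE-1/2/L.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, summary)
print(summary)
print(scores)
```

The extractive BERT-large summaries are produced and scored in the same way, with only the summary-generation step swapped out.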
The project collaborators are as follows:
- Anjali Pal (anjali.pal.21@ucl.ac.uk)
- Langlang Fan (langlang.fan.21@ucl.ac.uk)
- Kanupriya (kanupriya.21@ucl.ac.uk)
- Vanessa Igodifo (vanessa.igodifo.21@ucl.ac.uk)