course-report-summarization

   

Course report summarization is a project that uses pre-trained language models to generate a summary of the lecture evaluations posted on KLUE. We aim to give users useful, concise information at a glance by providing a one-line summary for each lecture, so that they do not have to read every evaluation.

To produce an extractive summary of a given report, we studied various summarization models, including BART, T5, PEGASUS and more. Among these, we chose to run experiments with BertSum, T5, and BART. BertSum is a simple variant of BERT, a pre-trained Transformer model, that is used specifically for extractive summarization. T5 (Text-to-Text Transfer Transformer) reframes all NLP tasks into a unified text-to-text format in which the input and output are always text strings, in contrast to BERT-style models, which can only output either a class label or a span of the input. BART is a denoising autoencoder for pretraining sequence-to-sequence models; it pairs a bidirectional, BERT-like encoder with an autoregressive, GPT-like decoder.
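To make the term "extractive" concrete, here is a minimal sketch that picks one existing sentence from an evaluation as its one-line summary. It is not BertSum and not the code in this repository: the word-frequency scorer is a simple stand-in for a learned sentence scorer, and the names `split_sentences`, `one_line_summary`, and the sample `review` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text: str) -> list[str]:
    """Split on sentence-ending punctuation; a rough heuristic, not a real tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def one_line_summary(text: str) -> str:
    """Return the sentence whose words are, on average, most frequent in the text."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    # Word counts over the full evaluation act as a crude importance signal.
    counts = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"\w+", sentence.lower())
        # Average the counts so long sentences are not favored for length alone.
        return sum(counts[w] for w in words) / len(words) if words else 0.0

    # max() keeps the earliest sentence when scores tie.
    return max(sentences, key=score)


# Hypothetical sample evaluation, for illustration only.
review = (
    "The exams are hard. "
    "The lectures are clear and the exams are fair. "
    "Attendance is checked weekly."
)
print(one_line_summary(review))
```

The output is always a sentence copied verbatim from the input, which is the defining property of extractive summarization. An abstractive model such as T5 or BART would instead generate new text.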

Major Contributions

  • Study of papers and models for extractive/abstractive summarization
  • Extractive summarization using various models: BertSum, T5, BART
  • Crawling course evaluation data from KLUE: https://klue.kr/
  • Provision of a one-line summary service to Korea University students

Directory

algorithms

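  • bertsum: abstractive summarization using the BertSum model
  • ket5: abstractive summarization using the T5 model
  • kobart: abstractive summarization using the KoBART model
  • crawling: crawling code for KLUE (the Korea University course evaluation site)
  • preprocessing: preprocessing code for the crawled course evaluation data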
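As a rough idea of what preprocessing crawled evaluations can involve, the sketch below strips leftover markup, normalizes whitespace, and drops very short or duplicate entries. These steps are assumptions for illustration, not a description of the code in the `preprocessing` directory; the function names and the `min_chars` threshold are hypothetical.

```python
import html
import re


def clean_review(raw: str) -> str:
    """Strip markup and normalize whitespace in one crawled evaluation."""
    text = re.sub(r"<[^>]+>", " ", raw)  # drop leftover HTML tags
    text = html.unescape(text)           # decode entities such as &amp;
    return re.sub(r"\s+", " ", text).strip()


def preprocess(reviews: list[str], min_chars: int = 10) -> list[str]:
    """Clean each review, then drop very short ones and exact duplicates."""
    seen = set()
    cleaned = []
    for raw in reviews:
        text = clean_review(raw)
        # min_chars is an assumed threshold, not a value taken from the project.
        if len(text) < min_chars or text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned
```

Calling `preprocess` on a list of raw strings returns the cleaned evaluations in their original order, ready to be passed to a summarization model.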

Demo

The demo is now released as a Flask app!