Implementing Text Summarization using Deep Learning and Python

Roshin Nishad AI Project. 2022©

2020BCS0019

Customer reviews are often long and descriptive, and analyzing them manually is time-consuming. Natural Language Processing can be applied here to generate a short summary of a long review.

Let’s first understand what text summarization is before we look at how it works. Here is a succinct definition to get us started:

“Automatic text summarization is the task of producing a concise and fluent summary while preserving key information content and overall meaning”

-Text Summarization Techniques: A Brief Survey, 2017

There are broadly two different approaches that are used for text summarization:

  • Extractive Summarization
  • Abstractive Summarization

Extractive Summarization

The name gives away what this approach does. We identify the important sentences or phrases in the original text and extract only those, so every sentence in the summary already appears in the source.
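As a rough illustration of the extractive idea, the sketch below scores each sentence by the average frequency of its words and keeps the top-scoring ones. It is not the method used in this project (which trains a deep learning model); the function name `extractive_summary`, the tiny `STOP_WORDS` set, and the sample review are all hypothetical, chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "of", "to", "in", "this"}


def tokenize(text):
    """Lowercase the text and return its words, dropping stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text.
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)


# Hypothetical review text, for illustration only.
review = "The coffee tastes great. The coffee arrived late. Great coffee, great price."
print(extractive_summary(review, num_sentences=1))
```

Because the output is assembled from sentences copied out of the input, it can never contain wording that the review did not already use, which is the defining property of the extractive approach.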

Abstractive Summarization

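This is a very interesting approach. Here, we generate new sentences that convey the content of the original text. This is in contrast to the extractive approach we saw earlier, where we used only the sentences that were already present; sentences produced by abstractive summarization might not appear in the original text at all.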

Some outputs from running the code:

data['Text'][:10]

image

data['Summary'][:10]

image

for i in range(5):
    print("Review:", data['cleaned_text'][i])
    print("Summary:", data['cleaned_summary'][i])
    print("\n")

image

length_df = pd.DataFrame({'text': text_word_count, 'summary': summary_word_count})
length_df.hist(bins=30)
plt.show()

image
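The histogram call above relies on `text_word_count` and `summary_word_count`, which are not shown in this README. One plausible way to build them, assuming each is a list of per-row word counts, is sketched below; the sample rows are hypothetical stand-ins for `data['cleaned_text']` and `data['cleaned_summary']`.

```python
# Hypothetical cleaned rows, standing in for data['cleaned_text'] and
# data['cleaned_summary']; the real columns come from the review dataset.
cleaned_text = [
    "great coffee with rich flavor and fast delivery",
    "tea arrived stale",
]
cleaned_summary = [
    "great coffee",
    "stale tea",
]

# Assumption: a "word" is any whitespace-separated token.
text_word_count = [len(text.split()) for text in cleaned_text]
summary_word_count = [len(summary.split()) for summary in cleaned_summary]

print(text_word_count)
print(summary_word_count)
```

Plotting these counts is a common way to pick maximum sequence lengths for padding, since it shows how many words most reviews and summaries contain.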

model.summary()

image

history = model.fit([x_tr, y_tr[:, :-1]],
                    y_tr.reshape(y_tr.shape[0], y_tr.shape[1], 1)[:, 1:],
                    epochs=2, callbacks=[es], batch_size=1000,
                    validation_data=([x_val, y_val[:, :-1]],
                                     y_val.reshape(y_val.shape[0], y_val.shape[1], 1)[:, 1:]))

image
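In the `model.fit` call above, the decoder input is `y_tr[:, :-1]` and the target is `y_tr` reshaped and sliced with `[:, 1:]`. In other words, the target is the input shifted one step to the left, so at each position the decoder sees the previous token and is trained to predict the next one (often called teacher forcing). The sketch below shows that shift on a small array; the token ids (1 for start, 2 for end, 0 for padding) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical padded summary sequences: 1 = start token, 2 = end token, 0 = padding.
y = np.array([
    [1, 5, 7, 2, 0],
    [1, 9, 2, 0, 0],
])

# Decoder input: every time step except the last one.
decoder_input = y[:, :-1]

# Decoder target: every time step except the first one, with a trailing axis
# of size 1, mirroring the reshape used in the fit call.
decoder_target = y.reshape(y.shape[0], y.shape[1], 1)[:, 1:]

print(decoder_input.shape)   # two sequences, four time steps
print(decoder_target.shape)  # same, plus the trailing axis
```

The trailing axis of size 1 matches the target shape that a sparse categorical cross-entropy loss expects when the model outputs one probability distribution per time step; whether this project uses that loss is an assumption, since the compile step is not shown here.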

pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()

image

for i in range(30, 100):
    print("Original Human-made Review:", seq2text(x_tr[i]))
    print("-------------Summary Below-------------")
    print("Predicted summary:", seq2summary(y_tr[i]))
    print("\n\n")

image

Thank You.

MIT Licensed.