Word counter for PDF files

A program that counts the words in PDF files, using word tokenization.

Note that this program does not handle scanned files (PDFs that contain only page images rather than text).

How to run

To run this program, follow the steps below:

The NLTK library and its stopwords and punkt data packages are required.

To download these data packages to your system, run the following code:

import nltk
nltk.download('stopwords')
nltk.download('punkt')

Set the filename variable to the name of your input file:

filename = 'Your_file.pdf'
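
Before running the counter, it can help to confirm that filename points to a real PDF. The sketch below is not part of the program. The helper name looks_like_pdf is hypothetical, and the only check it makes is that the file exists and starts with the standard "%PDF-" header bytes.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if path is an existing file that starts with the PDF header."""
    p = Path(path)
    if not p.is_file():
        return False
    # PDF files begin with the bytes "%PDF-" followed by a version number.
    with p.open("rb") as fh:
        return fh.read(5) == b"%PDF-"


filename = 'Your_file.pdf'  # same variable as in the README
if not looks_like_pdf(filename):
    print(f"{filename} is missing or is not a PDF file")
```

This check says nothing about whether the PDF is scanned. A scanned file passes it but still contains no text to count.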

To change the number of words shown in the output, modify the count_word variable:

count_word = 30
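
As a rough illustration of what count_word controls, the sketch below returns the most frequent words in a piece of text. It is an assumption about how the setting is used, not the program's actual code. The function top_words is hypothetical, and a simple regular expression stands in for NLTK's word tokenizer so the example runs without any downloads.

```python
import re
from collections import Counter

count_word = 30  # same variable as in the README


def top_words(text, n=count_word):
    """Return the n most frequent lowercase word tokens in text."""
    # Regex tokenizer used here as a stand-in for NLTK tokenization (assumption).
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return Counter(tokens).most_common(n)


sample = "the cat and the dog and the bird"
print(top_words(sample, 2))  # prints [('the', 3), ('and', 2)]
```

If the text has fewer distinct words than count_word, all of them are returned.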