flppgg/Keywords

Keywords

Algorithms for extracting keywords from titles of Scientific Articles

By combining the Natural Language Toolkit (NLTK) package, the Levenshtein algorithm and an ad-hoc algorithm, this script can:

  1. Given a list of scientific article titles, extract candidate keywords from them;
  2. Select the best keywords by looking at their relative frequency, and use them to create a thematic network of scientific publications.
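The two steps above can be illustrated with a minimal sketch. It is not the script from this repository: it uses only the Python standard library instead of NLTK, and the stopword set, the `max_distance` and `min_fraction` thresholds, and all function names are assumptions chosen for illustration. It also reads "thematic network" as keywords linked when they appear in the same title, which is only one possible interpretation.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical stopword set; the real script relies on NLTK instead.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "with", "by"}


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def candidate_keywords(title: str) -> list[str]:
    """Lowercase, keep alphabetic tokens, drop stopwords and very short words."""
    tokens = re.findall(r"[a-z]+", title.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def merge_variants(counts: Counter, max_distance: int = 1) -> dict[str, str]:
    """Map each keyword to a more frequent keyword within max_distance edits."""
    canonical: dict[str, str] = {}
    representatives: list[str] = []
    # Most frequent first; ties broken alphabetically so the result is stable.
    for word, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        match = next((r for r in representatives
                      if levenshtein(word, r) <= max_distance), None)
        if match is None:
            representatives.append(word)
            match = word
        canonical[word] = match
    return canonical


def build_network(titles: list[str], min_fraction: float = 0.5):
    """Return (selected keywords, co-occurrence edge counts)."""
    per_title = [set(candidate_keywords(t)) for t in titles]
    counts = Counter(w for kws in per_title for w in kws)
    canonical = merge_variants(counts)
    merged = [{canonical[w] for w in kws} for kws in per_title]
    # Relative frequency: fraction of titles that contain the keyword.
    doc_freq = Counter(w for kws in merged for w in kws)
    selected = {w for w, n in doc_freq.items() if n / len(titles) >= min_fraction}
    edges: Counter = Counter()
    for kws in merged:
        for pair in combinations(sorted(kws & selected), 2):
            edges[pair] += 1
    return selected, edges


titles = [
    "Graph neural networks for molecules",
    "Neural network pruning in graphs",
    "Bayesian inference on graphs",
]
selected, edges = build_network(titles)
```

Note that `merge_variants` compares every keyword against every representative, so this sketch is quadratic in the number of keywords and would not reach the scale mentioned below without further work such as bucketing keywords before comparing them.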

This was written to scale well up to tens of millions of article titles, and millions of keywords. A few optimizations to the algorithm will be added in the following weeks.

This is just a beta project. You can find a visualization of a graph constructed using this algorithm here. Thanks to Anvaka for the excellent visualization engine!
