Python code to scrape raw text from Wiley journals, particularly ESPL.

sgrieve/PaperScraper

Paper Scraper

Code to streamline grabbing raw full text from Wiley journals using the Crossref API.

The workflow takes a list of DOIs for papers in Wiley journals, fetches the metadata for each one, and downloads the full text as a PDF. Each PDF is then converted to plain text and written to a file, ready to be ingested by StanfordNLP for analysis.
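
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names are hypothetical, the metadata lookup assumes the public Crossref REST endpoint `https://api.crossref.org/works/<doi>` and its `link` field, and the PDF conversion assumes the external `pdftotext` tool (from Poppler) is installed. Any access token or headers the publisher requires for full-text download are left to the caller.

```python
import json
import subprocess
import urllib.parse
import urllib.request

# Assumption: the public Crossref REST API "works" endpoint.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi):
    """Build the Crossref metadata URL for a single DOI."""
    return CROSSREF_WORKS + urllib.parse.quote(doi, safe="/")


def fetch_metadata(doi):
    """Fetch the Crossref record for a DOI and return its 'message' object."""
    with urllib.request.urlopen(crossref_url(doi)) as response:
        return json.load(response)["message"]


def pick_pdf_link(metadata):
    """Return the first full-text link advertised as a PDF, or None."""
    for link in metadata.get("link", []):
        if link.get("content-type") == "application/pdf":
            return link.get("URL")
    return None


def download_pdf(url, pdf_path, headers=None):
    """Download a PDF; 'headers' carries whatever credentials are required."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response, open(pdf_path, "wb") as out:
        out.write(response.read())


def pdf_to_text(pdf_path, txt_path):
    """Convert a PDF to plain text (assumes the 'pdftotext' CLI is on PATH)."""
    subprocess.run(["pdftotext", "-layout", pdf_path, txt_path], check=True)


def scrape(doi, stem, headers=None):
    """Run the whole workflow for one DOI; returns the text path or None."""
    url = pick_pdf_link(fetch_metadata(doi))
    if url is None:
        return None  # no PDF link listed in the metadata
    download_pdf(url, stem + ".pdf", headers)
    pdf_to_text(stem + ".pdf", stem + ".txt")
    return stem + ".txt"
```

Calling `scrape("10.1002/esp.1234", "paper")` (a made-up DOI used only for illustration) would attempt to write `paper.pdf` and `paper.txt` to the working directory. Keeping the link selection in its own function makes it possible to check that step against a saved metadata record without touching the network.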

SWDG -- 30 August
