Time Magazine

Time Magazine Scraper, Text Extraction (OCR), and Data Exploration with Topic Modelling

01.ipynb: Code
Open in Colab to explore the topics (and their dominant terms) or run the code.

Part 1: Scraping the Time Vault, 1923-2015.
Scraped Data
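
As a rough illustration of the scraping step, the sketch below enumerates one URL per year and fetches pages with a delay between requests. The `BASE_URL` pattern, the `User-Agent` string, and the helper names are placeholders invented for this example; they are not the Time Vault's actual URL scheme or the notebook's code.

```python
import time
from urllib.request import Request, urlopen

# Hypothetical URL pattern: replace with the real archive layout.
BASE_URL = "https://example.com/vault/year/{year}"


def year_urls(start=1923, end=2015):
    """Map each year in [start, end] to the page that would be fetched."""
    return {year: BASE_URL.format(year=year) for year in range(start, end + 1)}


def fetch(url, delay=1.0, timeout=30):
    """Download one page, sleeping first so requests stay polite."""
    time.sleep(delay)
    request = Request(url, headers={"User-Agent": "time-scraper-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping URL generation separate from fetching makes it easy to resume a long crawl from a given year.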

Part 2: Text Extraction with Tesseract OCR.
Currently, text is extracted only for 2000-2015 because the process is slow.
The extracted text is also quite noisy.
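
A minimal sketch of the OCR step, assuming the `pytesseract` wrapper and Pillow are installed alongside the Tesseract binary. The `clean_ocr_text` helper and its `min_alpha_ratio` threshold are illustrative assumptions for reducing noise, not something taken from the notebook.

```python
import re


def ocr_image(path, lang="eng"):
    """Run Tesseract on one scanned page and return the raw text."""
    # Imported lazily so the cleanup helper works without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang=lang)


def clean_ocr_text(text, min_alpha_ratio=0.6):
    """Rejoin hyphenated line breaks and drop lines that are mostly symbols."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Share of characters that are letters or spaces (assumed heuristic).
        readable = sum(ch.isalpha() or ch.isspace() for ch in stripped)
        if readable / len(stripped) >= min_alpha_ratio:
            kept.append(stripped)
    return "\n".join(kept)
```

The threshold is a blunt filter: lowering it keeps more text (including tables and dates), raising it removes more debris.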

Part 3: Data Exploration with Topic Modelling.
TODO: Extend to all years and add interpretation of the topics.
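
The README does not say which topic model is used. The sketch below assumes Latent Dirichlet Allocation from scikit-learn as one common choice and shows how the dominant terms per topic could be listed. The function names and parameter values (`n_topics`, `max_df`, `min_df`) are illustrative.

```python
def top_terms(topic_weights, vocabulary, n=5):
    """Return the n highest-weighted terms for each topic row."""
    result = []
    for weights in topic_weights:
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        result.append([vocabulary[i] for i in ranked[:n]])
    return result


def fit_topics(documents, n_topics=10, n_terms=5):
    """Fit LDA on raw document strings and list dominant terms per topic."""
    # Imported lazily so top_terms can be used without scikit-learn.
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    # Assumed settings: drop English stop words and very rare/common terms.
    vectorizer = CountVectorizer(stop_words="english", max_df=0.95, min_df=2)
    counts = vectorizer.fit_transform(documents)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(counts)
    return top_terms(lda.components_, vectorizer.get_feature_names_out(), n_terms)
```

With noisy OCR input, a higher `min_df` may help, since misread words tend to appear in only one or two documents.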


The-Gupta/Time
