- Periodically check the news RSS feeds for new items.
- For each new item, store the article URL.
- Download the articles and store both the automatically extracted news story and the full HTML.
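
The first two steps above can be sketched roughly as follows. This is a minimal illustration and not the code in `collect.py`: the function names `extract_item_links` and `new_urls` are hypothetical, the feed is assumed to be RSS 2.0 with a `<link>` inside each `<item>`, and the set of already-seen URLs is kept in memory rather than in whatever storage the project actually uses.

```python
import xml.etree.ElementTree as ET


def extract_item_links(rss_xml: str) -> list[str]:
    """Return the <link> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


def new_urls(rss_xml: str, seen: set[str]) -> list[str]:
    """Return links not yet in `seen`, and record them in `seen`.

    `seen` stands in for persistent storage (an assumption for this sketch).
    """
    fresh = []
    for url in extract_item_links(rss_xml):
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh
```

Calling `new_urls` twice with the same feed content and the same `seen` set returns the article URLs the first time and an empty list the second time, which is the behavior a periodic check relies on.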
List the RSS feed URLs in "feeds.txt" inside the "data" folder of the repository, one feed per line.
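
For illustration, a file in this format could be read as shown below. This is only a sketch: `read_feeds` is a hypothetical helper, not a function from this repository, and skipping blank lines and trimming surrounding whitespace are assumptions about how the file is treated.

```python
from pathlib import Path


def read_feeds(path: str = "data/feeds.txt") -> list[str]:
    """Return the feed URLs listed in the file, one per line.

    Blank lines are skipped and surrounding whitespace is removed
    (assumed behavior for this sketch).
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```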
*/5 * * * * cd ~/news-article-collection/ && python3 collect.py
*/30 * * * * cd ~/news-article-collection/ && python3 process.py