HackerNews-Scraper

A simple web scraper that fetches news headlines from the Hacker News (HN) website at https://news.ycombinator.com

Parameters:

pages: Number of Hacker News pages to fetch. One file is created for each page. A maximum of 20 pages can be fetched for now.

verbose: Enable or disable verbose output with Y/N. If Y, progress is printed to the terminal as each page is fetched; otherwise, the program runs silently.
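The two parameters above could be validated along the following lines. This is only a minimal sketch, not the code in HackerNews.py: the function names (`parse_verbose`, `validate_pages`, `page_urls`) are hypothetical, and it assumes pages are addressed with the `?p=N` query string that Hacker News uses for pagination.

```python
from typing import List

MAX_PAGES = 20  # limit stated in this README


def parse_verbose(answer: str) -> bool:
    """Interpret a Y/N answer; anything other than Y/y is treated as 'no'."""
    return answer.strip().upper() == "Y"


def validate_pages(pages: int) -> int:
    """Return pages unchanged if it lies in 1..MAX_PAGES, else raise ValueError."""
    if not 1 <= pages <= MAX_PAGES:
        raise ValueError(f"pages must be between 1 and {MAX_PAGES}, got {pages}")
    return pages


def page_urls(pages: int) -> List[str]:
    """Build one URL per requested page (hypothetical helper)."""
    base = "https://news.ycombinator.com/news"
    return [f"{base}?p={n}" for n in range(1, validate_pages(pages) + 1)]
```

For example, `page_urls(2)` yields the URLs for pages 1 and 2, while `page_urls(21)` raises a `ValueError` because of the 20-page limit.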

First, install the dependencies for this scraper using the requirements.txt file:

pip install -r requirements.txt

To get your daily share of Hacker News headlines, clone the repository and run the HackerNews.py file:

git clone https://github.com/bharatr21/HackerNews-Scraper.git

Future Scope:

Add support for extracting a short snippet/preview of text from each article
Add multiprocessing support, enabled through an optional argument
Add support for fetching more than 20 pages
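As a rough illustration of the optional parallel fetching mentioned above, the sketch below fetches several pages concurrently. It is not part of HackerNews.py. It uses a thread pool from the standard library rather than separate processes, on the assumption that page fetching is mostly network-bound. The names `fetch_html`, `fetch_all`, `parallel`, and `workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_html(url: str) -> str:
    """Download one page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_html,
    parallel: bool = False,
    workers: int = 4,
) -> Dict[str, str]:
    """Fetch every URL, sequentially by default or concurrently if parallel=True."""
    urls = list(urls)
    if not parallel:
        return {url: fetch(url) for url in urls}
    # pool.map returns results in the same order as the input URLs
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(urls, pool.map(fetch, urls)))
```

Passing the fetch function as an argument keeps the parallel logic separate from the network code, so it can be exercised with a stand-in function without contacting the site.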

Contributions to this project are always welcome!

