Here’s a brief overview of how each module contributed to the project's success:
1️⃣ Beautiful Soup: This module made parsing HTML and XML documents a breeze. It allowed me to navigate the HTML structure of each web page and extract the specific data elements I needed.
2️⃣ lxml parser: Known for its speed and efficiency, lxml served as the parsing backend and was instrumental when dealing with large volumes of data. Its robustness ensured that my scraper could handle the intricacies of the website's markup.
3️⃣ Requests: This module handled the HTTP layer, fetching web pages from Student News Daily and managing the communication between my Python script and the web server. A short sketch combining all three modules follows this list.
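Taken together, the core pipeline is only a few lines. Here's a minimal sketch of how the three modules fit; the URL path and the CSS selector are my assumptions for illustration, since the site's actual markup may differ:

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical article-listing page; the real URL may differ.
URL = "https://www.studentnewsdaily.com/daily-news-article/"

# Identify the scraper politely and fetch the page with Requests.
headers = {"User-Agent": "news-scraper/0.1 (educational project)"}
response = requests.get(URL, headers=headers, timeout=10)
response.raise_for_status()  # fail loudly on HTTP errors

# Parse the HTML with Beautiful Soup, using lxml as the backend parser.
soup = BeautifulSoup(response.text, "lxml")

# The selector below is a guess at the markup; adjust to the real structure.
for link in soup.select("h2.entry-title a"):
    title = link.get_text(strip=True)
    href = link.get("href")
    print(f"{title} -> {href}")
```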
By combining these modules, I created a streamlined web scraping tool that not only gathers information efficiently but also respects the website's access rules and follows ethical data extraction practices.
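On the ethics side, a common pattern is to check robots.txt before crawling and to throttle requests. Here's a sketch using the standard library's urllib.robotparser; the user-agent string and delay value are arbitrary choices of mine, not from the original project:

```python
import time
from urllib.robotparser import RobotFileParser

import requests

USER_AGENT = "news-scraper/0.1 (educational project)"

# Read the site's robots.txt once and reuse the parsed rules.
robots = RobotFileParser("https://www.studentnewsdaily.com/robots.txt")
robots.read()

def polite_get(url, delay=2.0):
    """Fetch url only if robots.txt allows it, pausing between requests."""
    if not robots.can_fetch(USER_AGENT, url):
        return None  # respect the site's crawl rules
    time.sleep(delay)  # simple rate limit between requests
    return requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=10)
```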