Scraping information on public sessions of the German Bundesrat (Federal Council) using Python/Selenium

annerosenisser/bundesrat

bundesrat

This is a Python-based web scraper for collecting data on all public sessions of the German Bundesrat (Federal Council).

It scrapes this website, which lists the public sessions of the German Bundesrat in 2016 (as well as those for 2015 and 2014). I subsequently decided to collect all data for the period 2002-2016.

More specifically, it collects information on the unique identifier of the session ("Beratungsvorgang") and on the committee(s) involved in the decision-making.
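As a rough illustration of the two pieces of information described above, one scraped row could be modeled as follows. This is only a sketch: the class name `SessionRecord`, the field names, the `to_csv` helper, and the placeholder values are assumptions made for this example and are not taken from the project's code.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionRecord:
    """One scraped row (hypothetical structure, not the project's own)."""
    beratungsvorgang: str                                  # unique identifier of the session
    committees: List[str] = field(default_factory=list)    # committee(s) involved


def to_csv(records):
    """Serialize records to CSV text, joining committees with '; '."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["beratungsvorgang", "committees"])
    for record in records:
        writer.writerow([record.beratungsvorgang, "; ".join(record.committees)])
    return buffer.getvalue()


# Placeholder values only; real identifiers and committee names will differ.
example = SessionRecord("ID-001", ["Committee A", "Committee B"])
print(to_csv([example]))
```

Keeping the committees as a list until export makes it easy to handle sessions with one committee or several in the same way.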

I first thought of using BeautifulSoup for this project, but after doing some research and reading this Stack Overflow post, I realized that Selenium would be better suited to the dynamic content of the website. Furthermore, I combine Selenium with the Scrapy framework.
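To make the point about dynamic content concrete, here is a minimal sketch of how a browser driven by Selenium can return the HTML after the page's JavaScript has run, which a plain HTTP request plus BeautifulSoup would not see. The function names `default_driver` and `fetch_rendered_html`, the headless Chrome setup, and the polling interval are assumptions for illustration; the actual `scraper.py` may be organized quite differently.

```python
import time


def default_driver():
    """Create a headless Chrome driver (assumes Chrome and Selenium are installed)."""
    from selenium import webdriver  # imported lazily so the helper below stays testable

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(url, make_driver=default_driver, timeout=10.0):
    """Load `url` in a browser and return the HTML once the document has loaded."""
    driver = make_driver()
    try:
        driver.get(url)
        deadline = time.monotonic() + timeout
        # Poll the browser until it reports that the document has finished loading.
        while driver.execute_script("return document.readyState") != "complete":
            if time.monotonic() > deadline:
                raise TimeoutError("page did not finish loading: " + url)
            time.sleep(0.2)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors
```

One possible way to combine this with Scrapy is to wrap the returned string in `scrapy.Selector(text=html)` inside a spider callback and then query it with CSS or XPath expressions as usual. Pages that load their data after the initial document is complete may additionally need a wait on a specific element.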

To run the spider, run the following commands from the directory where this project is stored:

cd bundesrat/spiders
scrapy runspider scraper.py


I did this project as a favor to a colleague and as an opportunity to practice my web-scraping skills (and, unexpectedly, it allowed me to learn Selenium!).

The data should be of interest to scholars of political science and public administration.
