This is a Python-based web scraper for collecting data on all public sessions of the German Bundesrat (Federal Council).
It scrapes the Bundesrat website's pages with information on the public sessions held in 2016 (as well as those for 2015 and 2014); I subsequently decided to collect the data for the entire 2002-2016 period.
More specifically, it collects each session's unique identifier ("Beratungsvorgang") and the committee(s) involved in the decision-making.
I first thought of using BeautifulSoup for this project, but after doing some research and reading this Stack Overflow post, I realized that Selenium was better suited to the website's dynamically loaded content. I therefore combine Selenium with the Scrapy framework.
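As a rough sketch of how this combination can work (the URL, class name, and CSS selectors below are placeholders and assumptions, not the ones actually used in scraper.py):

```python
import scrapy
from selenium import webdriver
from selenium.webdriver.common.by import By


class BundesratSpider(scrapy.Spider):
    name = "bundesrat"
    # Placeholder: the real spider targets the Bundesrat pages listing
    # public sessions for the years 2002-2016.
    start_urls = ["https://www.bundesrat.de/"]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Selenium renders the JavaScript-generated content that a plain
        # Scrapy request would not see.
        self.driver = webdriver.Firefox()

    def parse(self, response):
        # Re-fetch the page with Selenium so the dynamic content loads.
        self.driver.get(response.url)
        # Hypothetical selectors; the real page structure differs.
        for row in self.driver.find_elements(By.CSS_SELECTOR, "div.session"):
            yield {
                "beratungsvorgang": row.find_element(By.CSS_SELECTOR, ".id").text,
                "committees": [
                    c.text for c in row.find_elements(By.CSS_SELECTOR, ".committee")
                ],
            }

    def closed(self, reason):
        # Shut the browser down when the spider finishes.
        self.driver.quit()
```

Driving the browser from inside the spider keeps Scrapy's item pipeline available while Selenium handles the rendering; a common alternative is a downloader middleware that hands rendered HTML back to Scrapy.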
To run the spider, run

```
cd bundesrat/spiders
scrapy runspider scraper.py
```

from the directory where this project is stored.
I did this project as a favor to a colleague and as an opportunity to practice my web-scraping skills (and, unexpectedly, it also let me learn Selenium!).
The data should be of interest to scholars of political science and public administration.