
kathabch/WomenInEP


WomenInEP

An NLP approach analyzing European Parliament (EP) debates.

This repository accompanies my Master's Thesis "Do Women Matter? A Natural Language Processing Approach Analyzing European Parliament Debates". It documents how the data was scraped from the web, transformed, and analyzed.

Data used: debates of the EP for the period from 1 July 2014 to 10 November 2022 (e.g. https://www.europarl.europa.eu/doceo/document/CRE-8-2014-07-01_EN.html)
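The example link above suggests a regular URL pattern: parliamentary term, sitting date, and language code. The following is a minimal sketch of building such a link, assuming other sitting days follow the same pattern as this one example. The helper name `debate_url` and its parameters are hypothetical and not part of the scraping repository; not every calendar date has a sitting, so a generated link may not resolve.

```python
from datetime import date

# Base path taken from the example link in this README.
BASE_URL = "https://www.europarl.europa.eu/doceo/document"


def debate_url(day: date, term: int, lang: str = "EN") -> str:
    """Build the link to the verbatim report (CRE) of one sitting day.

    Assumption: all sitting days follow the pattern of the single
    example link given in this README.
    """
    return f"{BASE_URL}/CRE-{term}-{day.isoformat()}_{lang}.html"


# Reproduces the example link for the first sitting day of the data set.
print(debate_url(date(2014, 7, 1), term=8))
```

The parliamentary term is passed in explicitly rather than derived from the date, so the sketch makes no claim about where one term ends and the next begins.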

Repository used for web scraping: https://github.com/chozelinek/europarl

The steps taken are visualized in the following flowchart (image: flowchart_13-03-23).

Step-by-step data set creation:

Notebooks for data analysis:

  • word frequencies - general vizs.ipynb: gives an overview of the data set
      - language detection of not translated interventions.ipynb: an attempt to translate the interventions that had not been translated before (spoiler: it didn't work)
  • topic modeling with spaCy.ipynb: topic modeling with spaCy to get an overview of the most important topics discussed in the data set
  • word frequencies - topics.ipynb: analyzes pre-defined topics
      - topic modeling with spaCy - Roe v Wade.ipynb: an attempt to find word patterns within the Roe v. Wade topic
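To illustrate the kind of analysis the word frequency notebooks describe, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks: the sample interventions, the stop word list, the topic keyword sets, and the function names are all hypothetical and serve only to show the idea of counting words overall and per pre-defined topic.

```python
import re
from collections import Counter

# Hypothetical sample data; the notebooks work on the scraped EP interventions.
interventions = [
    "Women's rights are human rights.",
    "The budget must protect women and children.",
    "We need a stronger budget for research.",
]

# Tiny illustrative stop word list, not the one used in the thesis.
STOP_WORDS = {"the", "a", "and", "are", "for", "we", "must", "need", "s"}

# Hypothetical pre-defined topics, each described by a set of keywords.
TOPICS = {"gender": {"women", "rights"}, "finance": {"budget"}}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def word_frequencies(texts):
    """Count how often each token occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def topic_counts(texts, topics):
    """Sum the frequencies of each topic's keywords."""
    freqs = word_frequencies(texts)
    return {topic: sum(freqs[w] for w in words) for topic, words in topics.items()}


freqs = word_frequencies(interventions)
print(freqs.most_common(3))
print(topic_counts(interventions, TOPICS))
```

Keyword matching like this is deliberately simple. A real analysis would likely lemmatize tokens (for example with spaCy) so that inflected forms are counted together.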
