Predicting Political Lean of Articles and Analyzing Reader Behaviors Across Austria

Project Overview

This project was carried out in partnership with the Austrian newspaper derStandard, so the data is not publicly available. It aimed to predict the political lean of newspaper articles provided by derStandard and to analyze the reading behavior of readers across different regions of Austria. To achieve these objectives, a combination of natural language processing (NLP) techniques, machine learning models, and data analysis methods was used.

Technologies Used

Pandas & NumPy: Used for data manipulation and analysis.
xml.etree.ElementTree: Utilized for parsing XML data.
deep_translator: Applied for translating content from German to English.
Hugging Face's Transformers: Employed for implementing the pre-trained model for political lean prediction.
Matplotlib: Used for data visualization to present the results and insights.

Methodology

Data Preprocessing

The dataset was filtered by article category to include only articles with possible political content.
Special HTML characters and tags were removed from the article texts.
The articles were then parsed from XML format, extracting relevant features such as text, title, and publishing date.
Each user was assigned their most frequently occurring location.
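
The parsing and cleaning steps above can be sketched roughly as follows. Since the derStandard data is not public, the XML layout here (an `articles` root with `article`, `title`, `text`, `published`, and `category` elements) and the category names are assumptions made for illustration, not the actual schema. The sketch uses only the standard library.

```python
import html
import re
import xml.etree.ElementTree as ET

TAG_RE = re.compile(r"<[^>]+>")


def clean_text(raw: str) -> str:
    """Decode HTML entities, drop tags, and collapse whitespace."""
    text = html.unescape(raw or "")
    text = TAG_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def parse_articles(xml_string: str) -> list[dict]:
    """Extract one dict per <article> element (hypothetical schema)."""
    root = ET.fromstring(xml_string)
    rows = []
    for art in root.iter("article"):
        rows.append({
            "title": clean_text(art.findtext("title", default="")),
            "text": clean_text(art.findtext("text", default="")),
            "published": art.findtext("published", default=""),
            "category": art.findtext("category", default=""),
        })
    return rows


def keep_political(rows: list[dict], categories: set[str]) -> list[dict]:
    """Keep only articles whose category may carry political content."""
    return [r for r in rows if r["category"] in categories]


# Made-up sample; the article body carries escaped HTML, as feeds often do.
SAMPLE = """<articles>
  <article>
    <title>Wahl &amp; Koalition</title>
    <text>&lt;p&gt;Die Regierung&amp;nbsp;tagt.&lt;/p&gt;</text>
    <published>2023-05-01</published>
    <category>Inland</category>
  </article>
  <article>
    <title>Cupfinale</title>
    <text>&lt;p&gt;Das Spiel endet 2:1.&lt;/p&gt;</text>
    <published>2023-05-02</published>
    <category>Sport</category>
  </article>
</articles>"""

rows = parse_articles(SAMPLE)
political = keep_political(rows, {"Inland"})
```

A regex-based tag stripper is a simplification: it would also remove literal text that happens to sit between `<` and `>`, so a real pipeline might prefer a proper HTML parser.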

This plot displays the locations from which users access articles on the derStandard website. Note: as expected, the largest cities (Vienna, Linz, Graz, etc.) account for the largest share of clicks.
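
As an illustration of the per-user location assignment and the click share per location, here is a minimal standard-library sketch. The `(user_id, location)` click records are hypothetical, and the tie-breaking rule (first location seen wins) is an assumption; the same logic could equally be written with a Pandas group-by.

```python
from collections import Counter, defaultdict


def assign_user_location(clicks):
    """Map each user to the location they were seen at most often.

    clicks: list of (user_id, location) pairs; missing locations are skipped.
    Ties go to the location encountered first (Counter.most_common ordering).
    """
    per_user = defaultdict(Counter)
    for user_id, location in clicks:
        if location:
            per_user[user_id][location] += 1
    return {user: counts.most_common(1)[0][0] for user, counts in per_user.items()}


def click_share(clicks, user_location):
    """Fraction of all clicks attributed to each assigned user location."""
    counts = Counter(user_location[u] for u, _ in clicks if u in user_location)
    total = sum(counts.values())
    return {loc: n / total for loc, n in counts.items()}


# Hypothetical click log
clicks = [
    ("u1", "Wien"), ("u1", "Wien"), ("u1", "Graz"),
    ("u2", "Linz"), ("u2", None),
    ("u3", "Graz"), ("u3", "Graz"),
]

user_location = assign_user_location(clicks)
share = click_share(clicks, user_location)
```

Note that every click of a user is counted toward that user's assigned location, including clicks recorded elsewhere or without a location.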

Political Lean Prediction

The political lean of each article was predicted using a pre-trained machine learning model (valurank/distilroberta-mbfc-bias).
Articles were translated from German to English with the Google Translate backend of deep_translator, because the model expects English text.
The model classified articles into various political leans: left, left-center, least-biased, right-center, right, and unknown.
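
The translate-then-classify step might look roughly like the sketch below. It is not the project's actual code: the helper names (`chunk_text`, `predict_lean`, `build_translator`, `build_classifier`) are hypothetical, the 4,500-character chunk size is an assumption chosen to stay under Google Translate's per-request limit, and returning "unknown" for empty text is likewise an assumption. The third-party imports are deferred into the builder functions so the core logic can be exercised with stand-in callables.

```python
MAX_CHARS = 4500  # assumption: keeps each request below the ~5,000-character limit


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def predict_lean(text_de: str, translate, classify) -> str:
    """Translate a German article chunk by chunk, then return the top label."""
    chunks = chunk_text(text_de)
    if not chunks:
        return "unknown"  # assumption: empty articles are labeled unknown
    text_en = " ".join(translate(chunk) for chunk in chunks)
    return classify(text_en)[0]["label"]


def build_translator():
    """German-to-English translator backed by deep_translator."""
    from deep_translator import GoogleTranslator
    return GoogleTranslator(source="de", target="en").translate


def build_classifier():
    """Text classifier using the pre-trained model named above."""
    from transformers import pipeline
    clf = pipeline("text-classification", model="valurank/distilroberta-mbfc-bias")
    # truncation=True cuts inputs that exceed the model's maximum length
    return lambda text: clf(text, truncation=True)
```

With network access and both libraries installed, usage would be along the lines of `predict_lean(article_text, build_translator(), build_classifier())`. The exact label strings depend on the model configuration and may need mapping to the class names listed above.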

Data Analysis

The political leanings of the articles were analyzed along with their distribution across different channels and categories.
Cross-tabulation was performed to understand the distribution of political leans across different types of content.
A comprehensive analysis was conducted to explore the reading behaviors of readers across Austria, focusing on their preferences for articles with different political leans.
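
A cross-tabulation of predicted lean by category could be computed with Pandas along these lines. The DataFrame contents and the column names `category` and `lean` are made up for illustration, since the real data is not available.

```python
import pandas as pd

# Fixed column order so that leans without any articles still appear
LEAN_ORDER = ["left", "left-center", "least-biased", "right-center", "right", "unknown"]

# Hypothetical articles with predicted leans
articles = pd.DataFrame({
    "category": ["Inland", "Inland", "Wirtschaft", "International", "Wirtschaft"],
    "lean": ["left", "least-biased", "right-center", "left-center", "least-biased"],
})

# Article counts per (category, lean) pair
counts = pd.crosstab(articles["category"], articles["lean"])
counts = counts.reindex(columns=LEAN_ORDER, fill_value=0)

# Row-normalized shares: how each category's articles split across leans
shares = counts.div(counts.sum(axis=1), axis=0)
```

Row-normalized shares make categories of very different sizes comparable, which raw counts do not.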

The following plot shows the predicted political lean by category. Note: blue indicates right-leaning and red indicates left-leaning content, following the color conventions of Austrian politics.

The next plot shows the political lean of articles read at different locations in Austria. The locations do not show a clear connection with political lean; this may be a result of how the location data is collected, of the translation to English affecting the predictions, or of other factors.

Results

The project successfully classified articles into different political leans, providing insights into the political orientation of content in derStandard.
The analysis of reading behaviors across Austria highlighted regional differences in political article preferences.
Visualization techniques were employed to present the distribution of political leans across various channels and categories, offering a clear overview of the political landscape in Austrian media.

If interested in the full results, please reach out!

bhuebner3/Political-Bias-Detection
