This repository contains the final data journalism piece and all the code behind the data science project World Media's International Coverage, which was submitted as part of the evaluation for the course Introduction to Data Science, taught by Simon Munzert at the Hertie School, Berlin, in Fall 2021. The project was developed by Sofia Lai, Andrew Wells and Marina Luna.
The final project is available as an HTML page: "Lai-Luna-Wells-final-project.html". Please download the file and open it in a browser to view it.
- "Archive scraping" contains scripts for scraping headlines as well as stored scraped headlines from the following news sources:
  - Daily Mail
  - Frankfurter Allgemeine Zeitung (script and list of URLs only)
  - The Hindu
  - The Times of India
  - LA Times
  - Le Monde
  - Der Spiegel
  - Süddeutsche Zeitung
  - The Times (script only)
- "Country lists" contains scripts for the lists of country names and some world capitals in English, French and German, which were used in the text cross-referencing.
- "Daily Scraping" contains scripts and stored headlines for the following news sources:
  - Dominion Post
  - Le Figaro
  - Fox News
  - The Guardian
  - The Irish Independent
  - The Irish Times
  - The New Zealand Herald
- "Graphs" contains the scripts and data for all the interactive graphs.
- "Map file" contains scripts for georeferencing the countries in the interactive map.
- "News scripts" contains scripts for counting the mentions in each newspaper.
- "News source df's" contains the consolidated data for each newspaper.
- "Shiny apps" contains two folders, one for each application. "Countries" contains the script and data for the application "Countries mentioned by media sources". "Frequency-of-mentions" contains the script and data for the application "Frequency of country mentioned by media sources".
- "data_set_full.csv" is the full data frame used for the analysis.
The material in this repository is made available under the MIT license.
The code for the web scraping and the graphs was developed mainly by Sofia Lai, who also provided significant support for transferring the application to RStudio's server.
The code for the text cross-referencing, the consolidation of the data frame and the interactive dashboards in the applications was developed mainly by Andrew Wells.
The written analysis was produced mainly by Marina Luna.