
Collaborative project exploring UK road accident and vehicle data to determine which factors contribute to accident risk. Used libraries such as seaborn and matplotlib to visualise and analyse data from two CSV files imported into Jupyter Notebook.


JessicaUppal/UK-road-safety-analysis

 
 


Which factors contribute to accident risk?


This is a team-based project that explored a traffic accident dataset.


Dataset

We used the "UK Road Safety: Traffic Accidents and Vehicles" dataset, a detailed dataset of road accidents and involved vehicles in the UK (2005-2017), available from Kaggle.com.

There are 2 CSV files in this dataset.

  • Accident_Information.csv
  • Vehicle_Information.csv

Both files should be placed in the Resources/ directory.

Both CSV files were merged into a single dataframe. The resulting data file was extremely large, so we decided to focus on the years 2010-2016. The filtered data was saved to a new CSV file, all.CSV, which is the main data used for all the investigation, analysis and plots.
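The merge-and-filter step described above could look roughly like the sketch below. It is not the code from the notebooks: the join key `Accident_Index` and the `Year` column are assumptions about the Kaggle files, and `build_filtered_dataset` is a hypothetical helper name.

```python
import pandas as pd


def build_filtered_dataset(accidents, vehicles, first_year=2010, last_year=2016):
    """Merge accident and vehicle rows, then keep only the chosen years.

    Assumes both frames share an "Accident_Index" column and that the
    accident frame has a numeric "Year" column (assumed column names).
    """
    merged = accidents.merge(
        vehicles,
        on="Accident_Index",
        how="inner",
        suffixes=("", "_vehicle"),  # keep the accident-side names unsuffixed
    )
    # Series.between is inclusive on both ends by default.
    in_range = merged["Year"].between(first_year, last_year)
    return merged.loc[in_range].reset_index(drop=True)


# Usage, assuming the two Kaggle files are already in Resources/:
# accidents = pd.read_csv("Resources/Accident_Information.csv", low_memory=False)
# vehicles = pd.read_csv("Resources/Vehicle_Information.csv", low_memory=False)
# build_filtered_dataset(accidents, vehicles).to_csv("Resources/all.CSV", index=False)
```

An inner join keeps only accidents that have at least one matching vehicle record, so one accident may appear on several rows (one per vehicle).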

Data Limitations
Note that the data was limited in scope. Despite some interesting findings, the plots extracted from the data, although "true", do not "tell the entire story".

Project Outline

We framed our client question as "Which factors contribute to accident risk?" and used it to formulate a set of hypotheses. The hypotheses guided which data we targeted, with the aim of turning the raw data into meaningful information.

  • Hypothesis 1: Greater volume of traffic increases the number of accidents
  • Hypothesis 2: Urban areas have a greater number of accidents than rural areas
  • Hypothesis 3: The time or day of the week does not affect the number of accidents
  • Hypothesis 4: Speed limits do not influence the number of accidents
  • Hypothesis 5: Biological factors like gender and age do not influence the number of accidents
  • Hypothesis 6: Location on the road or vehicle manoeuvre does not influence the number of accidents
  • Hypothesis 7: Poor weather conditions influence the number of accidents

For each hypothesis we created a number of visualisations that present the data in a format that is easier to analyse, which helped us understand the relevant information.
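As an illustration of the kind of visualisation used, here is a minimal sketch for Hypothesis 3 (day of the week). The column name `Day_of_Week`, its values being day names, the output file name, and both function names are assumptions for this example and are not taken from the notebooks.

```python
import pandas as pd

DAY_ORDER = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def accidents_by_day(df, day_col="Day_of_Week"):
    """Count rows per weekday, returned in calendar order (0 if a day is absent)."""
    counts = df[day_col].value_counts()
    return counts.reindex(DAY_ORDER, fill_value=0)


def plot_accidents_by_day(counts, out_path="Images/accidents_by_day.png"):
    """Draw the counts as a seaborn bar chart and save it to out_path."""
    # Imported here so the counting helper works without the plotting stack.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(8, 4))
    sns.barplot(x=counts.index.tolist(), y=counts.values, ax=ax)
    ax.set_xlabel("Day of week")
    ax.set_ylabel("Number of accidents")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

In a notebook this would be called as `plot_accidents_by_day(accidents_by_day(df))` once `df` has been loaded from all.CSV.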

Example Plots

Here are 2 example plots we created from the data.
The plots can be found in the /Images folder after running the code in the Notebook files that are in the root directory.

seaborn_chart

Findings Reports and Presentation

The findings of this project can be found in the /Presentation directory.

There are 3 files:

  • 01_Project_scope_notes.pdf
  • 02_Presentation.pdf
  • 02_Traffic Accidents Report.pdf

Dependencies and Setup Required

In order to run the files you will need to install the following packages.

  • gmaps: pip install gmaps
  • pandas: pip install pandas
  • seaborn: pip install seaborn
  • matplotlib: pip install matplotlib
  • scipy: pip install scipy
  • jupyter notebook: pip install notebook
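Before opening the notebooks, a quick check like the one below can confirm that the packages listed above are importable. This is a convenience sketch, not part of the repository; `missing_packages` is a hypothetical helper, and the import names are assumed to match the pip package names.

```python
import importlib.util

# Import names for the packages listed above (assumed to match the pip names).
REQUIRED = ("gmaps", "pandas", "seaborn", "matplotlib", "scipy", "notebook")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```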

Other Required Files:

Add the following 2 files to your local cloned repository.

The all.CSV file must be placed in the "/Resources" directory.

Gmaps API Key requirement

For gmaps you will also need an API key from the Google Maps Platform. Please visit the Google Maps Platform to set up an API key if you do not already have one.

  1. File 2: config.py - download this file.

  2. Open the file in a text editor or VS Code and change "YOUR API KEY HERE" to your API key from the Google Maps API.

  3. Store the config.py file in the root folder of your local repository.
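The steps above can be sketched in code as follows. The variable name `gkey` inside config.py is an assumption (the README does not state it), and `load_api_key` is a hypothetical helper that refuses to return the unedited placeholder.

```python
import importlib

PLACEHOLDER = "YOUR API KEY HERE"


def load_api_key(module_name="config", attr="gkey"):
    """Read the API key from config.py; 'gkey' is an assumed variable name."""
    module = importlib.import_module(module_name)
    key = getattr(module, attr)
    if not key or key == PLACEHOLDER:
        raise ValueError(
            f"Replace the placeholder in {module_name}.py with your API key."
        )
    return key
```

In a notebook, the key would then be passed to gmaps with `gmaps.configure(api_key=load_api_key())`.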

How to View / Run the Code


The work was completed primarily using Jupyter Notebooks and the modules listed in the Dependencies section.

  1. Clone the repository

  2. Complete steps in the Dependencies and Setup Required section above.

  3. Open any of the Jupyter Notebook files (.ipynb) in the root directory and run the cells in order.

The Jupyter notebook files have comments in the code and Markdown cells beneath each step explaining what was done in the cell above.

For a short description of what each notebook contains, please see the Jupyter Notebooks File Guide section below.

Jupyter Notebooks File Guide

  • 01_data_retrieval_step_1.ipynb - Initial data processing and filtering
  • 01_data_retrieval_step_2.ipynb - Initial data processing and filtering
  • 01_data_retrieval_step_3.ipynb - Initial data processing and filtering
  • 02_traffic_vol_vs_accidents.ipynb - Volume of Traffic vs Number of Accidents
  • 03_When_accidents_happen.ipynb - Days of the week, Time of day, Gender, Age, vs Accidents
  • 04_Where - Heatmaps.ipynb - Google heatmaps of accidents across the UK and accidents in Birmingham
  • 05_RoadSafety.ipynb - Number of Casualties vs Speed limit and Number of Casualties vs Time of day
  • 06_Speed Limit Project - FINAL.ipynb - Number of Accidents vs Speed Limit and Number of Accidents vs Vehicle Manoeuvre
  • 07_Accidents By Road Class and Road Type - Number of Accidents by Severity for Road Class and Road Type
  • 08_weather.ipynb - Number of Accidents vs Weather Condition
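For a comparison such as traffic volume against number of accidents, one plausible use of scipy (listed in the dependencies) is a simple linear regression. The sketch below is an assumption about how such an analysis could be done, not a description of what the notebook does; `fit_volume_vs_accidents` is a hypothetical helper.

```python
from scipy import stats


def fit_volume_vs_accidents(traffic_volume, accident_counts):
    """Fit accident_counts = slope * traffic_volume + intercept by least squares."""
    result = stats.linregress(traffic_volume, accident_counts)
    return {
        "slope": result.slope,
        "intercept": result.intercept,
        "r_squared": result.rvalue ** 2,  # share of variance explained
        "p_value": result.pvalue,         # test of slope == 0
    }
```

A high `r_squared` only indicates a linear association in the sample; it does not by itself show that traffic volume causes accidents.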

Repository Structure

  • Notebook code files in the root directory: root/
  • Presentation and report files in the Presentation directory: Presentation/
  • Images and plots in the Images directory: Images/
  • Dataset files in the Resources directory: Resources/
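To confirm a local clone matches this layout and includes the two manually added files, a small check such as the following can be run from the repository root. It is a sketch only; `missing_paths` is a hypothetical helper and the path list simply restates what this README describes.

```python
from pathlib import Path

# Paths described in this README, relative to the repository root.
EXPECTED = ("Resources/all.CSV", "config.py", "Images", "Presentation")


def missing_paths(repo_root=".", expected=EXPECTED):
    """Return the expected files or directories that do not exist yet."""
    root = Path(repo_root)
    return [path for path in expected if not (root / path).exists()]


if __name__ == "__main__":
    for path in missing_paths():
        print("Missing:", path)
```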

Credits / Collaborators / Team


Languages

  • Jupyter Notebook 100.0%