RILUCK/ETA-of-Flight

Airline-ETA-prediction-in-Python

Airline ETA Delay Prediction - Expected Time of Arrival with Flight Data

A binary classification model was developed with Random Forest to predict arrival delays without using departure delay as an input feature.
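As a rough illustration of this setup, the sketch below builds a binary delay label and trains a Random Forest on features that exclude departure delay. It runs on synthetic data, not the BTS dataset. The column names (`DEP_HOUR`, `DISTANCE`, `DAY_OF_WEEK`, `DEP_DELAY`, `ARR_DELAY`) and the 15-minute delay threshold are assumptions made for the example and may differ from what the notebook uses.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the flight data; all column names are assumed.
rng = np.random.default_rng(0)
n = 500
flights = pd.DataFrame({
    "DEP_HOUR": rng.integers(0, 24, n),
    "DISTANCE": rng.integers(100, 3000, n),
    "DAY_OF_WEEK": rng.integers(1, 8, n),
    "DEP_DELAY": rng.normal(10, 30, n),
    "ARR_DELAY": rng.normal(8, 35, n),
})

# Binary target: 1 if the flight arrives at least 15 minutes late
# (the threshold is an assumption, not taken from the notebook).
y = (flights["ARR_DELAY"] >= 15).astype(int)

# Drop departure delay (excluded by design) and the column the label came from.
X = flights.drop(columns=["DEP_DELAY", "ARR_DELAY"])

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
pred = model.predict(X)
```

Dropping `DEP_DELAY` matters because departure delay is typically a strong signal for arrival delay, so leaving it in would change the problem being solved.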

Models were developed using both the raw data and PCA-transformed data. The PCA-transformed data gave a marginal improvement in performance.
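One way to set up such a comparison is sketched below, using scikit-learn pipelines on synthetic data. The number of components, the scaling step, and the train/test split are assumptions for illustration. The accuracies it prints say nothing about the results reported here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the flight features.
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Model on the raw features.
raw_model = RandomForestClassifier(n_estimators=50, random_state=0)
raw_model.fit(X_train, y_train)
raw_acc = raw_model.score(X_test, y_test)

# Model on PCA-transformed features; scaling first so that no single
# feature dominates the principal components (an assumed preprocessing step).
pca_model = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=5)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
])
pca_model.fit(X_train, y_train)
pca_acc = pca_model.score(X_test, y_test)

print(f"raw accuracy: {raw_acc:.3f}, PCA accuracy: {pca_acc:.3f}")
```

Putting PCA inside the pipeline means the transform is fitted on the training split only, which avoids leaking test data into the components.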

Grid search was used to find the best model parameters.
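A minimal grid search sketch with scikit-learn's `GridSearchCV` is shown below. The parameter grid, the scoring metric, and the synthetic data are all assumptions for the example. The grid actually searched in the notebook is not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the flight features.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)

# Hypothetical grid; every combination is fitted and scored by cross-validation.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [4, 8],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # kept small so the example runs quickly
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))
```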

10-fold cross-validation was used to evaluate model performance.
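The evaluation step can be sketched as follows, assuming stratified folds and accuracy as the metric. Neither choice is confirmed by this README, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic data standing in for the flight features.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)

# Ten folds; stratification keeps the delayed/on-time ratio similar in each fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

clf = RandomForestClassifier(n_estimators=25, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

print(f"mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

Reporting the standard deviation alongside the mean gives a sense of how much the score varies from fold to fold.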

A Python Jupyter Notebook is available in the repository and can be run after updating the file paths in the Data Set Up section.

The dataset can be downloaded from the US Department of Transportation: https://transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time

The data range for the above tests includes Jan-Apr 2017, May-Jul 2016 and Nov-Dec 2016.

Author: Rishabh Shrivas

Updated: November 2019