An advanced data science project aimed at analyzing SpaceX launch data, identifying key insights, and applying machine learning to predict launch success outcomes.
- Introduction
- Project Structure
- EDA and Insights
- Predictive Analysis
- Model Performance
- Conclusion
- Technologies
The commercialization of space travel has revolutionized space exploration. SpaceX is at the forefront of this revolution, consistently launching reusable rockets, reducing launch costs, and increasing the success rate of missions. This project focuses on analyzing SpaceX Falcon 9 launch data and predicting launch outcomes using various machine learning models.
In this project, we:
- Investigate how launch site, payload mass, and booster version relate to success rates.
- Apply predictive models like SVM, Logistic Regression, and Decision Trees for launch outcome predictions.
- Tune hyperparameters for optimized performance.
├── modules/ # Jupyter notebooks for EDA, modeling, and visualizations
├── reports/ # Generated reports and project documentation
├── images/ # Images and figures used in the report and README
└── README.md # Project overview and documentation
- Highest Success Rate Launch Site: KSC-LC-39A with a 76.9% success rate.
- Optimal Payload Range for Success: payloads of 2,000 kg to 6,000 kg have the highest success rate.
- Lowest Success Rate Payload: the 7,000 kg to 10,000 kg range has a lower success rate.
- Most Reliable Booster Version: the FT version has demonstrated the highest reliability.
EDA was performed using Pandas, Matplotlib, and Seaborn to visualize these insights. Additionally, we created interactive dashboards using Plotly and Dash to dynamically explore the data.
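As a rough illustration of how success rates like those above could be computed with Pandas, here is a minimal sketch. The column names (`LaunchSite`, `PayloadMass`, `Class`), the payload bands, and the rows themselves are assumptions made up for this example; they are not the project's dataset or its actual notebook code.

```python
import pandas as pd

# Illustrative rows only, NOT the project's data. Column names are assumed:
# Class is 1 for a successful launch outcome and 0 otherwise.
launches = pd.DataFrame({
    "LaunchSite": ["KSC-LC-39A", "KSC-LC-39A", "Site B", "Site B", "Site C", "Site C"],
    "PayloadMass": [3500, 5200, 8000, 2500, 9600, 4000],  # kg
    "Class": [1, 1, 0, 1, 0, 1],
})


def success_rate_by(df, column):
    """Return the mean of Class (the success rate) per group, highest first."""
    return df.groupby(column, observed=True)["Class"].mean().sort_values(ascending=False)


# Success rate per launch site.
site_rates = success_rate_by(launches, "LaunchSite")

# Success rate per payload band; the bin edges here are hypothetical.
bins = [0, 2000, 6000, 10000]
labels = ["0-2000", "2000-6000", "6000-10000"]
launches["PayloadBand"] = pd.cut(launches["PayloadMass"], bins=bins, labels=labels)
band_rates = success_rate_by(launches, "PayloadBand")

print(site_rates)
print(band_rates)
```

Because `Class` is coded as 0 or 1, its mean per group is the fraction of successful launches, which makes a single `groupby` enough for each comparison.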
We explored different classification models to predict launch outcomes. Our process included:
- Data Preprocessing:
  - Standardization of features (payload mass, booster version, etc.).
  - Train-test split (80%-20%) to validate model performance.
- Model Development:
  - Support Vector Machines (SVM)
  - Decision Trees
  - Logistic Regression
  - K Nearest Neighbors (KNN)
- Hyperparameter Tuning: We performed a grid search over key parameters for each model to ensure optimal performance.
- Evaluation: Models were evaluated using accuracy on the train set. Cross-validation was employed to mitigate overfitting.
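The steps above can be sketched with Scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not the project's notebook code: the data is synthetic (`make_classification`), and the parameter grids, fold count, and random seeds are placeholders chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the launch features and the success label (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# 80%-20% train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# One (estimator, parameter grid) pair per model; the grids are hypothetical.
candidates = {
    "SVM": (SVC(), {"model__C": [0.1, 1, 10], "model__kernel": ["linear", "rbf"]}),
    "Decision Tree": (DecisionTreeClassifier(random_state=0), {"model__max_depth": [2, 4, 8]}),
    "Logistic Regression": (LogisticRegression(max_iter=1000), {"model__C": [0.1, 1, 10]}),
    "KNN": (KNeighborsClassifier(), {"model__n_neighbors": [3, 5, 9]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Standardize inside the pipeline so the scaler is refit on each CV fold.
    pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "cv_accuracy": search.best_score_,                 # mean accuracy across CV folds
        "test_accuracy": search.score(X_test, y_test),     # accuracy on the held-out 20%
    }

for name, scores in results.items():
    print(f"{name}: cv={scores['cv_accuracy']:.3f}, test={scores['test_accuracy']:.3f}")
```

Placing `StandardScaler` inside the `Pipeline` keeps the scaling statistics from leaking across cross-validation folds, and reporting the held-out test accuracy next to the cross-validated score gives a second check on overfitting.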
Model | Accuracy |
---|---|
Support Vector Machine (SVM) | 0.848214 |
Decision Tree | 0.876786 |
Logistic Regression | 0.846429 |
K Nearest Neighbors (KNN) | 0.848214 |
Best Performing Model: Decision Tree, with the highest accuracy of the four models (0.876786).
This project demonstrates how exploratory data analysis and machine learning techniques can be used to extract insights from launch data and predict outcomes. Using features such as launch site, payload mass, and booster version, the models predicted the success of SpaceX launches with accuracies between roughly 0.85 and 0.88, with the Decision Tree performing best.
- Python: Core programming language used for data analysis and modeling.
- Pandas, NumPy: Libraries for data manipulation and preprocessing.
- Matplotlib, Seaborn, Plotly: Visualization libraries for EDA and interactive dashboards.
- Scikit-learn: Used for implementing machine learning algorithms.
- Dash: Web application framework for building interactive visualizations.
- Jupyter Notebooks: For developing and documenting the analysis.
Before diving into the labs, it's highly recommended to review previous concepts and practice them locally on your machine. This not only enhances understanding but also helps reduce the usage of cloud trial versions.
- Plotly - A versatile and powerful library for creating interactive visualizations.
- Folium - Visualize geographical data with ease through interactive maps.
- Folium Examples - Explore practical examples of Folium in action for inspiration and guidance.
- Confusion Matrix - Master the basics of classification evaluation with this simple guide to confusion matrices.
For more information or collaboration, feel free to reach out:
© 2024 Angela Paola. All Rights Reserved.
Click the image to view the full presentation on Canva.