Analysis of the NOAA reef bleaching dataset to examine coral reef bleaching around the world.
NOAA (National Oceanic and Atmospheric Administration) studies and predicts changes in climate, weather, oceans, and coasts. It shares this knowledge to help conserve and manage coastal and marine ecosystems and resources.
Contents
🤔
Coral reefs are an integral part of the underwater ecosystem. They protect coastal areas and provide a source of income to millions of people. Over the past few decades, however, they have been affected by industrialization and other human-induced factors. This has resulted in coral reef bleaching in various oceans, which in turn reduces the reefs' growth rates and makes them susceptible to disease.
🎯
The aim of this coral reef bleaching analysis is to identify the main factors that affect reef bleaching in different areas, and to understand how strongly each of these factors contributes to bleaching in different oceans.
📊
The dataset contains the following attributes.
- Bleaching
- Ocean
- Year
- Depth
- Storms
- Human Impact
- Siltation
- Dynamite
- Poison
- Sewage
- Industrial
- Commercial
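As a minimal sketch of how these attributes might be loaded and checked with pandas: the file name, exact column spellings, and value encodings are not given in this README, so the two sample rows below are made-up placeholders and `load_reef_data` is a hypothetical helper, not part of the project.

```python
import io

import pandas as pd

# Attribute names as listed in this README; the real file may spell them differently.
EXPECTED_COLUMNS = [
    "Bleaching", "Ocean", "Year", "Depth", "Storms", "Human Impact",
    "Siltation", "Dynamite", "Poison", "Sewage", "Industrial", "Commercial",
]

# Made-up placeholder rows, used only so the sketch runs without the real file.
SAMPLE_CSV = """Bleaching,Ocean,Year,Depth,Storms,Human Impact,Siltation,Dynamite,Poison,Sewage,Industrial,Commercial
Yes,Atlantic,2005,4.0,yes,high,often,low,low,high,low,low
No,Pacific,2004,6.0,no,low,never,low,low,low,low,low
"""


def load_reef_data(source):
    """Read the dataset and fail early if an expected attribute is missing."""
    df = pd.read_csv(source)
    missing = [name for name in EXPECTED_COLUMNS if name not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Pass a path to the real CSV here instead of the in-memory sample.
df = load_reef_data(io.StringIO(SAMPLE_CSV))
print(df.dtypes)
```

Checking the column list up front makes a renamed or missing attribute fail loudly before any analysis runs.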
🎞️
The dataset will be analyzed in the following manner.
📚
- numpy adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
- matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits.
- scipy is used for scientific and technical computing. It contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks.
- pandas is for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.
- sklearn, or scikit-learn, features various classification, regression and clustering algorithms and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
- seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
👀
The data has been analyzed using the following 3 methods:
🤖
Given a set of factors that may cause reef bleaching, the model predicts whether or not the coral reef has actually been bleached.
The model has been trained using the logistic regression algorithm.
The target variable is Bleaching.
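A rough sketch of what training a logistic regression classifier with Bleaching as the target could look like in scikit-learn. The data here is synthetic and generated on the spot; the choice of three features, their numeric encodings, the coefficients used to simulate labels, and the 75/25 split are all assumptions for illustration, not the project's actual preprocessing.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): three numerically encoded factors.
rng = np.random.default_rng(0)
n_rows = 400
features = pd.DataFrame({
    "Depth": rng.uniform(1.0, 20.0, n_rows),
    "Storms": rng.integers(0, 2, n_rows),   # 0 = no, 1 = yes
    "Sewage": rng.integers(0, 4, n_rows),   # 0 = none ... 3 = high
})

# Simulate the binary target from an arbitrary, made-up relationship.
log_odds = 0.8 * features["Sewage"] + 1.0 * features["Storms"] - 0.15 * features["Depth"]
probability = 1.0 / (1.0 + np.exp(-log_odds))
bleaching = pd.Series((rng.random(n_rows) < probability).astype(int), name="Bleaching")

# Hold out a quarter of the rows, keeping the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    features, bleaching, test_size=0.25, stratify=bleaching, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"hold-out accuracy on synthetic data: {accuracy:.3f}")
print(dict(zip(features.columns, model.coef_[0].round(3))))
```

The fitted coefficients give one way to compare how strongly each factor pushes the prediction toward bleaching, provided the features are on comparable scales.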
💯
The accuracy of the output prediction is greater than 96%.
The model's accuracy is evaluated using K-fold cross-validation.
The evaluation metrics include RMSE (Root Mean Square Error) and the R² score.
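The evaluation described above can be sketched with scikit-learn's `KFold` and `cross_val_predict`. This is an illustration on synthetic data, not the project's evaluation script: the number of folds (5), the shuffling, and the simulated features are assumptions, and the printed numbers say nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in data (assumption): columns are depth, storms, sewage.
rng = np.random.default_rng(0)
n_rows = 400
X = np.column_stack([
    rng.uniform(1.0, 20.0, n_rows),
    rng.integers(0, 2, n_rows),
    rng.integers(0, 4, n_rows),
])
log_odds = 0.8 * X[:, 2] + 1.0 * X[:, 1] - 0.15 * X[:, 0]
y = (rng.random(n_rows) < 1.0 / (1.0 + np.exp(-log_odds))).astype(int)

# Each row is predicted by a model that never saw it during training.
folds = KFold(n_splits=5, shuffle=True, random_state=0)
predictions = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=folds)

accuracy = accuracy_score(y, predictions)
rmse = float(np.sqrt(mean_squared_error(y, predictions)))
r2 = r2_score(y, predictions)

print(f"accuracy: {accuracy:.3f}")
print(f"RMSE:     {rmse:.3f}")
print(f"R2:       {r2:.3f}")
```

With 0/1 labels and 0/1 predictions, each squared error is 1 for a misclassified row and 0 otherwise, so RMSE here equals the square root of the misclassification rate.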
🔖
The following links were used to style this README -
👋🏻 Hi! Thanks for stopping by. Give it a ⭐ if you like it!