Predict next-day rain by training classification models on the target variable RainTomorrow.
The Rain in Australia dataset contains about 10 years of daily weather observations from numerous Australian weather stations.
RainTomorrow is the target variable to predict. It records whether it rained the following day: Yes or No.
Observations were drawn from numerous weather stations. The daily observations are available from
An example of the latest weather observations in Canberra: and
To get the best predictive accuracy, we trained and compared the following supervised machine learning models:
- Logistic Regression
- Decision Tree
- Random Forest

The project covers the following steps:
- Exploratory data analysis and visualization
- Splitting a dataset into training, validation & test sets
- Filling/imputing missing values in numeric columns
- Scaling numeric features to a (0,1) range
- Encoding categorical columns as one-hot vectors
- Training a logistic regression model using Scikit-learn
- Evaluating a model using a validation set and test set
- Training and interpreting decision trees
- Training and interpreting random forests
- Overfitting & hyperparameter tuning
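
The steps above can be sketched end to end with Scikit-learn. This is a minimal illustration, not the project's actual code: it runs on a small synthetic table rather than the real dataset, and the feature columns (`Humidity3pm`, `Pressure3pm`, `RainToday`), the 60/20/20 split, the mean imputation strategy, and the hyperparameter values are all assumptions chosen for the example. Only the `RainTomorrow` target and the three model types come from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption): two numeric columns,
# one categorical column, and the Yes/No target RainTomorrow.
rng = np.random.default_rng(42)
n = 600
df = pd.DataFrame({
    "Humidity3pm": rng.uniform(10, 100, n),
    "Pressure3pm": rng.normal(1015, 7, n),
    "RainToday": rng.choice(["Yes", "No"], n, p=[0.25, 0.75]),
})
# Make rain more likely on humid days so the models have a signal to learn.
noisy_humidity = df["Humidity3pm"] + rng.normal(0, 15, n)
df["RainTomorrow"] = np.where(noisy_humidity > 70, "Yes", "No")
# Blank out some numeric values to exercise the imputation step.
df.loc[rng.choice(n, 30, replace=False), "Humidity3pm"] = np.nan

numeric_cols = ["Humidity3pm", "Pressure3pm"]
categorical_cols = ["RainToday"]
X, y = df[numeric_cols + categorical_cols], df["RainTomorrow"]

# Split into training, validation and test sets (roughly 60/20/20).
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=42)


def make_preprocessor():
    """Impute and scale numeric columns to (0, 1); one-hot encode categoricals."""
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", MinMaxScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])


# max_depth caps tree growth, a simple guard against overfitting;
# the values here are illustrative, not tuned.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "random_forest": RandomForestClassifier(
        n_estimators=100, max_depth=8, random_state=42),
}

val_scores, test_scores = {}, {}
for name, model in models.items():
    # The preprocessor is fitted on the training set only, inside the pipeline.
    clf = Pipeline([("prep", make_preprocessor()), ("model", model)])
    clf.fit(X_train, y_train)
    val_scores[name] = clf.score(X_val, y_val)      # accuracy on validation set
    test_scores[name] = clf.score(X_test, y_test)   # accuracy on test set
    print(f"{name}: val={val_scores[name]:.3f} test={test_scores[name]:.3f}")
```

Wrapping the preprocessing in a `Pipeline` keeps the imputer, scaler and encoder fitted on training data only, so validation and test scores are not inflated by leakage. In practice the validation score would guide hyperparameter choices such as `max_depth`, with the test set held back for a final check.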