RanjeetKumbhar01/TE_IT_ML_ASSIGNMENTS_SPPU

TE IT ML ASSIGNMENTS SPPU-2019 PATTERN

Savitribai Phule Pune University, Pune | Third Year Information Technology (2019 Course) | 314448 : Laboratory Practice-I (Machine Learning)

This repository contains a collection of machine learning assignments for the Third Year Information Technology (2019 Course) at Savitribai Phule Pune University, Pune.

Assignment 1 📊 - Exploratory Data Analysis and Metrics

Kaggle: ML Assignment 1 TE IT SPPU

Dataset: Heart Dataset

Perform the following operations on the given dataset:

  A. Find Shape of Data 📏
  B. Find Missing Values ❓
  C. Find data type of each column 📋
  D. Find zeros 0️⃣
  E. Find Mean age of patients 🧑‍⚕️
  F. Now extract only Age, Sex, ChestPain, RestBP, Chol. Randomly divide the dataset into training (75%) and testing (25%) sets. 🔄
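Steps A to F above can be sketched without any third-party library, as below. This is only an illustration under stated assumptions: the rows are made-up stand-ins for the Heart dataset, the `Fbs` column and the helper names (`count_missing`, `count_zeros`, `train_test_split`) are hypothetical, and the split uses a seeded shuffle. scikit-learn provides a `train_test_split` function for the same purpose.

```python
import random
from statistics import mean

# Hypothetical rows standing in for the Heart dataset (illustrative values only).
rows = [
    {"Age": 63, "Sex": 1, "ChestPain": "typical", "RestBP": 145, "Chol": 233, "Fbs": 1},
    {"Age": 67, "Sex": 1, "ChestPain": "asymptomatic", "RestBP": 160, "Chol": 286, "Fbs": 0},
    {"Age": 67, "Sex": 1, "ChestPain": "asymptomatic", "RestBP": 120, "Chol": 229, "Fbs": 0},
    {"Age": 37, "Sex": 1, "ChestPain": "nonanginal", "RestBP": 130, "Chol": 250, "Fbs": 0},
    {"Age": 41, "Sex": 0, "ChestPain": "nontypical", "RestBP": 130, "Chol": 204, "Fbs": 0},
    {"Age": 56, "Sex": 1, "ChestPain": "nontypical", "RestBP": 120, "Chol": None, "Fbs": 0},
    {"Age": 62, "Sex": 0, "ChestPain": "asymptomatic", "RestBP": 140, "Chol": 268, "Fbs": 0},
    {"Age": 57, "Sex": 0, "ChestPain": "asymptomatic", "RestBP": 120, "Chol": 354, "Fbs": 0},
]
COLUMNS = ["Age", "Sex", "ChestPain", "RestBP", "Chol"]


def count_missing(rows):
    """Count missing (None) entries per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}


def count_zeros(rows):
    """Count entries equal to zero per column."""
    return {col: sum(r[col] == 0 for r in rows) for col in rows[0]}


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


shape = (len(rows), len(rows[0]))                                  # A. shape
missing = count_missing(rows)                                      # B. missing values
dtypes = {col: type(v).__name__ for col, v in rows[0].items()}     # C. data types
zeros = count_zeros(rows)                                          # D. zeros
mean_age = mean(r["Age"] for r in rows if r["Age"] is not None)    # E. mean age
selected = [{col: r[col] for col in COLUMNS} for r in rows]        # F. column subset
train_rows, test_rows = train_test_split(selected)                 # F. 75/25 split
```

Fixing the seed keeps the split reproducible between runs, which makes later model comparisons easier to interpret.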

Through the diagnosis test, I predicted 100 reports as COVID positive, but only 45 of those were actually positive. A total of 50 people in my sample were actually COVID positive. Create a confusion matrix based on the above data and find:

  I. Accuracy ✅
  II. Precision ✨
  III. Recall 📢
  IV. F-1 score 📈
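The four metrics follow directly from the confusion-matrix counts, as the sketch below shows. Note one assumption: the problem statement above does not give the total sample size, which accuracy requires, so the value `total=500` used here is purely hypothetical. The function name `confusion_metrics` is also illustrative.

```python
def confusion_metrics(predicted_positive, true_positive, actual_positive, total):
    """Derive confusion-matrix cells and the four metrics from summary counts."""
    tp = true_positive
    fp = predicted_positive - tp      # predicted positive but actually negative
    fn = actual_positive - tp         # actually positive but predicted negative
    tn = total - tp - fp - fn         # everything else
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# total=500 is an assumed sample size, not a figure from the problem statement.
metrics = confusion_metrics(predicted_positive=100, true_positive=45,
                            actual_positive=50, total=500)
```

Precision (45/100 = 0.45), recall (45/50 = 0.90), and the F1 score (0.60) do not depend on the sample size. Only accuracy does: under the assumed 500 samples it is 0.88.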

Assignment 2 🌡️ - Regression Analysis

Kaggle: ML Assignment 2 TE IT SPPU

Dataset: Temperature Data

Assignment on Regression technique 📈

  A. Apply Linear Regression using a suitable library function and predict the Month-wise temperature. 🌡️
  B. Assess the performance of regression models using MSE, MAE, and R-Square metrics 📊
  C. Visualize a simple regression model 📓

Assignment 3 📚 - Decision Trees and Classification

Kaggle: ML Assignment 3 TE IT SPPU

Dataset: Graduate Admissions

Assignment on Classification technique 📝

Every year many students take the GRE exam to get admission to foreign universities. The data set contains GRE Scores (out of 340), TOEFL Scores (out of 120), University Rating (out of 5), Statement of Purpose strength (out of 5), Letter of Recommendation strength (out of 5), Undergraduate GPA (out of 10), Research Experience (0=no, 1=yes), Admitted (0=no, 1=yes). Admitted is the target variable.

The counselor of the firm is supposed to check whether the student will get admission or not based on his/her GRE score and Academic Score. To help the counselor make appropriate decisions, build a machine learning classifier using a decision tree to predict whether a student will get admission or not.

  A. Apply Data pre-processing (Label Encoding, Data Transformation, etc.) techniques if necessary.
  B. Perform data-preparation (Train-Test Split) ✂️
  C. Apply Machine Learning Algorithm 🧠
  D. Evaluate Model 📈
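To illustrate what a decision tree does at each node, the sketch below searches one numeric feature for the threshold that minimizes weighted Gini impurity. It is a single split on a single feature, not a full tree, and the GRE scores and labels are made up. A library classifier such as scikit-learn's `DecisionTreeClassifier` repeats this kind of search across all features recursively. The function names `gini` and `best_split` are illustrative.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1 - p1 ** 2 - (1 - p1) ** 2


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split on one numeric feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical GRE scores and admission outcomes (illustrative values only).
gre_scores = [300, 305, 310, 315, 320, 325, 330, 335]
admitted = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, impurity = best_split(gre_scores, admitted)
```

On this cleanly separable toy data the best threshold falls midway between 315 and 320 with zero impurity. Real admissions data will be noisier, which is why evaluation on a held-out test set (step D) matters.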

Assignment 4 📩 - SMS Spam Detection

Kaggle: ML Assignment 4 TE IT SPPU

Dataset: SMS Spam Collection

Assignment on Improving Performance of Classifier Models 🚀

SMS spam (sometimes called mobile phone spam) is any junk message delivered to a mobile phone as a text message via the Short Message Service (SMS). Use a probabilistic approach (Naive Bayes Classifier / Bayesian Network) to implement an SMS spam filtering system. SMS messages are categorized as SPAM or HAM using features like the length of the message, word count, unique keywords, etc.

Download the dataset from: SMS Spam Collection

This dataset is composed of just one text file, where each line has the correct class followed by the raw message.

  A. Apply Data pre-processing (Label Encoding, Data Transformation, etc.) techniques if necessary ✨
  B. Perform data-preparation (Train-Test Split) ✂️
  C. Apply at least two Machine Learning Algorithms and Evaluate Models 🧠
  D. Apply Cross-Validation and Evaluate Models and compare performance 🔄
  E. Apply Hyperparameter tuning and evaluate models and compare performance 📊
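The probabilistic approach can be sketched as a small multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written here with the standard library only. Several things are assumptions: the tab separator between class and message in `parse_line`, the whitespace tokenizer, and the four sample lines, which are invented. Cross-validation and hyperparameter tuning (steps D and E) are not covered.

```python
import math
from collections import Counter


def parse_line(line):
    """Split 'label<TAB>message'; the tab separator is an assumption."""
    label, message = line.rstrip("\n").split("\t", 1)
    return label, message


def tokenize(message):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return message.lower().split()


def train(samples):
    """Count classes and per-class word frequencies from (label, message) pairs."""
    class_counts, word_counts, vocab = Counter(), {}, set()
    for label, message in samples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(message):
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, message):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_prob = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(message):
            if word in vocab:  # words never seen in training are ignored
                log_prob += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Invented sample lines in the assumed 'label<TAB>message' layout.
lines = [
    "ham\tAre we still meeting for lunch today",
    "spam\tWin a free prize now claim your free cash",
    "ham\tCall me when you reach home",
    "spam\tFree entry claim your prize call now",
]
model = train([parse_line(line) for line in lines])
label = predict(model, "claim your free prize")
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.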

Assignment 5 🛍️ - Customer Segmentation

Kaggle: ML Assignment 5 TE IT SPPU

Dataset: Mall Customers

Assignment on Clustering Techniques 🧩

This dataset gives the income and spending data of customers visiting a shopping mall. The data set contains Customer ID, Gender, Age, Annual Income, Spending Score. As the mall owner, you need to find the group of customers who are most profitable for the mall. Apply at least two clustering algorithms (based on Spending Score) to find the groups of customers.

  A. Apply Data pre-processing (Label Encoding, Data Transformation, etc.) techniques if necessary ✨
  B. Perform data-preparation (Train-Test Split) ✂️
  C. Apply Machine Learning Algorithm 🧠
  D. Evaluate Model 📈
  E. Apply Cross-Validation and Evaluate Model 🔄
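One of the clustering algorithms could be k-means. The sketch below implements Lloyd's algorithm in plain Python on made-up (annual income, spending score) pairs. The deterministic farthest-point initialization is a simplification chosen to keep the example reproducible, not necessarily what a library does, and the function names are illustrative. The assignment asks for at least two algorithms; this shows only one.

```python
import math


def farthest_point_init(points, k):
    """Pick the first point, then repeatedly the point farthest from chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]   # keep the old centroid if a cluster empties
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical (annual income, spending score) pairs; values are illustrative.
customers = [
    (15, 80), (50, 50), (85, 15),
    (17, 82), (52, 48), (87, 17),
    (19, 78), (54, 52), (89, 13),
]
centroids, clusters = kmeans(customers, k=3)
```

Because both features share a similar 0 to 100 range in this toy data, no scaling is applied. With features on very different scales, standardizing them first usually gives more meaningful distances.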

Assignment 6 📊 - Market Basket Analysis

Kaggle: ML Assignment 6 TE IT SPPU

Dataset: Market Basket Optimization

Assignment on Association Rule Learning 🧐

This dataset comprises the list of transactions of a retail company over the period of one week. It contains a total of 7501 transaction records where each record consists of the list of items sold in one transaction. Using this record of transactions and items in each transaction, find the association rules between items. There is no header in the dataset, and the first row contains the first transaction, so specify header = None when loading the dataset.

Follow these steps:

  A. Data Preprocessing ✨
  B. Generate the list of transactions from the dataset 📜
  C. Train Apriori algorithm on the dataset 🧠
  D. Visualize the list of rules 📊
  E. Generated rules depend on the values of hyperparameters. Increase the minimum confidence value and find the rules accordingly 📈
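The effect of the minimum confidence threshold can be seen in a small sketch that mines one-to-one rules (A -> B) from item pairs. This is brute-force pair counting, not the level-wise candidate pruning of the full Apriori algorithm. The transactions, the thresholds, and the function name `pair_rules` are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence, lift) for item pairs above both thresholds."""
    n = len(transactions)
    item_counts, pair_counts = Counter(), Counter()
    for transaction in transactions:
        items = sorted(set(transaction))          # de-duplicate items in a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]      # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # confidence / P(rhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence, lift))
    return rules


# Invented transactions; the real dataset has 7501 rows and no header.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "eggs"],
]
loose_rules = pair_rules(transactions, min_confidence=0.6)
strict_rules = pair_rules(transactions, min_confidence=0.7)
```

On this toy data, raising the minimum confidence from 0.6 to 0.7 drops the two rules with butter on the left-hand side (confidence 2/3) and keeps only the bread/milk rules (confidence 3/4).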

Assignment 7 🧠 - Multilayer Neural Network

Kaggle: ML Assignment 7 TE IT SPPU

Dataset: Pima Indians Diabetes Data

Assignment on Multilayer Neural Network Model 🧠

The dataset has a total of 9 attributes where the last attribute is the "Class attribute" having values 0 and 1 (1 = "Positive for Diabetes", 0 = "Negative").

  A. Load the dataset in the program. Define the ANN Model with Keras. Define at least two hidden layers. Specify the ReLU function as the activation function for the hidden layer and Sigmoid for the output layer.

  B. Compile the model with necessary parameters. Set the number of epochs and batch size and fit the model. 📈

  C. Evaluate the performance of the model for different values of epochs and batch sizes 📊

  D. Evaluate model performance using different activation functions. Visualize the model using ANN Visualizer 📊
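The architecture described in step A (two ReLU hidden layers followed by a sigmoid output) can be illustrated with a plain-Python forward pass. This is not Keras code and performs no training. The layer sizes (8 inputs, then 12, 8, and 1 units), the random weights, and the scaled input vector are all hypothetical, chosen only to show how the activations compose.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(features, layers):
    """Pass the features through each (parameters, activation) layer in turn."""
    activations = features
    for (weights, biases), activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations


rng = random.Random(0)
layers = [
    (init_layer(8, 12, rng), relu),     # hidden layer 1
    (init_layer(12, 8, rng), relu),     # hidden layer 2
    (init_layer(8, 1, rng), sigmoid),   # output layer: probability of class 1
]

# Hypothetical input: the 8 non-class attributes, already scaled to [0, 1].
sample = [0.35, 0.74, 0.59, 0.35, 0.0, 0.50, 0.23, 0.48]
probability = forward(sample, layers)[0]
predicted_class = 1 if probability >= 0.5 else 0
```

The sigmoid output lies strictly between 0 and 1, so it can be read as the predicted probability of the positive class and thresholded at 0.5. Swapping `relu` for another function in `layers` is one way to experiment with step D.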

Requirements 🛠️

To run the code in these assignments, you need to have Python installed on your system along with the required libraries and dependencies. Make sure to install the necessary packages mentioned in the assignment files. For Tableau, you will need to have Tableau software installed on your machine.

License 📜

This project is licensed under the GNU GENERAL PUBLIC LICENSE. Feel free to use the code and materials for educational purposes or personal projects.

Contact ✉️

If you have any questions or suggestions, please feel free to contact:

  • Email: Ranjeet - contact [dot] ranjeetkumbhar [at] gmail [dot] com

Feel free to navigate to each assignment's directory for detailed instructions, code, and any additional resources. If you have any questions or need assistance, don't hesitate to reach out. Good luck with your assignments!