This code demonstrates a machine learning pipeline for image classification using a Support Vector Machine (SVM). It extracts features from a dataset of labeled images, trains a model on those features, and uses the trained model to predict labels for new test images. Here, the pipeline is used to distinguish dress shirts from t-shirts.
Setup and Dependencies:
- Python 3.x
- OpenCV (`cv2`)
- Pandas (`pandas`)
- NumPy (`numpy`)
- Matplotlib (`matplotlib`)
- Scikit-learn (`sklearn`)
- Scikit-image (`skimage`)
Usage:
- Import the necessary libraries and modules. The datasets are not provided in this repository.
- Load the training and test data from CSV files. Ensure that the data is properly formatted: the training data is stored in `TrainData.csv`, the corresponding labels in `TrainLabels.csv`, and the test data in `TestData.csv`.
- Preprocess the training data by reshaping it into a 3D array of images. The original images are 28x28 grayscale, so the data is reshaped to (-1, 28, 28).
- Display an example image from the training data to visualize the input.
- Define a function `extract_features` to extract features from images. By default, it uses the Histogram of Oriented Gradients (HOG) method. You can implement additional methods such as edge extraction, color channel extraction, or midpoint extraction if needed.
- Define a function `extract_pixel_intensity` to extract features by flattening the image pixels.
- Call `extract_features` to extract features from the training and test sets.
- Set the hyperparameters for the SVM model, such as the regularization parameter `C` and the kernel type `kernel`.
- Train the SVM model using the entire training dataset.
- Save the trained model to a file named 'final_model.pkl' using the `pickle` module for future use.
- Generate predictions for the test examples using the trained model.
- Save the predictions to a CSV file named 'myPredictions.csv', which contains the predicted labels for the test examples. The files 'myPredictions.csv' and 'final_model.pkl' are provided for reference.
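
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code. Only the names `extract_features` and `extract_pixel_intensity` and the file names come from this README. The function signatures, the HOG parameters, the `C` and `kernel` values, the header-less CSV layout, and the helpers `fit_and_predict` and `run_pipeline` are all assumptions made for the example.

```python
import pickle

import numpy as np
import pandas as pd
from skimage.feature import hog
from sklearn.svm import SVC


def extract_pixel_intensity(images):
    """Flatten each image into a 1D vector of pixel intensities."""
    return images.reshape(len(images), -1)


def extract_features(images, method="hog"):
    """Return one feature vector per image (HOG by default)."""
    if method == "hog":
        # HOG parameters are illustrative assumptions, not the repository's values.
        return np.array([
            hog(image, orientations=9, pixels_per_cell=(7, 7), cells_per_block=(2, 2))
            for image in images
        ])
    if method == "pixels":
        return extract_pixel_intensity(images)
    raise ValueError(f"Unknown feature extraction method: {method}")


def fit_and_predict(train_rows, train_labels, test_rows, C=1.0, kernel="rbf"):
    """Reshape flat rows to 28x28 images, train an SVM, and predict test labels.

    fit_and_predict is a hypothetical helper; C and kernel are placeholder values.
    """
    train_images = np.asarray(train_rows, dtype=float).reshape(-1, 28, 28)
    test_images = np.asarray(test_rows, dtype=float).reshape(-1, 28, 28)
    model = SVC(C=C, kernel=kernel)
    model.fit(extract_features(train_images), np.ravel(train_labels))
    return model, model.predict(extract_features(test_images))


def run_pipeline(train_csv="TrainData.csv", labels_csv="TrainLabels.csv",
                 test_csv="TestData.csv"):
    """Run all steps end to end (hypothetical helper).

    Assumes header-less CSV files with one flattened 28x28 image per row.
    """
    train_rows = pd.read_csv(train_csv, header=None).to_numpy()
    train_labels = pd.read_csv(labels_csv, header=None).to_numpy()
    test_rows = pd.read_csv(test_csv, header=None).to_numpy()

    model, predictions = fit_and_predict(train_rows, train_labels, test_rows)

    # Persist the trained model and the predicted labels.
    with open("final_model.pkl", "wb") as f:
        pickle.dump(model, f)
    pd.DataFrame(predictions).to_csv("myPredictions.csv", index=False, header=False)
    return predictions
```

With the three CSV files in the working directory, calling `run_pipeline()` would write 'final_model.pkl' and 'myPredictions.csv'. To preview an input image, something like `matplotlib.pyplot.imshow(train_images[0], cmap="gray")` on the reshaped array should work.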
Note: You can modify the code to use different feature extraction methods or adjust the SVM hyperparameters to improve classification accuracy.
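
This README does not say how the hyperparameters were chosen. One common option, shown here as a sketch only, is a cross-validated grid search over `C` and `kernel` with scikit-learn's `GridSearchCV`. The helper name `tune_svm` and the candidate values are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def tune_svm(features, labels, param_grid=None, cv=5):
    """Cross-validate SVM hyperparameters and return the fitted search object."""
    if param_grid is None:
        # Candidate values are illustrative assumptions, not tuned settings.
        param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
    search = GridSearchCV(SVC(), param_grid, cv=cv, scoring="accuracy")
    search.fit(features, np.ravel(labels))
    return search
```

After fitting, `search.best_params_` holds the best combination found and `search.best_estimator_` is a model refit on all the supplied data. The same function can be called once per feature extraction method to compare them.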