Commit

Merge branch 'main' of github.com:UBC-CS/cpsc330-2024W1
firasm committed Sep 10, 2024
2 parents 2202a7e + f7cf7ff commit 4d6565f
Showing 8 changed files with 7,911 additions and 75 deletions.
14 changes: 14 additions & 0 deletions .github/workflows/deploy.yml
Original file line number Diff line number Diff line change
@@ -26,6 +26,20 @@ jobs:
         path: ${{ env.pythonLocation }}
         key: ${{ env.pythonLocation }}-${{ hashFiles('setup.py') }}-${{ hashFiles('requirements.txt') }}

+    - id: install-graphviz-linux
+      name: Install Graphviz on Linux
+      # if: runner.os == 'Linux'
+      # shell: bash
+      run: |
+        # Install Graphviz on Linux
+        sudo apt update
+        sudo apt install -qy gsfonts
+        sudo apt -qq list fonts-liberation gsfonts
+        sudo apt install -qy graphviz
+        sudo apt -y autoremove --purge
+        sudo apt -y autoclean
+        sudo apt clean
     - name: Install dependencies
       run: |
         pip install -r requirements.txt
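
The added workflow step installs Graphviz through apt. One way to confirm from Python that the `dot` executable is actually reachable after such a step is sketched below, using only the standard library. The helper name `graphviz_version` is hypothetical and not part of this repository; the sketch assumes `dot -V` reports its version on stderr and falls back to stdout otherwise.

```python
import shutil
import subprocess


def graphviz_version(executable="dot"):
    """Return the version banner of a Graphviz executable, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # check=False: a non-zero exit code should not raise in this sketch.
    result = subprocess.run([path, "-V"], capture_output=True, text=True, check=False)
    banner = (result.stderr or result.stdout).strip()
    return banner or None


if __name__ == "__main__":
    version = graphviz_version()
    print(version if version else "Graphviz 'dot' was not found on PATH")
```

A check like this could run as an extra workflow step or at the top of a notebook, so that a missing binary fails early rather than in the middle of rendering.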
41 changes: 21 additions & 20 deletions README.md
@@ -1,3 +1,5 @@
+[![deploy-book](https://github.com/UBC-CS/cpsc330-2024W1/actions/workflows/deploy.yml/badge.svg?branch=main)](https://github.com/UBC-CS/cpsc330-2024W1/actions/workflows/deploy.yml)
+
 # UBC CPSC 330: Applied Machine Learning (2024W1)

 This is the course homepage for CPSC 330: Applied Machine Learning at the University of British Columbia. You are looking at the current version (Sep-Dec 2024).
@@ -12,25 +14,24 @@ This is the course homepage for CPSC 330: Applied Machine Learning at the Univer
 - Devyani McLaren (cpsc330-admin@cs.ubc.ca), please reach out to Devyani for: admin questions, extensions, academic concessions etc.

 ### TAs
-- Akash Adhikary akash7adhikary@gmail.com
-- Amirali Goodarzvand Chegini amiralic@student.ubc.ca
-- Aryan Ballani aryanballani070603@gmail.com
-- Atabak Eghbal eghbal.atabak@gmail.com
-- Derrick Cheng dcheng04@students.cs.ubc.ca
-- Frederick Sunstrum fr.sunstrum@gmail.com
-- Hongkai Liu liuhongkai112233@163.com
-- Noah Marusenko noah.marusenko@icloud.com
-- Jialin (Mike) Lu mike020830@gmail.com
-- Kimia Rostin krostin@student.ubc.ca
-- Mahsa Zarei mahsazarei19@gmail.com
-- Mike Ju feng0025@student.ubc.ca
-- Mishaal Kazmi mkazmi@cs.ubc.ca
-- Rubia Reis Guerra rubiarg@cs.ubc.ca
-- Shadab Shaikh shadabs3@cs.ubc.ca
-- Sohbat Sandhu sohbat@student.ubc.ca
-- Sparsh Trivedy sparsh01@student.ubc.ca
-- Stash Currie stashubc@student.ubc.ca
-- Tianyu (Niki) Duan dty200@student.ubc.ca
+- Akash Adhikary
+- Amirali Goodarzvand Chegini
+- Aryan Ballani
+- Atabak Eghbal
+- Derrick Cheng
+- Frederick Sunstrum
+- Hongkai Liu
+- Noah Marusenko
+- Jialin (Mike) Lu
+- Kimia Rostin
+- Mahsa Zarei
+- Mike Ju
+- Mishaal Kazmi
+- Rubia Reis Guerra
+- Shadab Shaikh
+- Sohbat Sandhu
+- Stash Currie
+- Tianyu (Niki) Duan

 ## License

@@ -57,7 +58,7 @@ Usually the homework assignments will be due on Mondays (except next week) and w
 | Assessment | Due date | Where to find? | Where to submit? |
 |----------------|-----------------------|------------------------------------------------------------------------------------|-------------------------------------------------------|
 | hw1 | Sept 10, 11:59 pm | [GitHub repo](https://github.com/new?template_name=hw1&template_owner=ubc-cpsc330) | [Gradescope](https://www.gradescope.ca/courses/18608) |
-| hw2 | Sept 16, 11:59 pm | GitHub repo[](https://github.com/new?template_name=hw2&template_owner=ubc-cpsc330) | [Gradescope](https://www.gradescope.ca/courses/18608) |
+| hw2 | Sept 16, 11:59 pm | [GitHub repo](https://github.com/new?template_name=hw2&template_owner=ubc-cpsc330) | [Gradescope](https://www.gradescope.ca/courses/18608) |
 | Syllabus quiz | Sept 19, 11:59 pm | [PrairieLearn](https://us.prairielearn.com/pl/course_instance/163899/assessment/2451488) | [PrairieLearn](https://us.prairielearn.com/pl/course_instance/163899/assessment/2451488) |
 | hw3 | Oct 01, 11:59 pm | GitHub repo[](https://github.com/new?template_name=hw3&template_owner=ubc-cpsc330) | [Gradescope](https://www.gradescope.ca/courses/18608) |
 | hw4 | Oct 07, 11:59 pm | GitHub repo[](https://github.com/new?template_name=hw4&template_owner=ubc-cpsc330) | [Gradescope](https://www.gradescope.ca/courses/18608) |
2 changes: 2 additions & 0 deletions _toc.yml
@@ -21,6 +21,8 @@ parts:
   - caption: Lectures
     chapters:
       - file: lectures/notes/01_intro.ipynb
+      - file: lectures/notes/02_terminology-decision-trees.ipynb
+      - file: lectures/notes/03_ml-fundamentals.ipynb
   - caption: Section slides
     chapters:
       - file: lectures/101-Giulia-lectures/README
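
Since this change registers two new notebooks in `_toc.yml`, a small pre-build check that every listed file exists can catch a mistyped path before the book build fails. The following is a rough standard-library sketch, not part of this commit: the helper names `toc_files` and `missing_files` are hypothetical, the line-based regex stands in for a real YAML parser, and the sample text only mirrors the entries visible in this hunk.

```python
import re
from pathlib import Path

# Matches list items such as "- file: lectures/notes/01_intro.ipynb".
FILE_ENTRY = re.compile(r"^\s*-\s*file:\s*(\S+)\s*$")

SAMPLE_TOC = """\
parts:
  - caption: Lectures
    chapters:
      - file: lectures/notes/01_intro.ipynb
      - file: lectures/notes/02_terminology-decision-trees.ipynb
      - file: lectures/notes/03_ml-fundamentals.ipynb
  - caption: Section slides
    chapters:
      - file: lectures/101-Giulia-lectures/README
"""


def toc_files(toc_text):
    """Collect the paths of all `- file:` entries, in document order."""
    return [m.group(1) for line in toc_text.splitlines() if (m := FILE_ENTRY.match(line))]


def missing_files(toc_text, root):
    """Return the entries that do not resolve to a file under `root`."""
    root = Path(root)
    missing = []
    for entry in toc_files(toc_text):
        candidate = root / entry
        if candidate.exists():
            continue
        # Entries may omit the extension (e.g. ".../README"), so try any suffix.
        parent = candidate.parent
        if parent.is_dir() and any(parent.glob(candidate.name + ".*")):
            continue
        missing.append(entry)
    return missing


if __name__ == "__main__":
    print(missing_files(SAMPLE_TOC, "."))
```

The regex approach keeps the sketch dependency-free; a project that already has a YAML library installed would more likely parse the file properly and walk the `parts` and `chapters` keys.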
8 changes: 6 additions & 2 deletions lectures/code/plotting_functions.py
@@ -28,7 +28,7 @@ def plot_tree_decision_boundary(


 def plot_tree_decision_boundary_and_tree(
-    model, X, y, height=6, width=16, x_label="x-axis", y_label="y-axis", eps=None
+    model, X, y, height=6, width=16, fontsize = 9, x_label="x-axis", y_label="y-axis", eps=None
 ):
     fig, ax = plt.subplots(
         1,
@@ -38,7 +38,11 @@ def plot_tree_decision_boundary_and_tree(
         gridspec_kw={"width_ratios": [1.5, 2]},
     )
     plot_tree_decision_boundary(model, X, y, x_label, y_label, eps, ax=ax[0])
-    ax[1].imshow(tree_image(X.columns, model))
+    custom_plot_tree(model,
+                     feature_names=X.columns.tolist(),
+                     class_names=['A+', 'not A+'],
+                     impurity=False,
+                     fontsize=fontsize, ax=ax[1])
     ax[1].set_axis_off()
     plt.show()
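
This hunk replaces the Graphviz-rendered image with a tree drawn directly onto the second subplot and hard-codes the class labels as `['A+', 'not A+']`. As a hedged illustration of the same pattern, the sketch below calls `sklearn.tree.plot_tree` on a chosen axes and derives the labels from `model.classes_` instead. The toy arrays and the feature names `lab_score` and `quiz_score` are placeholders, not course data, and the left subplot is deliberately left empty.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Hypothetical toy data standing in for the lecture's DataFrame.
feature_names = ["lab_score", "quiz_score"]
X = np.array([[90, 85], [60, 70], [95, 92], [55, 65], [88, 91], [62, 58]])
y = np.array(["A+", "not A+", "A+", "not A+", "A+", "not A+"])

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

fig, ax = plt.subplots(1, 2, figsize=(16, 6), gridspec_kw={"width_ratios": [1.5, 2]})

# Deriving labels from the fitted model keeps them in the order the tree uses.
class_names = [str(label) for label in model.classes_]

annotations = plot_tree(
    model,
    feature_names=feature_names,
    class_names=class_names,
    impurity=False,  # omit the impurity line from each node
    filled=True,
    fontsize=9,
    ax=ax[1],  # draw on the right-hand subplot only
)
ax[1].set_axis_off()

node_labels = [annotation.get_text() for annotation in annotations]
plt.close(fig)
```

Taking the labels from `model.classes_` is a design suggestion rather than what the commit does; the hard-coded list works as long as every caller uses the same two classes in the same order.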

83 changes: 36 additions & 47 deletions lectures/code/utils.py
@@ -1,13 +1,8 @@
import pandas as pd
import numpy as np
import re
from PIL import Image
from torchvision.models import vgg16
from torchvision import transforms
import torch
import re
from sklearn.model_selection import cross_val_score, cross_validate, train_test_split

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor, export_graphviz
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor, export_graphviz, plot_tree

import glob

@@ -18,46 +13,40 @@

 plt.rcParams["font.size"] = 16

-def display_tree(feature_names, tree, counts=False):
-    """ For binary classification only """
-    dot = export_graphviz(
-        tree,
-        out_file=None,
-        feature_names=feature_names,
-        class_names=tree.classes_.astype(str),
-        impurity=False,
-    )
-    # adapted from https://stackoverflow.com/questions/44821349/python-graphviz-remove-legend-on-nodes-of-decisiontreeclassifier
-    # dot = re.sub('(\\\\nsamples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+\])(\\\\nclass = [A-Za-z0-9]+)', '', dot)
-    if counts:
-        dot = re.sub("(samples = [0-9]+)\\\\n", "", dot)
-        dot = re.sub("value", "counts", dot)
-    else:
-        dot = re.sub("(\\\\nsamples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+\])", "", dot)
-        dot = re.sub("(samples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+\])\\\\n", "", dot)
-
-    return graphviz.Source(dot)
-
-
-def tree_image(feature_names, tree):
-    """ For binary classification only """
-    dot = export_graphviz(
-        tree,
-        out_file=None,
-        feature_names=feature_names,
-        class_names=tree.classes_.astype(str),
-        impurity=False,
-    )
-    # adapted from https://stackoverflow.com/questions/44821349/python-graphviz-remove-legend-on-nodes-of-decisiontreeclassifier
-    # dot = re.sub('(\\\\nsamples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+\])(\\\\nclass = [A-Za-z0-9]+)', '', dot)
-    #dot = re.sub("(\\\\nsamples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+\])", "", dot)
-    #dot = re.sub("(samples = [0-9]+)(\\\\nvalue = \[[0-9]+, [0-9]+\])\\\\n", "", dot)
-    dot = re.sub("(samples = [0-9]+)\\\\n", "", dot)
-    dot = re.sub("value", "counts", dot)
-    graph = graphviz.Source(dot, format="png")
-    fout = "tmp"
-    graph.render(fout)
-    return imread(fout + ".png")
+# Custom function to customize the tree plot and hide values and samples
+def custom_plot_tree(tree_model, feature_names=None, class_names=None, **kwargs):
+    """
+    Customizes and displays a tree plot for a scikit-learn Decision Tree Classifier.
+    Parameters:
+    - tree (sklearn.tree.DecisionTreeClassifier): The trained Decision Tree Classifier to visualize.
+    - width: width of the matplotlib plot in inches
+    - height: height of the matplotlib plot in inches
+    - feature_names (list or None): A list of feature names to label the tree nodes with feature names.
+      If None, generic feature names will be used.
+    - class_names (list or None): A list of class names to label the tree nodes with class names.
+      If None, generic class names will be used.
+    - **kwargs: Additional keyword arguments to be passed to the `sklearn.tree.plot_tree` function.
+    Returns:
+    - None: The function displays the customized tree plot using Matplotlib.
+    This function customizes the appearance of a Decision Tree plot generated by the scikit-learn
+    `plot_tree` function. It hides both the samples and values in each node of the tree plot
+    for improved visualization.
+    """
+    plot_tree(tree_model,
+              feature_names=feature_names,
+              class_names=class_names,
+              filled=True,
+              **kwargs)
+
+    # Customize the appearance of the text elements for each node
+    for text in plt.gca().texts:
+        new_text = re.sub('samples = \d+\n', '', text.get_text())  # Hide samples
+        text.set_text(new_text)
+
+    plt.show()

 def cross_validate_std(*args, **kwargs):
     """Like cross_validate, except also gives the standard deviation of the score"""
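
The node-label cleanup in `custom_plot_tree` comes down to a single regular-expression substitution. A minimal standard-library sketch of that step follows; the helper name `strip_samples` and the sample label are invented for illustration. The pattern is written as a raw string, which avoids the invalid-escape warning that recent Python versions emit for `\d` inside a plain string literal. The sketch also shows that this substitution removes only the `samples = N` line and leaves the `value = [...]` line in place, even though the docstring mentions hiding both.

```python
import re

# Raw string: "\d" and "\n" are interpreted by the regex engine, not by Python.
SAMPLES_LINE = re.compile(r"samples = \d+\n")


def strip_samples(label):
    """Remove the 'samples = N' line from a tree-node label."""
    return SAMPLES_LINE.sub("", label)


# Hypothetical label in the layout that plot_tree produces with impurity=False.
label = "feature <= 0.5\nsamples = 21\nvalue = [10, 11]\nclass = A+"
cleaned = strip_samples(label)
print(cleaned)
```

When a figure has several axes, iterating over the `texts` of the axes that was passed to `plot_tree`, rather than over `plt.gca().texts`, would avoid depending on which axes happens to be current.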