diff --git a/_freeze/schedule/slides/13-gams-trees/execute-results/html.json b/_freeze/schedule/slides/13-gams-trees/execute-results/html.json index 78b0181..25bc11d 100644 --- a/_freeze/schedule/slides/13-gams-trees/execute-results/html.json +++ b/_freeze/schedule/slides/13-gams-trees/execute-results/html.json @@ -1,7 +1,8 @@ { - "hash": "f4716546d1296f2366a3b91a631c5b80", + "hash": "fce72a85ba37a5798f88dd03851efd87", "result": { - "markdown": "---\nlecture: \"13 GAMs and Trees\"\nformat: revealjs\nmetadata-files: \n - _metadata.yml\n---\n---\n---\n\n## {{< meta lecture >}} {.large background-image=\"gfx/smooths.svg\" background-opacity=\"0.3\"}\n\n[Stat 406]{.secondary}\n\n[{{< meta author >}}]{.secondary}\n\nLast modified -- 09 October 2023\n\n\n\n$$\n\\DeclareMathOperator*{\\argmin}{argmin}\n\\DeclareMathOperator*{\\argmax}{argmax}\n\\DeclareMathOperator*{\\minimize}{minimize}\n\\DeclareMathOperator*{\\maximize}{maximize}\n\\DeclareMathOperator*{\\find}{find}\n\\DeclareMathOperator{\\st}{subject\\,\\,to}\n\\newcommand{\\E}{E}\n\\newcommand{\\Expect}[1]{\\E\\left[ #1 \\right]}\n\\newcommand{\\Var}[1]{\\mathrm{Var}\\left[ #1 \\right]}\n\\newcommand{\\Cov}[2]{\\mathrm{Cov}\\left[#1,\\ #2\\right]}\n\\newcommand{\\given}{\\ \\vert\\ }\n\\newcommand{\\X}{\\mathbf{X}}\n\\newcommand{\\x}{\\mathbf{x}}\n\\newcommand{\\y}{\\mathbf{y}}\n\\newcommand{\\P}{\\mathcal{P}}\n\\newcommand{\\R}{\\mathbb{R}}\n\\newcommand{\\norm}[1]{\\left\\lVert #1 \\right\\rVert}\n\\newcommand{\\snorm}[1]{\\lVert #1 \\rVert}\n\\newcommand{\\tr}[1]{\\mbox{tr}(#1)}\n\\newcommand{\\brt}{\\widehat{\\beta}^R_{s}}\n\\newcommand{\\brl}{\\widehat{\\beta}^R_{\\lambda}}\n\\newcommand{\\bls}{\\widehat{\\beta}_{ols}}\n\\newcommand{\\blt}{\\widehat{\\beta}^L_{s}}\n\\newcommand{\\bll}{\\widehat{\\beta}^L_{\\lambda}}\n$$\n\n\n\n\n\n## GAMs\n\nLast time we discussed smoothing in multiple dimensions.\n\n\nHere we introduce the concept of GAMs ([G]{.secondary}eneralized [A]{.secondary}dditive 
[M]{.secondary}odels)\n\nThe basic idea is to imagine that the response is the sum of some functions of the predictors:\n\n$$\\Expect{Y \\given X=x} = \\beta_0 + f_1(x_{1})+\\cdots+f_p(x_{p}).$$\n\n\nNote that OLS [is]{.secondary} a GAM (take $f_j(x_{j})=\\beta_j x_{j}$):\n\n$$\\Expect{Y \\given X=x} = \\beta_0 + \\beta_1 x_{1}+\\cdots+\\beta_p x_{p}.$$\n\n## Gams\n\nThese work by estimating each $f_i$ using basis expansions in predictor $i$\n\nThe algorithm for fitting these things is called \"backfitting\" (very similar to the CD intuition for lasso):\n\n\n1. Center $\\y$ and $\\X$.\n2. Hold $f_k$ for all $k\\neq j$ fixed, and regress $\\X_j$ on $(\\y - \\widehat{\\y}_{-j})$ using your favorite smoother.\n3. Repeat for $1\\leq j\\leq p$.\n4. Repeat steps 2 and 3 until the estimated functions \"stop moving\" (iterate)\n5. Return the results.\n\n\n\n## Very small example\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mgcv)\nset.seed(12345)\nn <- 500\nsimple <- tibble(\n x1 = runif(n, 0, 2*pi),\n x2 = runif(n),\n y = 5 + 2 * sin(x1) + 8 * sqrt(x2) + rnorm(n, sd = .25)\n)\n\npivot_longer(simple, -y, names_to = \"predictor\", values_to = \"x\") |>\n ggplot(aes(x, y)) +\n geom_point(col = blue) +\n facet_wrap(~predictor, scales = \"free_x\")\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-1-1.svg){fig-align='center'}\n:::\n:::\n\n\n## Very small example\n\nSmooth each coordinate independently\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nex_smooth <- gam(y ~ s(x1) + s(x2), data = simple)\n# s(z) means \"smooth\" z, uses spline basis for each with ridge penalty, GCV\nplot(ex_smooth, pages = 1, scale = 0, shade = TRUE, \n resid = TRUE, se = 2, las = 1)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/gam-mod-1.svg){fig-align='center'}\n:::\n\n```{.r .cell-code}\nhead(coef(ex_smooth))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n(Intercept) s(x1).1 
s(x1).2 s(x1).3 s(x1).4 s(x1).5 \n 10.2070490 -4.5764100 0.7117161 0.4548928 0.5535001 -0.2092996 \n```\n:::\n\n```{.r .cell-code}\nex_smooth$gcv.ubre\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n GCV.Cp \n0.06619721 \n```\n:::\n:::\n\n\n\n## Wherefore GAMs?\n\n\nIf \n\n$\\Expect{Y \\given X=x} = \\beta_0 + f_1(x_{1})+\\cdots+f_p(x_{p}),$\n\nthen\n\n$\\textrm{MSE}(\\hat f) = \\frac{Cp}{n^{4/5}} + \\sigma^2.$\n\n* Exponent no longer depends on $p$. Converges faster. (If the truth is additive.)\n\n* You could also use the same methods to include \"some\" interactions like\n\n$$\\begin{aligned}&\\Expect{Y \\given X=x}\\\\ &= \\beta_0 + f_{12}(x_{1},\\ x_{2})+f_3(x_3)+\\cdots+f_p(x_{p}),\\end{aligned}$$\n\n## Very small example\n\nSmooth two coordinates together\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nex_smooth2 <- gam(y ~ s(x1, x2), data = simple)\nplot(ex_smooth2,\n scheme = 2, scale = 0, shade = TRUE,\n resid = TRUE, se = 2, las = 1\n)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-2-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n\n## Regression trees\n\nTrees involve stratifying or segmenting the predictor space into a number of simple regions.\n\nTrees are simple and useful for interpretation. \n\nBasic trees are not great at prediction. \n\nModern methods that use trees are much better (Module 4)\n\n## Regression trees\n\nRegression trees estimate piece-wise constant functions\n\nThe slabs are axis-parallel rectangles $R_1,\\ldots,R_K$ based on $\\X$\n\nIn each region, we average the $y_i$'s: $\\hat\\mu_1,\\ldots,\\hat\\mu_k$\n\nMinimize $\\sum_{k=1}^K \\sum_{i=1}^n (y_i-\\mu_k)^2$ over $R_k,\\mu_k$ for $k\\in \\{1,\\ldots,K\\}$\n\n. . 
.\n\nThis sounds more complicated than it is.\n\nThe minimization is performed __greedily__ (like forward stepwise regression).\n\n\n\n##\n\n\n![](https://www.aafp.org/dam/AAFP/images/journals/blogs/inpractice/covid_dx_algorithm4.png)\n\n\n\n## Mobility data\n\n\n::: {.cell layout-align=\"center\"}\n\n:::\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nbigtree <- tree(Mobility ~ ., data = mob)\nsmalltree <- prune.tree(bigtree, k = .09)\ndraw.tree(smalltree, digits = 2)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-3-1.svg){fig-align='center'}\n:::\n:::\n\n\nThis is called the [dendrogram]{.secondary}\n\n\n## Partition view\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nmob$preds <- predict(smalltree)\npar(mfrow = c(1, 2), mar = c(5, 3, 0, 0))\ndraw.tree(smalltree, digits = 2)\ncols <- viridisLite::viridis(20, direction = -1)[cut(log(mob$Mobility), 20)]\nplot(mob$Black, mob$Commute,\n pch = 19, cex = .4, bty = \"n\", las = 1, col = cols,\n ylab = \"Commute time\", xlab = \"% Black\"\n)\npartition.tree(smalltree, add = TRUE, ordvars = c(\"Black\", \"Commute\"))\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/partition-view-1.svg){fig-align='center'}\n:::\n:::\n\n\n\nWe predict all observations in a region with the same value. \n$\\bullet$ The three regions correspond to the leaves of the tree.\n\n\n##\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ndraw.tree(bigtree, digits = 2)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/big-tree-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n[Terminology]{.secondary}\n\nWe call each split or end point a node. Each terminal node is referred to as a leaf. \n\nThe interior nodes lead to branches. \n\n\n## Advantages and disadvantages of trees\n\n🎉 Trees are very easy to explain (much easier than even linear regression). 
\n\n🎉 Some people believe that decision trees mirror human decision. \n\n🎉 Trees can easily be displayed graphically no matter the dimension of the data.\n\n🎉 Trees can easily handle qualitative predictors without the need to create dummy variables.\n\n💩 Trees aren't very good at prediction.\n\n💩 Full trees badly overfit, so we \"prune\" them using CV \n\n. . .\n\n[We'll talk more about trees next module for Classification.]{.hand}\n\n# Next time ... {background-image=\"gfx/proforhobo.png\" background-opacity=.3}\n\n\nModule 3\n\nClassification\n\n", + "engine": "knitr", + "markdown": "---\nlecture: \"13 GAMs and Trees\"\nformat: revealjs\nmetadata-files: \n - _metadata.yml\n---\n\n\n## {{< meta lecture >}} {.large background-image=\"gfx/smooths.svg\" background-opacity=\"0.3\"}\n\n[Stat 406]{.secondary}\n\n[{{< meta author >}}]{.secondary}\n\nLast modified -- 08 October 2024\n\n\n\n\n\n$$\n\\DeclareMathOperator*{\\argmin}{argmin}\n\\DeclareMathOperator*{\\argmax}{argmax}\n\\DeclareMathOperator*{\\minimize}{minimize}\n\\DeclareMathOperator*{\\maximize}{maximize}\n\\DeclareMathOperator*{\\find}{find}\n\\DeclareMathOperator{\\st}{subject\\,\\,to}\n\\newcommand{\\E}{E}\n\\newcommand{\\Expect}[1]{\\E\\left[ #1 \\right]}\n\\newcommand{\\Var}[1]{\\mathrm{Var}\\left[ #1 \\right]}\n\\newcommand{\\Cov}[2]{\\mathrm{Cov}\\left[#1,\\ #2\\right]}\n\\newcommand{\\given}{\\ \\vert\\ }\n\\newcommand{\\X}{\\mathbf{X}}\n\\newcommand{\\x}{\\mathbf{x}}\n\\newcommand{\\y}{\\mathbf{y}}\n\\newcommand{\\P}{\\mathcal{P}}\n\\newcommand{\\R}{\\mathbb{R}}\n\\newcommand{\\norm}[1]{\\left\\lVert #1 \\right\\rVert}\n\\newcommand{\\snorm}[1]{\\lVert #1 
\\rVert}\n\\newcommand{\\tr}[1]{\\mbox{tr}(#1)}\n\\newcommand{\\brt}{\\widehat{\\beta}^R_{s}}\n\\newcommand{\\brl}{\\widehat{\\beta}^R_{\\lambda}}\n\\newcommand{\\bls}{\\widehat{\\beta}_{ols}}\n\\newcommand{\\blt}{\\widehat{\\beta}^L_{s}}\n\\newcommand{\\bll}{\\widehat{\\beta}^L_{\\lambda}}\n\\newcommand{\\U}{\\mathbf{U}}\n\\newcommand{\\D}{\\mathbf{D}}\n\\newcommand{\\V}{\\mathbf{V}}\n$$\n\n\n\n\n\n## GAMs\n\nLast time we discussed smoothing in multiple dimensions.\n\n\nHere we introduce the concept of GAMs ([G]{.secondary}eneralized [A]{.secondary}dditive [M]{.secondary}odels)\n\nThe basic idea is to imagine that the response is the sum of some functions of the predictors:\n\n$$\\Expect{Y \\given X=x} = \\beta_0 + f_1(x_{1})+\\cdots+f_p(x_{p}).$$\n\n\nNote that OLS [is]{.secondary} a GAM (take $f_j(x_{j})=\\beta_j x_{j}$):\n\n$$\\Expect{Y \\given X=x} = \\beta_0 + \\beta_1 x_{1}+\\cdots+\\beta_p x_{p}.$$\n\n## Gams\n\nThese work by estimating each $f_i$ using basis expansions in predictor $i$\n\nThe algorithm for fitting these things is called \"backfitting\" (very similar to the CD intuition for lasso):\n\n\n1. Center $\\y$ and $\\X$.\n2. Hold $f_k$ for all $k\\neq j$ fixed, and regress $\\X_j$ on $(\\y - \\widehat{\\y}_{-j})$ using your favorite smoother.\n3. Repeat for $1\\leq j\\leq p$.\n4. Repeat steps 2 and 3 until the estimated functions \"stop moving\" (iterate)\n5. 
Return the results.\n\n\n\n## Very small example\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mgcv)\nset.seed(12345)\nn <- 500\nsimple <- tibble(\n x1 = runif(n, 0, 2*pi),\n x2 = runif(n),\n y = 5 + 2 * sin(x1) + 8 * sqrt(x2) + rnorm(n, sd = .25)\n)\n\npivot_longer(simple, -y, names_to = \"predictor\", values_to = \"x\") |>\n ggplot(aes(x, y)) +\n geom_point(col = blue) +\n facet_wrap(~predictor, scales = \"free_x\")\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-1-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n## Very small example\n\nSmooth each coordinate independently\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nex_smooth <- gam(y ~ s(x1) + s(x2), data = simple)\n# s(z) means \"smooth\" z, uses spline basis for each with ridge penalty, GCV\nplot(ex_smooth, pages = 1, scale = 0, shade = TRUE, \n resid = TRUE, se = 2, las = 1)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/gam-mod-1.svg){fig-align='center'}\n:::\n\n```{.r .cell-code}\nhead(coef(ex_smooth))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n(Intercept) s(x1).1 s(x1).2 s(x1).3 s(x1).4 s(x1).5 \n 10.2070490 -4.5764100 0.7117161 0.4548928 0.5535001 -0.2092996 \n```\n\n\n:::\n\n```{.r .cell-code}\nex_smooth$gcv.ubre\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n GCV.Cp \n0.06619721 \n```\n\n\n:::\n:::\n\n\n\n\n## Wherefore GAMs?\n\n\nIf \n\n$\\Expect{Y \\given X=x} = \\beta_0 + f_1(x_{1})+\\cdots+f_p(x_{p}),$\n\nthen\n\n$$\nR_n^{(\\mathrm{GAM})} =\n \\underbrace{\\frac{C_1^{(\\mathrm{GAM})}}{n^{4/5}}}_{\\mathrm{bias}^2} +\n \\underbrace{\\frac{C_2^{(\\mathrm{GAM})}}{n^{4/5}}}_{\\mathrm{var}} +\n \\sigma^2.\n$$\nCompare with OLS and non-additive local smoothers:\n\n$$\nR_n^{(\\mathrm{OLS})} =\n \\underbrace{C_1^{(\\mathrm{OLS})}}_{\\mathrm{bias}^2} +\n \\underbrace{\\tfrac{C_2^{(\\mathrm{OLS})}}{n/p}}_{\\mathrm{var}} +\n \\sigma^2,\n\\qquad\nR_n^{(\\mathrm{local})} =\n 
\\underbrace{\\tfrac{C_1^{(\\mathrm{local})}}{n^{4/(4+p)}}}_{\\mathrm{bias}^2} +\n \\underbrace{\\tfrac{C_2^{(\\mathrm{local})}}{n^{4/(4+p)}}}_{\\mathrm{var}} +\n \\sigma^2.\n$$\n\n---\n\n* We no longer have an exponential dependence on $p$!\n\n* But our predictor is restrictive to functions that decompose additively.\n (This is a big limitation.)\n\n* You could also use the same methods to include \"some\" interactions like\n\n$$\\begin{aligned}&\\Expect{Y \\given X=x}\\\\ &= \\beta_0 + f_{12}(x_{1},\\ x_{2})+f_3(x_3)+\\cdots+f_p(x_{p}),\\end{aligned}$$\n\n## Very small example\n\nSmooth two coordinates together\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nex_smooth2 <- gam(y ~ s(x1, x2), data = simple)\nplot(ex_smooth2,\n scheme = 2, scale = 0, shade = TRUE,\n resid = TRUE, se = 2, las = 1\n)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-2-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n\n\n## Regression trees\n\n* Trees involve stratifying or segmenting the predictor space into a number of simple regions.\n* Trees are simple and useful for interpretation. \n* Basic trees are not great at prediction. 
\n* Modern methods that use trees are much better (Module 4)\n\n\n## Example with mobility data\n\n::: flex\n::: w-50\n\n\"Small\" tree\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code code-fold=\"true\"}\ndata(\"mobility\", package = \"Stat406\")\nlibrary(tree)\nlibrary(maptree)\nmob <- mobility[complete.cases(mobility), ] %>% dplyr::select(-ID, -Name)\nset.seed(12345)\npar(mar = c(0, 0, 0, 0), oma = c(0, 0, 0, 0))\nbigtree <- tree(Mobility ~ ., data = mob)\nsmalltree <- prune.tree(bigtree, k = .09)\ndraw.tree(smalltree, digits = 2)\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-3-1.svg){fig-align='center'}\n:::\n:::\n\n\n:::\n\n::: w-50\n\"Big\" tree\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/big-tree-1.svg){fig-align='center'}\n:::\n:::\n\n\n:::\n:::\n\n[Terminology]{.secondary}\n\n* We call each split or end point a *node*.\n* Each terminal node is referred to as a *leaf*.\n\n## Example with mobility data\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code code-fold=\"true\"}\nmob$preds <- predict(smalltree)\npar(mfrow = c(1, 2), mar = c(5, 3, 0, 0))\ndraw.tree(smalltree, digits = 2)\ncols <- viridisLite::viridis(20, direction = -1)[cut(log(mob$Mobility), 20)]\nplot(mob$Black, mob$Commute,\n pch = 19, cex = .4, bty = \"n\", las = 1, col = cols,\n ylab = \"Commute time\", xlab = \"% Black\"\n)\npartition.tree(smalltree, add = TRUE, ordvars = c(\"Black\", \"Commute\"))\n```\n\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/partition-view-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n\n[(The three regions correspond to the leaves of the tree.)]{.small}\n\\\n\n* Trees are *piecewise constant functions*.\\\n [We predict all observations in a region with the same value.]{.small}\n* Prediction regions are axis-parallel rectangles $R_1,\\ldots,R_K$ based on 
$\\X$\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n## Constructing Trees\n\n::: flex\n::: w-60\n\nIterative algorithm:\n\n* While ($\\mathtt{depth} \\ne \\mathtt{max.depth}$):\n * For each existing region $R_k$\n * For a given *splitting variable* $j$ and *split value* $s$,\n define\n $$\n \\begin{align}\n R_k^> &= \\{x \\in R_k : x^{(j)} > s\\} \\\\\n R_k^< &= \\{x \\in R_k : x^{(j)} \\leq s\\}\n \\end{align}\n $$\n * Choose $j$ and $s$ \n to minimize\n $$|R_k^>| \\cdot \\widehat{Var}(R_k^>) + |R_k^<| \\cdot \\widehat{Var}(R_k^<)$$\n\n:::\n\n::: w-35\n\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![](13-gams-trees_files/figure-revealjs/unnamed-chunk-4-1.svg){fig-align='center'}\n:::\n:::\n\n\n::: fragment\nThis algorithm is *greedy*, so it doesn't find the optimal tree\\\n[(But it works well!)]{.small}\n\n:::\n:::\n:::\n\n\n## Advantages and disadvantages of trees\n\n🎉 Trees are very easy to explain (much easier than even linear regression). \n\n🎉 Some people believe that decision trees mirror human decision. \n\n🎉 Trees can easily be displayed graphically no matter the dimension of the data.\n\n🎉 Trees can easily handle categorical predictors without the need to create one-hot encodings.\n\n🎉 *Trees are GREAT for missing data!!!*\n\n💩 Trees aren't very good at prediction.\n\n💩 Big trees badly overfit, so we \"prune\" them using CV \n\n. . .\n\n[We'll talk more about trees next module for Classification.]{.hand}\n\n# Next time ... 
{background-image=\"gfx/proforhobo.png\" background-opacity=.3}\n\n\nModule 3\n\nClassification\n\n", "supporting": [ "13-gams-trees_files" ],
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/big-tree-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/big-tree-1.svg index 6b6e1ba..f01da27 100644 --- a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/big-tree-1.svg +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/big-tree-1.svg @@ -3,582 +3,579 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/gam-mod-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/gam-mod-1.svg index 00fd1f1..090b239 100644 --- a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/gam-mod-1.svg +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/gam-mod-1.svg @@ -3,78 +3,72 @@ @@ -89,25 +83,25 @@ @@ -116,38 +110,38 @@ @@ -1159,34 +1153,34 @@ @@ -1194,35 +1188,35 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/partition-view-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/partition-view-1.svg index 34aa46c..620f818 100644 --- a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/partition-view-1.svg +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/partition-view-1.svg @@ -1,733 +1,725 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-1-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-1-1.svg index d7688ef..fffef89 100644 --- a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-1-1.svg +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-1-1.svg @@ -3,1364 +3,1355 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-2-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-2-1.svg index 213a974..c57120c 100644 --- a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-2-1.svg +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-2-1.svg @@ -3,78 +3,69 @@ @@ -730,22 +721,22 @@ @@ -753,48 +744,48 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-3-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-3-1.svg index 1cb262b..9f633d1 100644 --- a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-3-1.svg +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-3-1.svg @@ -3,187 +3,184 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-4-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-4-1.svg new file mode 100644 index 0000000..40864ad --- /dev/null +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-4-1.svg @@ -0,0 +1,628 @@ [SVG markup omitted]
diff --git a/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-5-1.svg b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-5-1.svg new file mode 100644 index 0000000..40864ad --- /dev/null +++ b/_freeze/schedule/slides/13-gams-trees/figure-revealjs/unnamed-chunk-5-1.svg @@ -0,0 +1,628 @@ [SVG markup omitted]
diff --git a/schedule/slides/13-gams-trees.qmd b/schedule/slides/13-gams-trees.qmd index 8fd5f3d..bea148d 100644 --- a/schedule/slides/13-gams-trees.qmd +++ b/schedule/slides/13-gams-trees.qmd @@ -83,9 +83,32 @@ $\Expect{Y \given X=x} = \beta_0 + f_1(x_{1})+\cdots+f_p(x_{p}),$ then -$\textrm{MSE}(\hat f) = \frac{Cp}{n^{4/5}} + \sigma^2.$ +$$ +R_n^{(\mathrm{GAM})} = + \underbrace{\frac{C_1^{(\mathrm{GAM})}}{n^{4/5}}}_{\mathrm{bias}^2} + + \underbrace{\frac{C_2^{(\mathrm{GAM})}}{n^{4/5}}}_{\mathrm{var}} + + \sigma^2.
+$$ +Compare with OLS and non-additive local smoothers: + +$$ +R_n^{(\mathrm{OLS})} = + \underbrace{C_1^{(\mathrm{OLS})}}_{\mathrm{bias}^2} + + \underbrace{\tfrac{C_2^{(\mathrm{OLS})}}{n/p}}_{\mathrm{var}} + + \sigma^2, +\qquad +R_n^{(\mathrm{local})} = + \underbrace{\tfrac{C_1^{(\mathrm{local})}}{n^{4/(4+p)}}}_{\mathrm{bias}^2} + + \underbrace{\tfrac{C_2^{(\mathrm{local})}}{n^{4/(4+p)}}}_{\mathrm{var}} + + \sigma^2. +$$ -* Exponent no longer depends on $p$. Converges faster. (If the truth is additive.) +--- + +* We no longer have an exponential dependence on $p$! + +* But our predictor is restricted to functions that decompose additively. + (This is a big limitation.) * You could also use the same methods to include "some" interactions like @@ -108,64 +131,53 @@ plot(ex_smooth2, ## Regression trees -Trees involve stratifying or segmenting the predictor space into a number of simple regions. - -Trees are simple and useful for interpretation. - -Basic trees are not great at prediction. - -Modern methods that use trees are much better (Module 4) - -## Regression trees - -Regression trees estimate piece-wise constant functions - -The slabs are axis-parallel rectangles $R_1,\ldots,R_K$ based on $\X$ - -In each region, we average the $y_i$'s: $\hat\mu_1,\ldots,\hat\mu_k$ - -Minimize $\sum_{k=1}^K \sum_{i=1}^n (y_i-\mu_k)^2$ over $R_k,\mu_k$ for $k\in \{1,\ldots,K\}$ - -. . . - -This sounds more complicated than it is. - -The minimization is performed __greedily__ (like forward stepwise regression). +* Trees involve stratifying or segmenting the predictor space into a number of simple regions. +* Trees are simple and useful for interpretation. +* Basic trees are not great at prediction.
+* Modern methods that use trees are much better (Module 4) +## Example with mobility data -## +::: flex +::: w-50 - -![](https://www.aafp.org/dam/AAFP/images/journals/blogs/inpractice/covid_dx_algorithm4.png) - - - -## Mobility data - -```{r small-tree-prelim, echo=FALSE} +"Small" tree +```{r} +#| code-fold: true +#| fig-width: 8 data("mobility", package = "Stat406") library(tree) library(maptree) mob <- mobility[complete.cases(mobility), ] %>% dplyr::select(-ID, -Name) set.seed(12345) par(mar = c(0, 0, 0, 0), oma = c(0, 0, 0, 0)) -``` - -```{r} -#| fig-width: 8 bigtree <- tree(Mobility ~ ., data = mob) smalltree <- prune.tree(bigtree, k = .09) draw.tree(smalltree, digits = 2) ``` +::: -This is called the [dendrogram]{.secondary} +::: w-50 +"Big" tree +```{r big-tree, echo=FALSE} +#| fig-width: 8 +#| fig-height: 5 +draw.tree(bigtree, digits = 2) +``` +::: +::: +[Terminology]{.secondary} + +* We call each split or end point a *node*. +* Each terminal node is referred to as a *leaf*. -## Partition view +## Example with mobility data ```{r partition-view} -#| fig-width: 8 +#| code-fold: true +#| fig-width: 10 mob$preds <- predict(smalltree) par(mfrow = c(1, 2), mar = c(5, 3, 0, 0)) draw.tree(smalltree, digits = 2) @@ -178,24 +190,97 @@ partition.tree(smalltree, add = TRUE, ordvars = c("Black", "Commute")) ``` -We predict all observations in a region with the same value. -$\bullet$ The three regions correspond to the leaves of the tree. +[(The three regions correspond to the leaves of the tree.)]{.small} +\ +* Trees are *piecewise constant functions*.\ + [We predict all observations in a region with the same value.]{.small} +* Prediction regions are axis-parallel rectangles $R_1,\ldots,R_K$ based on $\X$ -## -```{r big-tree} -#| fig-width: 8 -#| fig-height: 5 -draw.tree(bigtree, digits = 2) -``` + -[Terminology]{.secondary} -We call each split or end point a node. Each terminal node is referred to as a leaf. 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -The interior nodes lead to branches. + + +## Constructing Trees + +::: flex +::: w-60 + +Iterative algorithm: + +* While ($\mathtt{depth} \ne \mathtt{max.depth}$): + * For each existing region $R_k$ + * For a given *splitting variable* $j$ and *split value* $s$, + define + $$ + \begin{align} + R_k^> &= \{x \in R_k : x^{(j)} > s\} \\ + R_k^< &= \{x \in R_k : x^{(j)} \leq s\} + \end{align} + $$ + * Choose $j$ and $s$ + to minimize + $$|R_k^>| \cdot \widehat{Var}(R_k^>) + |R_k^<| \cdot \widehat{Var}(R_k^<)$$ + +::: + +::: w-35 +```{r echo=FALSE} +#| fig-width: 5 +#| fig-height: 4 +plot(mob$Black, mob$Commute, + pch = 19, cex = .4, bty = "n", las = 1, col = cols, + ylab = "Commute time", xlab = "% Black" +) +partition.tree(smalltree, add = TRUE, ordvars = c("Black", "Commute")) +``` +::: fragment +This algorithm is *greedy*, so it doesn't find the optimal tree\ +[(But it works well!)]{.small} + +::: +::: +::: ## Advantages and disadvantages of trees @@ -206,11 +291,13 @@ The interior nodes lead to branches. 🎉 Trees can easily be displayed graphically no matter the dimension of the data. -🎉 Trees can easily handle qualitative predictors without the need to create dummy variables. +🎉 Trees can easily handle categorical predictors without the need to create one-hot encodings. + +🎉 *Trees are GREAT for missing data!!!* 💩 Trees aren't very good at prediction. -💩 Full trees badly overfit, so we "prune" them using CV +💩 Big trees badly overfit, so we "prune" them using CV . . .