Commit

Built site for gh-pages
Quarto GHA Workflow Runner committed Oct 9, 2024
1 parent 8fa831a commit 754356f
Showing 13 changed files with 3,927 additions and 2,602 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
eafe6f9e
19c57bc7
196 changes: 155 additions & 41 deletions schedule/slides/13-gams-trees.html
@@ -398,7 +398,7 @@
<h2>13 GAMs and Trees</h2>
<p><span class="secondary">Stat 406</span></p>
<p><span class="secondary">Geoff Pleiss, Trevor Campbell</span></p>
<p>Last modified – 09 October 2023</p>
<p>Last modified – 08 October 2024</p>
<p><span class="math display">\[
\DeclareMathOperator*{\argmin}{argmin}
\DeclareMathOperator*{\argmax}{argmax}
@@ -424,6 +424,9 @@ <h2>13 GAMs and Trees</h2>
\newcommand{\bls}{\widehat{\beta}_{ols}}
\newcommand{\blt}{\widehat{\beta}^L_{s}}
\newcommand{\bll}{\widehat{\beta}^L_{\lambda}}
\newcommand{\U}{\mathbf{U}}
\newcommand{\D}{\mathbf{D}}
\newcommand{\V}{\mathbf{V}}
\]</span></p>
</section>
<section id="gams" class="slide level2">
@@ -492,9 +495,29 @@ <h2>Wherefore GAMs?</h2>
<p>If</p>
<p><span class="math inline">\(\Expect{Y \given X=x} = \beta_0 + f_1(x_{1})+\cdots+f_p(x_{p}),\)</span></p>
<p>then</p>
<p><span class="math inline">\(\textrm{MSE}(\hat f) = \frac{Cp}{n^{4/5}} + \sigma^2.\)</span></p>
<p><span class="math display">\[
R_n^{(\mathrm{GAM})} =
\underbrace{\frac{C_1^{(\mathrm{GAM})}}{n^{4/5}}}_{\mathrm{bias}^2} +
\underbrace{\frac{C_2^{(\mathrm{GAM})}}{n^{4/5}}}_{\mathrm{var}} +
\sigma^2.
\]</span> Compare with OLS and non-additive local smoothers:</p>
<p><span class="math display">\[
R_n^{(\mathrm{OLS})} =
\underbrace{C_1^{(\mathrm{OLS})}}_{\mathrm{bias}^2} +
\underbrace{\tfrac{C_2^{(\mathrm{OLS})}}{n/p}}_{\mathrm{var}} +
\sigma^2,
\qquad
R_n^{(\mathrm{local})} =
\underbrace{\tfrac{C_1^{(\mathrm{local})}}{n^{4/(4+p)}}}_{\mathrm{bias}^2} +
\underbrace{\tfrac{C_2^{(\mathrm{local})}}{n^{4/(4+p)}}}_{\mathrm{var}} +
\sigma^2.
\]</span></p>
</section>
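<section class="slide level2">
<p>To get a feel for the rates above, here is a minimal R sketch. It sets every constant to 1 (an assumption made purely for illustration, since the true constants are unknown) and compares <code>n^(-4/5)</code> with <code>n^(-4/(4+p))</code>. The helper names <code>rate_gam</code> and <code>rate_local</code> are hypothetical and do not come from any package.</p>

```r
# Estimation-error rates from the slide with all constants set to 1.
# ASSUMPTION: C_1 = C_2 = 1, chosen only so the exponents can be compared.
rate_gam   <- function(n) n^(-4 / 5)          # additive model: no p in the exponent
rate_local <- function(n, p) n^(-4 / (4 + p)) # non-additive local smoother

n <- 10000
p <- c(1, 2, 5, 10, 20)

rates <- data.frame(
  p = p,
  gam = rate_gam(n),
  local = rate_local(n, p),
  # sample size a local smoother would need to match the GAM rate at n
  n_local_equiv = n^((4 + p) / 5)
)
print(rates)
```

<p>At <code>p = 1</code> the two rates coincide. As <code>p</code> grows, the local-smoother rate decays more and more slowly in <code>n</code>, while the additive rate is unchanged (provided the truth really is additive).</p>
</section>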
<section class="slide level2">

<ul>
<li><p>Exponent no longer depends on <span class="math inline">\(p\)</span>. Converges faster. (If the truth is additive.)</p></li>
<li><p>We no longer have an exponential dependence on <span class="math inline">\(p\)</span>!</p></li>
<li><p>But our predictor is restricted to functions that decompose additively. (This is a big limitation.)</p></li>
<li><p>You could also use the same methods to include “some” interactions like</p></li>
</ul>
<p><span class="math display">\[\begin{aligned}&amp;\Expect{Y \given X=x}\\ &amp;= \beta_0 + f_{12}(x_{1},\ x_{2})+f_3(x_3)+\cdots+f_p(x_{p}),\end{aligned}\]</span></p>
@@ -513,39 +536,65 @@ <h2>Very small example</h2>
<img data-src="13-gams-trees_files/figure-revealjs/unnamed-chunk-2-1.svg" class="quarto-figure quarto-figure-center r-stretch"></section>
<section id="regression-trees" class="slide level2">
<h2>Regression trees</h2>
<p>Trees involve stratifying or segmenting the predictor space into a number of simple regions.</p>
<p>Trees are simple and useful for interpretation.</p>
<p>Basic trees are not great at prediction.</p>
<p>Modern methods that use trees are much better (Module 4).</p>
<ul>
<li>Trees involve stratifying or segmenting the predictor space into a number of simple regions.</li>
<li>Trees are simple and useful for interpretation.<br>
</li>
<li>Basic trees are not great at prediction.</li>
<li>Modern methods that use trees are much better (Module 4).</li>
</ul>
</section>
<section id="regression-trees-1" class="slide level2">
<h2>Regression trees</h2>
<p>Regression trees estimate piece-wise constant functions</p>
<p>The slabs are axis-parallel rectangles <span class="math inline">\(R_1,\ldots,R_K\)</span> based on <span class="math inline">\(\X\)</span></p>
<p>In each region, we average the <span class="math inline">\(y_i\)</span>’s: <span class="math inline">\(\hat\mu_1,\ldots,\hat\mu_K\)</span></p>
<p>Minimize <span class="math inline">\(\sum_{k=1}^K \sum_{i : x_i \in R_k} (y_i-\mu_k)^2\)</span> over <span class="math inline">\(R_k,\mu_k\)</span> for <span class="math inline">\(k\in \{1,\ldots,K\}\)</span></p>
<div class="fragment">
<p>This sounds more complicated than it is.</p>
<p>The minimization is performed <strong>greedily</strong> (like forward stepwise regression).</p>
<section id="example-with-mobility-data" class="slide level2">
<h2>Example with mobility data</h2>
<div class="flex">
<div class="w-50">
<p>“Small” tree</p>
<div class="cell" data-layout-align="center">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a></a><span class="fu">data</span>(<span class="st">"mobility"</span>, <span class="at">package =</span> <span class="st">"Stat406"</span>)</span>
<span id="cb8-2"><a></a><span class="fu">library</span>(tree)</span>
<span id="cb8-3"><a></a><span class="fu">library</span>(maptree)</span>
<span id="cb8-4"><a></a>mob <span class="ot">&lt;-</span> mobility[<span class="fu">complete.cases</span>(mobility), ] <span class="sc">%&gt;%</span> dplyr<span class="sc">::</span><span class="fu">select</span>(<span class="sc">-</span>ID, <span class="sc">-</span>Name)</span>
<span id="cb8-5"><a></a><span class="fu">set.seed</span>(<span class="dv">12345</span>)</span>
<span id="cb8-6"><a></a><span class="fu">par</span>(<span class="at">mar =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">0</span>), <span class="at">oma =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">0</span>))</span>
<span id="cb8-7"><a></a>bigtree <span class="ot">&lt;-</span> <span class="fu">tree</span>(Mobility <span class="sc">~</span> ., <span class="at">data =</span> mob)</span>
<span id="cb8-8"><a></a>smalltree <span class="ot">&lt;-</span> <span class="fu">prune.tree</span>(bigtree, <span class="at">k =</span> .<span class="dv">09</span>)</span>
<span id="cb8-9"><a></a><span class="fu">draw.tree</span>(smalltree, <span class="at">digits =</span> <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure>
<p><img data-src="13-gams-trees_files/figure-revealjs/unnamed-chunk-3-1.svg" class="quarto-figure quarto-figure-center"></p>
</figure>
</div>
</section>
<section id="section-1" class="slide level2">
<h2></h2>

<img data-src="https://www.aafp.org/dam/AAFP/images/journals/blogs/inpractice/covid_dx_algorithm4.png" class="r-stretch"></section>
<section id="mobility-data" class="slide level2">
<h2>Mobility data</h2>
</div>
</div>
</div>
<div class="w-50">
<p>“Big” tree</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a></a>bigtree <span class="ot">&lt;-</span> <span class="fu">tree</span>(Mobility <span class="sc">~</span> ., <span class="at">data =</span> mob)</span>
<span id="cb8-2"><a></a>smalltree <span class="ot">&lt;-</span> <span class="fu">prune.tree</span>(bigtree, <span class="at">k =</span> .<span class="dv">09</span>)</span>
<span id="cb8-3"><a></a><span class="fu">draw.tree</span>(smalltree, <span class="at">digits =</span> <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>

<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure>
<p><img data-src="13-gams-trees_files/figure-revealjs/big-tree-1.svg" class="quarto-figure quarto-figure-center"></p>
</figure>
</div>
</div>
</div>
<img data-src="13-gams-trees_files/figure-revealjs/unnamed-chunk-3-1.svg" class="quarto-figure quarto-figure-center r-stretch"><p>This is called the <span class="secondary">dendrogram</span></p>
</div>
</div>
<p><span class="secondary">Terminology</span></p>
<ul>
<li>We call each split or end point a <em>node</em>.</li>
<li>Each terminal node is referred to as a <em>leaf</em>.</li>
</ul>
</section>
<section id="partition-view" class="slide level2">
<h2>Partition view</h2>
<section id="example-with-mobility-data-1" class="slide level2">
<h2>Example with mobility data</h2>
<div class="cell" data-layout-align="center">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a></a>mob<span class="sc">$</span>preds <span class="ot">&lt;-</span> <span class="fu">predict</span>(smalltree)</span>
<span id="cb9-2"><a></a><span class="fu">par</span>(<span class="at">mfrow =</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">2</span>), <span class="at">mar =</span> <span class="fu">c</span>(<span class="dv">5</span>, <span class="dv">3</span>, <span class="dv">0</span>, <span class="dv">0</span>))</span>
<span id="cb9-3"><a></a><span class="fu">draw.tree</span>(smalltree, <span class="at">digits =</span> <span class="dv">2</span>)</span>
@@ -555,29 +604,94 @@ <h2>Partition view</h2>
<span id="cb9-7"><a></a> <span class="at">ylab =</span> <span class="st">"Commute time"</span>, <span class="at">xlab =</span> <span class="st">"% Black"</span></span>
<span id="cb9-8"><a></a>)</span>
<span id="cb9-9"><a></a><span class="fu">partition.tree</span>(smalltree, <span class="at">add =</span> <span class="cn">TRUE</span>, <span class="at">ordvars =</span> <span class="fu">c</span>(<span class="st">"Black"</span>, <span class="st">"Commute"</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>

</div>
<img data-src="13-gams-trees_files/figure-revealjs/partition-view-1.svg" class="quarto-figure quarto-figure-center r-stretch"><p>We predict all observations in a region with the same value.<br>
<span class="math inline">\(\bullet\)</span> The three regions correspond to the leaves of the tree.</p>
<img data-src="13-gams-trees_files/figure-revealjs/partition-view-1.svg" class="quarto-figure quarto-figure-center r-stretch"><p><span class="small">(The three regions correspond to the leaves of the tree.)</span><br>
</p>
<ul>
<li>Trees are <em>piecewise constant functions</em>.<br>
<span class="small">We predict all observations in a region with the same value.</span></li>
<li>Prediction regions are axis-parallel rectangles <span class="math inline">\(R_1,\ldots,R_K\)</span> based on <span class="math inline">\(\X\)</span></li>
</ul>
<!-- ## -->
<!-- ![](https://www.aafp.org/dam/AAFP/images/journals/blogs/inpractice/covid_dx_algorithm4.png) -->
<!-- ## Dendrogram view -->
<!-- ```{r} -->
<!-- #| code-fold: true -->
<!-- #| fig-width: 8 -->
<!-- data("mobility", package = "Stat406") -->
<!-- library(tree) -->
<!-- library(maptree) -->
<!-- mob <- mobility[complete.cases(mobility), ] %>% dplyr::select(-ID, -Name) -->
<!-- set.seed(12345) -->
<!-- par(mar = c(0, 0, 0, 0), oma = c(0, 0, 0, 0)) -->
<!-- smalltree <- prune.tree(bigtree, k = .09) -->
<!-- draw.tree(smalltree, digits = 2) -->
<!-- ``` -->
<!-- This is called the [dendrogram]{.secondary} -->
<!-- ## Partition view -->
<!-- ```{r partition-view} -->
<!-- #| code-fold: true -->
<!-- #| fig-width: 10 -->
<!-- mob$preds <- predict(smalltree) -->
<!-- par(mfrow = c(1, 2), mar = c(5, 3, 0, 0)) -->
<!-- draw.tree(smalltree, digits = 2) -->
<!-- cols <- viridisLite::viridis(20, direction = -1)[cut(log(mob$Mobility), 20)] -->
<!-- plot(mob$Black, mob$Commute, -->
<!-- pch = 19, cex = .4, bty = "n", las = 1, col = cols, -->
<!-- ylab = "Commute time", xlab = "% Black" -->
<!-- ) -->
<!-- partition.tree(smalltree, add = TRUE, ordvars = c("Black", "Commute")) -->
<!-- ``` -->
</section>
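<section class="slide level2">
<p>The piecewise-constant idea can be sketched in base R without fitting a tree. The example below uses simulated data and two hand-picked split points (both are assumptions for illustration; a fitted tree would choose its splits from the data) to form three axis-parallel regions, then predicts every observation with the average response in its region.</p>

```r
# Simulated data; the variable names and split points are hypothetical.
set.seed(406)
n <- 200
x1 <- runif(n)
x2 <- runif(n)
y <- 2 * (x1 > 0.5) + 1 * (x1 > 0.5 & x2 > 0.5) + rnorm(n, sd = 0.2)

# Three axis-parallel rectangles: split on x1 first, then on x2
# within the right-hand side only (as a depth-2 tree would).
region <- ifelse(x1 <= 0.5, "R1", ifelse(x2 <= 0.5, "R2", "R3"))

mu_hat <- tapply(y, region, mean) # one average per region (leaf)
preds <- ave(y, region)           # each observation gets its region's mean

print(mu_hat)
print(table(region))
```

<p>Every observation in the same region receives the same prediction, so <code>preds</code> takes only three distinct values here, one per leaf.</p>
</section>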
<section id="section-2" class="slide level2">
<h2></h2>
<section id="constructing-trees" class="slide level2">
<h2>Constructing Trees</h2>
<div class="flex">
<div class="w-60">
<p>Iterative algorithm:</p>
<ul>
<li>While (<span class="math inline">\(\mathtt{depth} \ne \mathtt{max.depth}\)</span>):
<ul>
<li>For each existing region <span class="math inline">\(R_k\)</span>
<ul>
<li>For a given <em>splitting variable</em> <span class="math inline">\(j\)</span> and <em>split value</em> <span class="math inline">\(s\)</span>, define <span class="math display">\[
\begin{align}
R_k^&gt; &amp;= \{x \in R_k : x^{(j)} &gt; s\} \\
R_k^&lt; &amp;= \{x \in R_k : x^{(j)} \leq s\}
\end{align}
\]</span></li>
<li>Choose <span class="math inline">\(j\)</span> and <span class="math inline">\(s\)</span> to minimize <span class="math display">\[|R_k^&gt;| \cdot \widehat{Var}(R_k^&gt;) + |R_k^&lt;| \cdot \widehat{Var}(R_k^&lt;)\]</span></li>
</ul></li>
</ul></li>
</ul>
</div>
<div class="w-35">
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode numberSource r number-lines code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a></a><span class="fu">draw.tree</span>(bigtree, <span class="at">digits =</span> <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>

<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure>
<p><img data-src="13-gams-trees_files/figure-revealjs/unnamed-chunk-4-1.svg" class="quarto-figure quarto-figure-center"></p>
</figure>
</div>
</div>
</div>
<div class="fragment">
<p>This algorithm is <em>greedy</em>, so it is not guaranteed to find the optimal tree.<br>
<span class="small">(But it works well!)</span></p>
</div>
</div>
</div>
<img data-src="13-gams-trees_files/figure-revealjs/big-tree-1.svg" class="quarto-figure quarto-figure-center r-stretch"><p><span class="secondary">Terminology</span></p>
<p>We call each split or end point a node. Each terminal node is referred to as a leaf.</p>
<p>The interior nodes lead to branches.</p>
</section>
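<section class="slide level2">
<p>A single step of the greedy search can be sketched in base R as follows. This is only an illustration under stated assumptions, not how the <code>tree</code> package is implemented: the helpers <code>sse</code>, <code>best_split_one_var</code>, and <code>best_split</code> are hypothetical names, candidate split values are taken to be midpoints between consecutive observed values, and the variance is normalized by the region size so that size times variance equals the within-region sum of squared errors.</p>

```r
# Within-region sum of squares: |R| * Var-hat(R) with the 1/|R| normalization.
sse <- function(v) sum((v - mean(v))^2)

# Best split value s for one predictor x within a region.
best_split_one_var <- function(x, y) {
  xs <- sort(unique(x))
  if (length(xs) < 2) return(NULL) # a constant predictor cannot be split
  # ASSUMPTION: candidate splits are midpoints between adjacent observed values
  candidates <- (head(xs, -1) + tail(xs, -1)) / 2
  cost <- vapply(
    candidates,
    function(s) sse(y[x > s]) + sse(y[x <= s]),
    numeric(1)
  )
  i <- which.min(cost)
  list(s = candidates[i], cost = cost[i])
}

# Search over every splitting variable j (the columns of the data frame X).
best_split <- function(X, y) {
  best <- list(j = NA, s = NA, cost = Inf)
  for (j in seq_len(ncol(X))) {
    res <- best_split_one_var(X[[j]], y)
    if (!is.null(res) && res$cost < best$cost) {
      best <- list(j = j, s = res$s, cost = res$cost)
    }
  }
  best
}

# Toy data: the response jumps when x2 crosses 0.5 and ignores x1.
set.seed(1)
X <- data.frame(x1 = runif(200), x2 = runif(200))
y <- ifelse(X$x2 > 0.5, 3, 0) + rnorm(200, sd = 0.1)
print(best_split(X, y))
```

<p>Growing a full tree would repeat this step inside each of the two resulting regions until the depth limit is reached. Because each step only looks one split ahead, the result depends on earlier choices, which is what makes the procedure greedy.</p>
</section>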
<section id="advantages-and-disadvantages-of-trees" class="slide level2">
<h2>Advantages and disadvantages of trees</h2>
<p>🎉 Trees are very easy to explain (much easier than even linear regression).</p>
<p>🎉 Some people believe that decision trees mirror human decision-making.</p>
<p>🎉 Trees can easily be displayed graphically no matter the dimension of the data.</p>
<p>🎉 Trees can easily handle qualitative predictors without the need to create dummy variables.</p>
<p>🎉 Trees can easily handle categorical predictors without the need to create one-hot encodings.</p>
<p>🎉 <em>Trees are GREAT for missing data!!!</em></p>
<p>💩 Trees aren’t very good at prediction.</p>
<p>💩 Full trees badly overfit, so we “prune” them using cross-validation (CV).</p>
<p>💩 Big trees badly overfit, so we “prune” them using cross-validation (CV).</p>
<div class="fragment">
<p><span class="hand">We’ll talk more about trees next module for Classification.</span></p>
</div>