Commit

minor eds to lda/qda
trevorcampbell committed Oct 14, 2024
1 parent c54b9e4 commit ea1b16d
Showing 7 changed files with 402 additions and 406 deletions.
@@ -1,7 +1,8 @@
{
"hash": "783e52d7187096a8e0bcf81cd68978d3",
"hash": "8a84974e72316aa8240202b718f65637",
"result": {
"markdown": "---\nlecture: \"15 LDA and QDA\"\nformat: revealjs\nmetadata-files: \n - _metadata.yml\n---\n---\n---\n\n## {{< meta lecture >}} {.large background-image=\"gfx/smooths.svg\" background-opacity=\"0.3\"}\n\n[Stat 406]{.secondary}\n\n[{{< meta author >}}]{.secondary}\n\nLast modified -- 09 October 2023\n\n\n\n$$\n\\DeclareMathOperator*{\\argmin}{argmin}\n\\DeclareMathOperator*{\\argmax}{argmax}\n\\DeclareMathOperator*{\\minimize}{minimize}\n\\DeclareMathOperator*{\\maximize}{maximize}\n\\DeclareMathOperator*{\\find}{find}\n\\DeclareMathOperator{\\st}{subject\\,\\,to}\n\\newcommand{\\E}{E}\n\\newcommand{\\Expect}[1]{\\E\\left[ #1 \\right]}\n\\newcommand{\\Var}[1]{\\mathrm{Var}\\left[ #1 \\right]}\n\\newcommand{\\Cov}[2]{\\mathrm{Cov}\\left[#1,\\ #2\\right]}\n\\newcommand{\\given}{\\ \\vert\\ }\n\\newcommand{\\X}{\\mathbf{X}}\n\\newcommand{\\x}{\\mathbf{x}}\n\\newcommand{\\y}{\\mathbf{y}}\n\\newcommand{\\P}{\\mathcal{P}}\n\\newcommand{\\R}{\\mathbb{R}}\n\\newcommand{\\norm}[1]{\\left\\lVert #1 \\right\\rVert}\n\\newcommand{\\snorm}[1]{\\lVert #1 \\rVert}\n\\newcommand{\\tr}[1]{\\mbox{tr}(#1)}\n\\newcommand{\\brt}{\\widehat{\\beta}^R_{s}}\n\\newcommand{\\brl}{\\widehat{\\beta}^R_{\\lambda}}\n\\newcommand{\\bls}{\\widehat{\\beta}_{ols}}\n\\newcommand{\\blt}{\\widehat{\\beta}^L_{s}}\n\\newcommand{\\bll}{\\widehat{\\beta}^L_{\\lambda}}\n$$\n\n\n\n\n\n## Last time\n\n\nWe showed that with two classes, the [Bayes' classifier]{.secondary} is\n\n$$g_*(X) = \\begin{cases}\n1 & \\textrm{ if } \\frac{p_1(X)}{p_0(X)} > \\frac{1-\\pi}{\\pi} \\\\\n0 & \\textrm{ otherwise}\n\\end{cases}$$\n\nwhere $p_1(X) = Pr(X \\given Y=1)$, $p_0(X) = Pr(X \\given Y=0)$ and $\\pi = Pr(Y=1)$\n\n. . .\n\nFor more than two classes.\n\n$$g_*(X) = \n\\argmax_k \\frac{\\pi_k p_k(X)}{\\sum_k \\pi_k p_k(X)}$$\n\nwhere $p_k(X) = Pr(X \\given Y=k)$ and $\\pi_k = P(Y=k)$\n\n\n## Estimating these\n \nLet's make some assumptions:\n\n1. $Pr(X\\given Y=k) = \\mbox{N}(\\mu_k,\\Sigma_k)$\n2. $\\Sigma_k = \\Sigma_{k'} = \\Sigma$\n\n. . .\n\nThis leads to [Linear Discriminant Analysis]{.secondary} (LDA), one of the oldest classifiers\n\n\n\n## LDA\n\n\n1. Split your training data into $K$ subsets based on $y_i=k$.\n2. In each subset, estimate the mean of $X$: $\\widehat\\mu_k = \\overline{X}_k$\n3. Estimate the pooled variance: $$\\widehat\\Sigma = \\frac{1}{n-K} \\sum_{k \\in \\mathcal{K}} \\sum_{i \\in k} (x_i - \\overline{X}_k) (x_i - \\overline{X}_k)^{\\top}$$\n4. 
Estimate the class proportion: $\\widehat\\pi_k = n_k/n$\n\n## LDA\n\nAssume just $K = 2$ so $k \\in \\{0,\\ 1\\}$\n\nWe predict $\\widehat{y} = 1$ if\n\n$$\\widehat{p_1}(x) / \\widehat{p_0}(x) > \\widehat{\\pi_0} / \\widehat{\\pi_1}$$ \n\nPlug in the density estimates:\n\n$$\\widehat{p_k}(x) = N(x - \\widehat{\\mu}_k,\\ \\widehat\\Sigma)$$\n\n\n## LDA\n\n\nNow we take $\\log$ and simplify $(K=2)$:\n\n$$\n\\begin{aligned}\n&\\Rightarrow \\log(\\widehat{p_1}(x)\\times\\widehat{\\pi_1}) - \\log(\\widehat{p_0}(x)\\times\\widehat{\\pi_0})\n= \\cdots = \\cdots\\\\\n&= \\underbrace{\\left(x^\\top\\widehat\\Sigma^{-1}\\overline X_1-\\frac{1}{2}\\overline X_1^\\top \\widehat\\Sigma^{-1}\\overline X_1 + \\log \\widehat\\pi_1\\right)}_{\\delta_1(x)} - \\underbrace{\\left(x^\\top\\widehat\\Sigma^{-1}\\overline X_0-\\frac{1}{2}\\overline X_0^\\top \\widehat\\Sigma^{-1}\\overline X_0 + \\log \\widehat\\pi_0\\right)}_{\\delta_0(x)}\\\\\n&= \\delta_1(x) - \\delta_0(x)\n\\end{aligned}\n$$\n\n\n[If $\\delta_1(x) > \\delta_0(x)$, we set $\\widehat g(x)=1$]{.secondary}\n\n## One dimensional intuition\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nset.seed(406406406)\nn <- 100\npi <- .6\nmu0 <- -1\nmu1 <- 2\nsigma <- 2\ntib <- tibble(\n y = rbinom(n, 1, pi),\n x = rnorm(n, mu0, sigma) * (y == 0) + rnorm(n, mu1, sigma) * (y == 1)\n)\n```\n:::\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code code-fold=\"true\"}\ngg <- ggplot(tib, aes(x, y)) +\n geom_point(colour = blue) +\n stat_function(fun = ~ 6 * (1 - pi) * dnorm(.x, mu0, sigma), colour = orange) +\n stat_function(fun = ~ 6 * pi * dnorm(.x, mu1, sigma), colour = orange) +\n annotate(\"label\",\n x = c(-3, 4.5), y = c(.5, 2 / 3),\n label = c(\"(1-pi)*p[0](x)\", \"pi*p[1](x)\"), parse = TRUE\n )\ngg\n```\n\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/unnamed-chunk-2-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n\n## What is linear?\n\nLook closely at the equation for $\\delta_1(x)$:\n\n$$\\delta_1(x)=x^\\top\\widehat\\Sigma^{-1}\\overline X_1-\\frac{1}{2}\\overline X_1^\\top \\widehat\\Sigma^{-1}\\overline X_1 + \\log \\widehat\\pi_1$$\n\nWe can write this as $\\delta_1(x) = x^\\top a_1 + b_1$ with $a_1 = \\widehat\\Sigma^{-1}\\overline X_1$ and $b_1=-\\frac{1}{2}\\overline X_1^\\top \\widehat\\Sigma^{-1}\\overline X_1 + \\log \\widehat\\pi_1$.\n\nWe can do the same for $\\delta_0(x)$ (in terms of $a_0$ and $b_0$)\n\nTherefore, \n\n$$\\delta_1(x)-\\delta_0(x) = x^\\top(a_1-a_0) + (b_1-b_0)$$\n\nThis is how we discriminate between the classes.\n\nWe just calculate $(a_1 - a_0)$ (a vector in $\\R^p$), and $b_1 - b_0$ (a scalar)\n\n\n## Baby example\n\n::: flex\n::: w-50\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mvtnorm)\nlibrary(MASS)\ngenerate_lda_2d <- function(\n n, p = c(.5, .5), \n mu = matrix(c(0, 0, 1, 1), 2),\n Sigma = diag(2)) {\n X <- rmvnorm(n, sigma = Sigma)\n tibble(\n y = which(rmultinom(n, 1, p) == 1, TRUE)[,1],\n x1 = X[, 1] + mu[1, y],\n x2 = X[, 2] + mu[2, y]\n )\n}\ndat1 <- generate_lda_2d(100, Sigma = .5 * diag(2))\nlda_fit <- lda(y ~ ., dat1)\n```\n:::\n\n\n:::\n::: w-50\n\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/plot-d1-1.png){fig-align='center'}\n:::\n:::\n\n\n:::\n\n:::\n\n\n## Multiple classes\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nmoreclasses <- generate_lda_2d(150, c(.2, .3, .5), matrix(c(0, 0, 1, 1, 1, 0), 2), .5 * diag(2))\nseparateclasses <- 
generate_lda_2d(150, c(.2, .3, .5), matrix(c(-1, -1, 2, 2, 2, -1), 2), .1 * diag(2))\n```\n:::\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/3class-plot-1.png){fig-align='center'}\n:::\n:::\n\n\n\n\n## QDA\n\nJust like LDA, but $\\Sigma_k$ is separate for each class.\n\nProduces [Quadratic]{.secondary} decision boundary.\n\nEverything else is the same.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nqda_fit <- qda(y ~ ., dat1)\nqda_3fit <- qda(y ~ ., moreclasses)\n```\n:::\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/qda-vs-lda-2class-1.png){fig-align='center'}\n:::\n:::\n\n\n\n## 3 class comparison\n\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/3class-comparison-1.png){fig-align='center'}\n:::\n:::\n\n\n\n## Notes\n\n* LDA is a linear classifier. It is not a linear smoother.\n - It is derived from Bayes rule.\n - Assume each class-conditional density in Gaussian\n - It assumes the classes have different mean vectors, but the same (common) covariance matrix.\n - It estimates densities and probabilities and \"plugs in\" \n\n* QDA is not a linear classifier. It depends on quadratic functions of the data.\n - It is derived from Bayes rule.\n - Assume each class-conditional density in Gaussian\n - It assumes the classes have different mean vectors and different covariance matrices.\n - It estimates densities and probabilities and \"plugs in\" \n \n##\n\n[It is hard (maybe impossible) to come up with reasonable classifiers that are linear smoothers. Many \"look\" like a linear smoother, but then apply a nonlinear transformation.]{.hand}\n\n## Naïve Bayes\n\nAssume that $Pr(X | Y = k) = Pr(X_1 | Y = k)\\cdots Pr(X_p | Y = k)$.\n\nThat is, conditional on the class, the feature distribution is independent.\n\n. . .\n\nIf we further assume that $Pr(X_j | Y = k)$ is Gaussian,\n\nThis is the same as QDA but with $\\Sigma_k$ Diagonal.\n\n. . .\n\nDon't have to assume Gaussian. Could do lots of stuff. \n\n\n# Next time...\n\nAnother linear classifier and transformations\n",
"engine": "knitr",
"markdown": "---\nlecture: \"15 LDA and QDA\"\nformat: revealjs\nmetadata-files: \n - _metadata.yml\n---\n\n\n\n## {{< meta lecture >}} {.large background-image=\"gfx/smooths.svg\" background-opacity=\"0.3\"}\n\n[Stat 406]{.secondary}\n\n[{{< meta author >}}]{.secondary}\n\nLast modified -- 14 October 2024\n\n\n\n\n\n\n\n$$\n\\DeclareMathOperator*{\\argmin}{argmin}\n\\DeclareMathOperator*{\\argmax}{argmax}\n\\DeclareMathOperator*{\\minimize}{minimize}\n\\DeclareMathOperator*{\\maximize}{maximize}\n\\DeclareMathOperator*{\\find}{find}\n\\DeclareMathOperator{\\st}{subject\\,\\,to}\n\\newcommand{\\E}{E}\n\\newcommand{\\Expect}[1]{\\E\\left[ #1 \\right]}\n\\newcommand{\\Var}[1]{\\mathrm{Var}\\left[ #1 \\right]}\n\\newcommand{\\Cov}[2]{\\mathrm{Cov}\\left[#1,\\ #2\\right]}\n\\newcommand{\\given}{\\ \\vert\\ }\n\\newcommand{\\X}{\\mathbf{X}}\n\\newcommand{\\x}{\\mathbf{x}}\n\\newcommand{\\y}{\\mathbf{y}}\n\\newcommand{\\P}{\\mathcal{P}}\n\\newcommand{\\R}{\\mathbb{R}}\n\\newcommand{\\norm}[1]{\\left\\lVert #1 \\right\\rVert}\n\\newcommand{\\snorm}[1]{\\lVert #1 \\rVert}\n\\newcommand{\\tr}[1]{\\mbox{tr}(#1)}\n\\newcommand{\\brt}{\\widehat{\\beta}^R_{s}}\n\\newcommand{\\brl}{\\widehat{\\beta}^R_{\\lambda}}\n\\newcommand{\\bls}{\\widehat{\\beta}_{ols}}\n\\newcommand{\\blt}{\\widehat{\\beta}^L_{s}}\n\\newcommand{\\bll}{\\widehat{\\beta}^L_{\\lambda}}\n\\newcommand{\\U}{\\mathbf{U}}\n\\newcommand{\\D}{\\mathbf{D}}\n\\newcommand{\\V}{\\mathbf{V}}\n$$\n\n\n\n\n\n## Last time\n\n\nWe showed that with two classes, the [Bayes' classifier]{.secondary} is\n\n$$g_*(x) = \\begin{cases}\n1 & \\textrm{ if } \\frac{p_1(x)}{p_0(x)} > \\frac{1-\\pi}{\\pi} \\\\\n0 & \\textrm{ otherwise}\n\\end{cases}$$\n\nwhere $p_1(x) = \\Pr(X=x \\given Y=1)$, $p_0(x) = \\Pr(X=x \\given Y=0)$ and $\\pi = \\Pr(Y=1)$\n\n. . .\n\nFor more than two classes:\n\n$$g_*(x) = \n\\argmax_k \\frac{\\pi_k p_k(x)}{\\sum_k \\pi_k p_k(x)}$$\n\nwhere $p_k(x) = \\Pr(X=x \\given Y=k)$ and $\\pi_k = P(Y=k)$\n\n\n## Estimating these\n \nLet's make some assumptions:\n\n1. $\\Pr(X=x\\given Y=k) = \\mbox{N}(x; \\mu_k,\\Sigma_k)$\n2. $\\Sigma_k = \\Sigma_{k'} = \\Sigma$\n\nThis leads to [Linear Discriminant Analysis]{.secondary} (LDA), one of the oldest classifiers\n\n## LDA\n\n\n1. Split your training data into $K$ subsets based on $y_i=k$.\n2. In each subset, estimate the mean of $X$: $\\widehat\\mu_k = \\overline{X}_k$\n3. Estimate the pooled variance: $$\\widehat\\Sigma = \\frac{1}{n-K} \\sum_{k \\in \\mathcal{K}} \\sum_{i \\in k} (x_i - \\overline{X}_k) (x_i - \\overline{X}_k)^{\\top}$$\n4. 
Estimate the class proportion: $\\widehat\\pi_k = n_k/n$\n\n## LDA\n\nAssume just $K = 2$ so $k \\in \\{0,\\ 1\\}$\n\nWe predict $\\widehat{y} = 1$ if\n\n$$\\widehat{p_1}(x) / \\widehat{p_0}(x) > \\widehat{\\pi_0} / \\widehat{\\pi_1}$$ \n\nPlug in the density estimates:\n\n$$\\widehat{p_k}(x) = N(x - \\widehat{\\mu}_k,\\ \\widehat\\Sigma)$$\n\n\n## LDA\n\n\nNow we take $\\log$ and simplify $(K=2)$:\n\n$$\n\\begin{aligned}\n&\\Rightarrow \\log(\\widehat{p_1}(x)\\times\\widehat{\\pi_1}) - \\log(\\widehat{p_0}(x)\\times\\widehat{\\pi_0})\n= \\cdots = \\cdots\\\\\n&= \\underbrace{\\left(x^\\top\\widehat\\Sigma^{-1}\\overline X_1-\\frac{1}{2}\\overline X_1^\\top \\widehat\\Sigma^{-1}\\overline X_1 + \\log \\widehat\\pi_1\\right)}_{\\delta_1(x)} - \\underbrace{\\left(x^\\top\\widehat\\Sigma^{-1}\\overline X_0-\\frac{1}{2}\\overline X_0^\\top \\widehat\\Sigma^{-1}\\overline X_0 + \\log \\widehat\\pi_0\\right)}_{\\delta_0(x)}\\\\\n&= \\delta_1(x) - \\delta_0(x)\n\\end{aligned}\n$$\n\n\n[If $\\delta_1(x) > \\delta_0(x)$, we set $\\widehat g(x)=1$]{.secondary}\n\n## One dimensional intuition\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nset.seed(406406406)\nn <- 100\npi <- .6\nmu0 <- -1\nmu1 <- 2\nsigma <- 2\ntib <- tibble(\n y = rbinom(n, 1, pi),\n x = rnorm(n, mu0, sigma) * (y == 0) + rnorm(n, mu1, sigma) * (y == 1)\n)\n```\n:::\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code code-fold=\"true\"}\ngg <- ggplot(tib, aes(x, y)) +\n geom_point(colour = blue) +\n stat_function(fun = ~ 6 * (1 - pi) * dnorm(.x, mu0, sigma), colour = orange) +\n stat_function(fun = ~ 6 * pi * dnorm(.x, mu1, sigma), colour = orange) +\n annotate(\"label\",\n x = c(-3, 4.5), y = c(.5, 2 / 3),\n label = c(\"(1-pi)*p[0](x)\", \"pi*p[1](x)\"), parse = TRUE\n )\ngg\n```\n\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/unnamed-chunk-2-1.svg){fig-align='center'}\n:::\n:::\n\n\n\n\n\n\n## What is linear?\n\nLook closely at the equation for $\\delta_1(x)$:\n\n$$\\delta_1(x)=x^\\top\\widehat\\Sigma^{-1}\\overline X_1-\\frac{1}{2}\\overline X_1^\\top \\widehat\\Sigma^{-1}\\overline X_1 + \\log \\widehat\\pi_1$$\n\nWe can write this as $\\delta_1(x) = x^\\top a_1 + b_1$ with $a_1 = \\widehat\\Sigma^{-1}\\overline X_1$ and $b_1=-\\frac{1}{2}\\overline X_1^\\top \\widehat\\Sigma^{-1}\\overline X_1 + \\log \\widehat\\pi_1$.\n\nWe can do the same for $\\delta_0(x)$ (in terms of $a_0$ and $b_0$)\n\nTherefore, \n\n$$\\delta_1(x)-\\delta_0(x) = x^\\top(a_1-a_0) + (b_1-b_0)$$\n\nThis is how we discriminate between the classes.\n\nWe just calculate $(a_1 - a_0)$ (a vector in $\\R^p$), and $b_1 - b_0$ (a scalar)\n\n\n## Baby example\n\n::: flex\n::: w-50\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mvtnorm)\nlibrary(MASS)\ngenerate_lda_2d <- function(\n n, p = c(.5, .5), \n mu = matrix(c(0, 0, 1, 1), 2),\n Sigma = diag(2)) {\n X <- rmvnorm(n, sigma = Sigma)\n tibble(\n y = which(rmultinom(n, 1, p) == 1, TRUE)[,1],\n x1 = X[, 1] + mu[1, y],\n x2 = X[, 2] + mu[2, y]\n )\n}\ndat1 <- generate_lda_2d(100, Sigma = .5 * diag(2))\nlda_fit <- lda(y ~ ., dat1)\n```\n:::\n\n\n\n\n:::\n::: w-50\n\n\n\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/plot-d1-1.png){fig-align='center'}\n:::\n:::\n\n\n\n\n:::\n\n:::\n\n\n## Multiple classes\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nmoreclasses <- generate_lda_2d(150, c(.2, .3, .5), matrix(c(0, 0, 1, 1, 1, 0), 2), .5 * 
diag(2))\nseparateclasses <- generate_lda_2d(150, c(.2, .3, .5), matrix(c(-1, -1, 2, 2, 2, -1), 2), .1 * diag(2))\n```\n:::\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/3class-plot-1.png){fig-align='center'}\n:::\n:::\n\n\n\n\n\n\n## QDA\n\nJust like LDA, but $\\Sigma_k$ is separate for each class.\n\nProduces [Quadratic]{.secondary} decision boundary.\n\nEverything else is the same.\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nqda_fit <- qda(y ~ ., dat1)\nqda_3fit <- qda(y ~ ., moreclasses)\n```\n:::\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/qda-vs-lda-2class-1.png){fig-align='center'}\n:::\n:::\n\n\n\n\n\n## 3 class comparison\n\n\n\n\n::: {.cell layout-align=\"center\" dvi='300'}\n::: {.cell-output-display}\n![](15-LDA-and-QDA_files/figure-revealjs/3class-comparison-1.png){fig-align='center'}\n:::\n:::\n\n\n\n\n\n## Notes\n\n* LDA is a linear classifier. It is not a linear smoother.\n - It is derived from Bayes rule.\n - Assume each class-conditional density in Gaussian\n - It assumes the classes have different mean vectors, but the same (common) covariance matrix.\n - It estimates densities and probabilities and \"plugs in\" \n\n* QDA is not a linear classifier. It depends on quadratic functions of the data.\n - It is derived from Bayes rule.\n - Assume each class-conditional density in Gaussian\n - It assumes the classes have different mean vectors and different covariance matrices.\n - It estimates densities and probabilities and \"plugs in\" \n \n##\n\n[It is hard (maybe impossible) to come up with reasonable classifiers that are linear smoothers. Many \"look\" like a linear smoother, but then apply a nonlinear transformation.]{.hand}\n\n## Naïve Bayes\n\nAssume that $\\Pr(X=x | Y = k) = \\Pr(X_1=x_1 | Y = k)\\cdots \\Pr(X_p=x_p | Y = k)$.\n\nThat is, conditional on the class, the feature distribution is independent.\n\n. . .\n\nIf we further assume that $\\Pr(X_j=x_j | Y = k)$ is Gaussian,\n\nThis is the same as QDA but with $\\Sigma_k$ Diagonal.\n\n. . .\n\nDon't have to assume Gaussian. Could do lots of stuff. \n\n\n# Next time...\n\nAnother linear classifier and transformations\n",
"supporting": [
"15-LDA-and-QDA_files"
],
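
The slides in this diff derive the LDA discriminant $\delta_k(x) = x^\top\widehat\Sigma^{-1}\overline X_k - \frac{1}{2}\overline X_k^\top\widehat\Sigma^{-1}\overline X_k + \log\widehat\pi_k$ and then fit the classifier with `MASS::lda()`. Below is a minimal sketch of that plug-in computation, assuming the two-class `dat1` tibble produced by `generate_lda_2d()` in the lecture code (columns `x1`, `x2`, `y` with classes 1 and 2); the helper names (`lda_delta`, `yhat_manual`) are illustrative, not part of the lecture code.

```r
## A sketch of the LDA plug-in rule, assuming dat1 from generate_lda_2d() above
library(MASS)

X <- as.matrix(dat1[, c("x1", "x2")])
y <- dat1$y                              # classes 1 and 2
n <- nrow(X); K <- 2

## class means and estimated class proportions
xbar <- lapply(1:K, function(k) colMeans(X[y == k, , drop = FALSE]))
pih  <- sapply(1:K, function(k) mean(y == k))

## pooled covariance with the (n - K) denominator from the slides
S <- Reduce(`+`, lapply(1:K, function(k) {
  Xc <- scale(X[y == k, , drop = FALSE], center = xbar[[k]], scale = FALSE)
  crossprod(Xc)
})) / (n - K)
Sinv <- solve(S)

## delta_k(x) = x' Sinv xbar_k - (1/2) xbar_k' Sinv xbar_k + log pi_k, for every row of X
lda_delta <- function(k) {
  drop(X %*% Sinv %*% xbar[[k]]) -
    0.5 * drop(t(xbar[[k]]) %*% Sinv %*% xbar[[k]]) + log(pih[k])
}

## classify by the larger discriminant and compare with MASS::lda()
yhat_manual <- ifelse(lda_delta(2) > lda_delta(1), 2L, 1L)
yhat_mass   <- as.integer(predict(lda(y ~ ., dat1))$class)
table(yhat_manual, yhat_mass)            # should agree on (essentially) every point
```

For QDA each class keeps its own $\widehat\Sigma_k$, so the quadratic term no longer cancels and the discriminant becomes $\delta_k(x) = -\tfrac{1}{2}\log|\widehat\Sigma_k| - \tfrac{1}{2}(x - \overline X_k)^\top\widehat\Sigma_k^{-1}(x - \overline X_k) + \log\widehat\pi_k$, which is what `qda(y ~ ., dat1)` fits; constraining each $\widehat\Sigma_k$ to be diagonal gives the Gaussian naïve Bayes classifier from the final slides.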
