class: center, middle, inverse, title-slide # Inference ## Lecture 10 ###
Louis SIRUGUE ### CPES 2 - Fall 2022 --- <style> .left-column {width: 70%;} .right-column {width: 30%;} </style> ### Quick reminder #### 1. Regression .pull-left[ <p style = "margin-bottom:-.75cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-2-1.png" width="100%" style="display: block; margin: auto;" /> <p style = "margin-bottom:-1.2cm;"></p> ``` ## ## Call: ## lm(formula = y ~ x, data = data) ## ## Coefficients: ## (Intercept) x ## -0.09129 1.01546 ``` ] -- .pull-right[ * This can be expressed with the **regression equation:** `$$y_i = \hat{\alpha} + \hat{\beta}x_i + \hat{\varepsilon_i}$$` * Where `\(\hat{\alpha}\)` is the **intercept** and `\(\hat{\beta}\)` the **slope** of the **line** `\(\hat{y_i} = \hat{\alpha} + \hat{\beta}x_i\)`, and `\(\hat{\varepsilon_i}\)` the **distances** between the points and the line <p style = "margin-bottom:1cm;"> `$$\hat{\beta} = \frac{\text{Cov}(x_i, y_i)}{\text{Var}(x_i)}$$` `$$\hat{\alpha} = \bar{y} - \hat{\beta} \times\bar{x}$$` * `\(\hat{\alpha}\)` and `\(\hat{\beta}\)` minimize the sum of the squared `\(\hat{\varepsilon_i}\)` ] --- ### Quick reminder #### 2. Multivariate regressions <ul> <li><b>Adding</b> a second independent <b>variable</b> to the regression amounts to <b>fitting a plane</b> instead of a line</li> <ul> <li>Adding a third variable would fit a hyperplane of dimension 3 and so on</li> </ul> </ul> -- .pull-left[ <center><b>Adding a continuous variable</b></center>
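<p style = "margin-bottom:.5cm;"></p> <i>A minimal sketch of the corresponding call (the variable names are placeholders):</i>

```r
# regress y on two continuous variables
lm(y ~ x1 + x2, data)
```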
] .pull-right[ <center><b>Adding a discrete variable</b></center>
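<p style = "margin-bottom:.5cm;"></p> <i>Same idea with a discrete variable (again a placeholder sketch):</i>

```r
# factor() treats z as discrete groups
lm(y ~ x + factor(z), data)
```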
] --- ### Quick reminder #### 3. Control variables <ul> <li>Adding a third variable \(z\) <b>removes</b> its potential <b>confounding effect</b> from the relationship between \(x\) and \(y\)</li> <ul> <li>As we move along the \(x\) axis, the <b>third variable remains constant</b></li> </ul> </ul> -- `$$y_i = \hat{\alpha} + \hat{\beta_1} x_i + \hat{\beta_2} z_i + \hat{\varepsilon_i}$$` -- <img src="slides_files/figure-html/unnamed-chunk-6-1.gif" width="55%" style="display: block; margin: auto;" /> --- ### Quick reminder #### 4. Interactions <ul> <li>Adding an <b>interaction</b> term with \(z\) allows us to see <b>how the effect</b> of \(x\) on \(y\) <b>varies</b> with \(z\)</li> <ul> <li>If \(z\) is <b>discrete</b>, it amounts to <b>regressing</b> \(y\) on \(x\) <b>separately</b> for each \(z\) group</li> </ul> </ul> -- `$$y_i = \hat{\alpha} + \hat{\beta_1} x_i + \hat{\beta_2} z_i + \hat{\beta_3}(x_i \times z_i) + \hat{\varepsilon_i}$$` -- <img src="slides_files/figure-html/unnamed-chunk-7-1.png" width="55%" style="display: block; margin: auto;" /> --- <h3>Today: Inference</h3> -- <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. Confidence interval</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Exact inference</b></li> <ul style = "list-style: none"> <li>2.1. Standard error</li> <li>2.2. Student-t distribution</li> <li>2.3. Confidence interval</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Hypothesis testing</b></li> <ul style = "list-style: none"> <li>3.1. P-value</li> <li>3.2. linearHypothesis()</li> </ul> </ul> <p style = "margin-bottom:1.75cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Today: Inference</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. Confidence interval</li> </ul> </ul> ] --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>In Part I of the course, we distinguished the <b>empirical moments</b> from the <b>theoretical moments</b></li> <ul> <li>For instance, the <i>empirical mean</i> is a <b>finite-sample estimation</b> of the <i>theoretical expected value</i></li> <li>The same principle applies to <b>regression coefficients</b></li> </ul> </ul> -- <ul> <li>Take our Great Gatsby Curve for instance</li> <ul> <li></li> <li></li> </ul> </ul> .left-column[ <p style = "margin-bottom:-.5cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-8-1.png" width="65%" style="display: block; margin: auto;" /> ] --- ### 1. Asymptotic inference #### 1.1.
Data generating process <ul> <li>In Part I of the course, we distinguished the <b>empirical moments</b> from the <b>theoretical moments</b></li> <ul> <li>For instance, the <i>empirical mean</i> is a <b>finite-sample estimation</b> of the <i>theoretical expected value</i></li> <li>The same principle applies to <b>regression coefficients</b></li> </ul> </ul> <ul> <li>Take our Great Gatsby Curve for instance</li> <ul> <li>Had our <b>sample</b> of countries been a bit <b>different</b>, our <b>coefficients</b> would <b>not be the same</b></li> <li></li> </ul> </ul> .left-column[ <p style = "margin-bottom:-.5cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-9-1.png" width="65%" style="display: block; margin: auto;" /> ] --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>In Part I of the course, we distinguished the <b>empirical moments</b> from the <b>theoretical moments</b></li> <ul> <li>For instance, the <i>empirical mean</i> is a <b>finite-sample estimation</b> of the <i>theoretical expected value</i></li> <li>The same principle applies to <b>regression coefficients</b></li> </ul> </ul> <ul> <li>Take our Great Gatsby Curve for instance</li> <ul> <li>Had our <b>sample</b> of countries been a bit <b>different</b>, our <b>coefficients</b> would <b>not be the same</b></li> <li>But they would all be <b>estimations</b> of a true relationship whose <b>data-generating</b> process is <b>unobserved</b></li> </ul> </ul> .left-column[ <p style = "margin-bottom:-.5cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-10-1.png" width="65%" style="display: block; margin: auto;" /> ] -- .right-column[ <p style = "margin-bottom:2cm;"></p> <center><b>Then how to assess the reliability of our estimation?</b></center> ] --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>For <b>simplicity</b>, let's work with a relationship whose <b>DGP is known</b></li> <ul> <li>Such that we can <b>understand how estimations</b> from random samples <b>behave relative to the DGP</b></li> <li>Let's <b>generate data in R!</b></li> </ul> </ul> -- <p style = "margin-bottom:1.25cm;"></p> <ul> <li>We can use <b>functions</b> that output <b>random draws from given distributions</b> whose parameters can be chosen</li> </ul> -- <p style = "margin-bottom:1.25cm;"></p> .pull-left[ <center><b>Normal distribution</b></center> ➜ Sample size, expected value, standard deviation ```r rnorm(n = 10, mean = 100, sd = 5) ``` ```text ## [1] 103.48482 102.78332 96.55622 ## [4] 96.46252 101.82291 103.84266 ## [7] 99.43827 104.40554 101.99053 ## [10] 96.93987 ``` ] .pull-right[ <center><b>Uniform distribution</b></center> ➜ Sample size, lower bound, upper bound ```r runif(n = 10, min = 4, max = 5) ``` ```text ## [1] 4.633493 4.213208 4.129372 ## [4] 4.478118 4.924074 4.598761 ## [7] 4.976171 4.731793 4.356727 ## [10] 4.431474 ``` ]
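---

### 1. Asymptotic inference
#### 1.1. Data generating process

<ul> <li>Note that these functions return <b>different draws at every call</b></li> <ul> <li>A quick sketch of how to make a simulation <b>reproducible</b> with <code>set.seed()</code> (the seed value 1 is arbitrary):</li> </ul> </ul>

```r
set.seed(1)                      # fix the state of the random number generator
rnorm(n = 3, mean = 100, sd = 5)

set.seed(1)                      # resetting the same seed...
rnorm(n = 3, mean = 100, sd = 5) # ...reproduces exactly the same draws
```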
--- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>Consider the following <b>data generating process:</b></li> </ul> `$$y = -2 + 0.4 \times x + \varepsilon \:\:\:\begin{cases} x \sim \mathcal{N}(4,\,25)\\ \varepsilon \sim \mathcal{N}(0,\,1) \end{cases}$$` -- <ul> <li>We can randomly draw <b>1,000 observations</b> from this DGP as follows</li> </ul> ```r dt <- tibble(x = rnorm(1000, 4, 5), e = rnorm(1000, 0, 1), y = -2 + (.4 * x) + e) ``` -- <p style = "margin-bottom:1cm"></p> <center><b>Check the empirical moments:</b></center> .pull-left[ ```r c(mean(dt$x), var(dt$x)) ``` ``` ## [1] 4.127842 26.816044 ``` ] .pull-right[ ```r c(mean(dt$e), var(dt$e)) ``` ``` ## [1] 0.009459055 1.070444754 ``` ] --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>Because the randomly drawn <b>sample is finite</b>, it <b>does not exactly match</b> the features of the DGP:</li> </ul> -- <img src="slides_files/figure-html/unnamed-chunk-18-1.png" width="85%" style="display: block; margin: auto;" /> --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>Same thing for the <b>coefficients</b> of the relationship between \(x\) and \(y\):</li> </ul> -- ```r lm(y ~ x, dt) ``` ``` ## ## Call: ## lm(formula = y ~ x, data = dt) ## ## Coefficients: ## (Intercept) x ## -2.0044 0.4033 ``` -- <p style = "margin-bottom:1.5cm"></p> <ul> <li>But what would happen if we were to <b>redo this operation many times?</b></li> <ol> <li><b>Draw a random sample</b> from the DGP</li> <li><b>Compute the slope</b> of the regression of \(y\) on \(x\)</li> <li>Do it many times and <b>store the coefficients</b></li> </ol> </ul> --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>We can <b>use a loop</b> to do that:</li> <ul> <li></li> <li></li> <li></li> </ul> </ul> <p style = "margin-bottom:1cm"></p> ```r # for (i in 1:1000) { # # # # # } ``` --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>We can <b>use a loop</b> to do that:</li> <ul> <li>First we create an empty vector</li> <li></li> <li></li> </ul> </ul> <p style = "margin-bottom:1cm"></p> ```r beta <- c() for (i in 1:1000) { # # # # # } ``` --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>We can <b>use a loop</b> to do that:</li> <ul> <li>First we create an empty vector</li> <li>Then we put the code in a loop</li> <li></li> </ul> </ul> <p style = "margin-bottom:1cm"></p> ```r beta <- c() for (i in 1:1000) { dt_i <- tibble(x = rnorm(1000, 4, 5), e = rnorm(1000, 0, 1), y = -2 + (.4 * x) + e) reg_i <- lm(y ~ x, dt_i) # } ``` --- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>We can <b>use a loop</b> to do that:</li> <ul> <li>First we create an empty vector</li> <li>Then we put the code in a loop</li> <li>And we fill the vector at each iteration</li> </ul> </ul> <p style = "margin-bottom:1cm"></p> ```r beta <- c() for (i in 1:1000) { dt_i <- tibble(x = rnorm(1000, 4, 5), e = rnorm(1000, 0, 1), y = -2 + (.4 * x) + e) reg_i <- lm(y ~ x, dt_i) beta <- c(beta, reg_i$coefficients[2]) } ```
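---

### 1. Asymptotic inference
#### 1.1. Data generating process

<ul> <li>A quick <b>sanity check</b> on the stored coefficients before plotting them (a sketch; the exact numbers change from one run to the next):</li> </ul>

```r
# the average estimate should be close to the true slope of 0.4,
# and sd(beta) measures how much estimates vary across samples
c(mean(beta), sd(beta))
```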
--- ### 1. Asymptotic inference #### 1.1. Data generating process <ul> <li>We now have <b>1,000 slope coefficients</b> from 1,000 random samples of the <b>same DGP</b></li> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-24-1.png" width="95%" style="display: block; margin: auto auto auto 0;" /> ] -- .right-column[ <p style = "margin-bottom:1.25cm"></p> <ul style = "text-align:left;"> <li style = "margin-bottom:1cm">Some random samples give higher estimates than others</li> <li style = "margin-bottom:1cm">But <b>on expectation</b> we get the <b>right coefficient!</b></li> <li style = "margin-bottom:1cm">The \(\hat{\beta}\)s actually follow a <b>normal distribution</b></li> <li style = "margin-bottom:1cm">And in the limit their mean would <b>converge towards \(\beta\)</b></li> </ul> ] --- ### 1. Asymptotic inference #### 1.2. Standardization <ul> <li style = "margin-bottom:.12cm">That is crucial information because it allows us to get back to something we know:</li> <ul> <li style = "margin-bottom:.12cm"></li> <li></li> <li></li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-25-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> ] .right-column[ <p style = "margin-bottom:1.25cm"></p> `$$\hat{\beta}$$` ] --- ### 1. Asymptotic inference #### 1.2. Standardization <ul> <li>That is crucial information because it allows us to get back to something we know:</li> <ul> <li style = "margin-bottom:.12cm">By subtracting \(\beta\) from the distribution of \(\hat{\beta}\)</li> <li></li> <li></li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-26-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> ] .right-column[ <p style = "margin-bottom:1.25cm"></p> `$$\hat{\beta}-\beta$$` ] --- ### 1. Asymptotic inference #### 1.2. Standardization <ul> <li>That is crucial information because it allows us to get back to something we know:</li> <ul> <li>By subtracting \(\beta\) from the distribution of \(\hat{\beta}\)</li> <li>And dividing by the standard deviation of \(\hat{\beta}\)</li> <li></li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-27-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> ] .right-column[ <p style = "margin-bottom:1.25cm"></p> `$$\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})}$$` ] --- ### 1. Asymptotic inference #### 1.2. Standardization <ul> <li>That is crucial information because it allows us to get back to something we know:</li> <ul> <li>By subtracting \(\beta\) from the distribution of \(\hat{\beta}\)</li> <li>And dividing by the standard deviation of \(\hat{\beta}\)</li> <li>With an infinite sample we would obtain the standard normal distribution</li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-28-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> ] .right-column[ <p style = "margin-bottom:1.25cm"></p> `$$\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})} \sim \mathcal{N}(0, 1)$$` ] --- ### 1. Asymptotic inference #### 1.3. Confidence interval <ul> <li>We can use the fact that we know the standard normal distribution:</li> <ul> <li></li> <li></li> <li></li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-29-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> <p style = "margin-top:-8cm;margin-left:17.5cm"> `$$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})} \sim \mathcal{N}(0, 1)$$` ] --- ### 1. Asymptotic inference #### 1.3.
Confidence interval <ul> <li>We can use the fact that we know the standard normal distribution:</li> <ul> <li>That 99% of the distribution lies between \(\pm\) 2.58</li> <li></li> <li></li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-30-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> <p style = "margin-top:-8cm;margin-left:17.5cm"> `$$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})} \sim \mathcal{N}(0, 1)$$` `$$\text{Pr}\left[-2.58<\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})}<2.58\right] \approx 99\%$$` ] --- ### 1. Asymptotic inference #### 1.3. Confidence interval <ul> <li>We can use the fact that we know the standard normal distribution:</li> <ul> <li>That 99% of the distribution lies between \(\pm\) 2.58</li> <li>That 95% of the distribution lies between \(\pm\) 1.96</li> <li></li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-31-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> <p style = "margin-top:-8cm;margin-left:17.5cm"> `$$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})} \sim \mathcal{N}(0, 1)$$` `$$\text{Pr}\left[-1.96<\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})}<1.96\right] \approx 95\%$$` ] --- ### 1. Asymptotic inference #### 1.3. Confidence interval <ul> <li>We can use the fact that we know the standard normal distribution:</li> <ul> <li>That 99% of the distribution lies between \(\pm\) 2.58</li> <li>That 95% of the distribution lies between \(\pm\) 1.96</li> <li>This is what allows us to determine confidence intervals</li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-32-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> <p style = "margin-top:-8cm;margin-left:17.5cm"> `$$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})} \sim \mathcal{N}(0, 1)$$` `$$\text{Pr}\left[-1.96<\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})}<1.96\right] \approx 95\%$$` `$$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\Downarrow$$` `$$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\text{Confidence interval}$$` ] --- ### 1. Asymptotic inference #### 1.3. Confidence interval `$$\text{Pr}\left[-1.96<\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})}<1.96\right] \approx 95\%$$` -- <p style = "margin-bottom:1.25cm"></p> `$$\text{Pr}\left[-1.96\times\color{SkyBlue}{\text{SD}(\hat{\beta})}<\hat{\beta}-\beta<1.96\times\color{SkyBlue}{\text{SD}(\hat{\beta})}\right] \approx 95\%$$` -- <p style = "margin-bottom:1.25cm"></p> `$$\text{Pr}\left[-1.96\times\text{SD}(\hat{\beta})\color{SkyBlue}{-\hat{\beta}}<-\beta<1.96\times\text{SD}(\hat{\beta})\color{SkyBlue}{-\hat{\beta}}\right] \approx 95\%$$` -- <p style = "margin-bottom:1.25cm"></p> `$$\text{Pr}\left[\color{SkyBlue}{+}1.96\times\text{SD}(\hat{\beta}) \color{SkyBlue}{+} \hat{\beta}\color{SkyBlue}{>}\beta\color{SkyBlue}{>}\color{SkyBlue}{-}1.96\times\text{SD}(\hat{\beta})\color{SkyBlue}{+}\hat{\beta}\right] \approx 95\%$$` -- <p style = "margin-bottom:1.25cm"></p> `$$\text{CI}_{95\%}: \:\hat{\beta}\pm 1.96\times\text{SD}(\hat{\beta})$$` --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference ✔</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. 
Confidence interval</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Exact inference</b></li> <ul style = "list-style: none"> <li>2.1. Standard error</li> <li>2.2. Student-t distribution</li> <li>2.3. Confidence interval</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Hypothesis testing</b></li> <ul style = "list-style: none"> <li>3.1. P-value</li> <li>3.2. linearHypothesis()</li> </ul> </ul> <p style = "margin-bottom:1.75cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference ✔</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. Confidence interval</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Exact inference</b></li> <ul style = "list-style: none"> <li>2.1. Standard error</li> <li>2.2. Student-t distribution</li> <li>2.3. Confidence interval</li> </ul> </ul> ] --- ### 2. Exact inference #### 2.1. Standard error <ul> <li>In the <b>previous section</b> I used phrases like <i>"in the limit"</i> or <b><i>"with an infinite sample"</i></b></li> <ul> <li>But <b>in practice</b> this is <b>not the case</b>, so things behave slightly differently</li> <li>And this requires a few <b>statistical adjustments</b> to account for that</li> </ul> </ul> <p style = "margin-bottom:1.15cm;"></p> -- `$$\hat{\beta}\pm 1.96\times\color{SkyBlue}{\text{SD}(\hat{\beta})}$$` <p style = "margin-bottom:1.15cm;"></p> <ul> <li>First we <b>cannot directly measure</b> the <b>standard deviation</b> of \(\hat{\beta}\)</li> <ul> <li>Indeed in practice we have <b>only one observation</b> of \(\hat{\beta}\), not its whole distribution</li> <li>But as with the mean, we can compute a <b>standard error</b> instead</li> </ul> </ul> -- <p style = "margin-bottom:1.15cm;"></p> .pull-left[ <center><b>Standard deviation</b></center> <p style = "margin-bottom:.5cm;"></p> <center><i>➜ Measures the amount of variability, or dispersion, from the individual data values to the mean</i></center> ] .pull-right[ <center><b>Standard error</b></center> <p style = "margin-bottom:.5cm;"></p> <center><i>➜ Measures how far an estimate from a given sample is likely to be from the true parameter of interest</i></center> ] --- ### 2. Exact inference #### 2.1. Standard error <ul> <li>We won't go through the theoretical computations together, but let's have a look at the formula:</li> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{se}(\hat{\beta}) = \sqrt{\widehat{\text{Var}(\hat{\beta})}} = \sqrt{\frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{(n-\#\text{parameters})\sum_{i = 1}^n(x_i-\bar{x})^2}}$$` <p style = "margin-bottom:1.25cm;"></p> -- <ul> <li>Notice that the variance, and thus the standard error of our estimate, decreases as:</li> </ul> --- ### 2. Exact inference #### 2.1.
Standard error <ul> <li>We won't go through the theoretical computations together, but let's have a look at the formula:</li> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{se}(\hat{\beta}) = \sqrt{\widehat{\text{Var}(\hat{\beta})}} = \sqrt{\frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{(\color{SkyBlue}{n}-\#\text{parameters})\sum_{i = 1}^n(x_i-\bar{x})^2}}$$` <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Notice that the variance, and thus the standard error of our estimate, decreases as:</li> <ul> <li>The <span style = "color:#87CEEB;">number of observations</span> gets bigger</li> </ul> </ul> --- ### 2. Exact inference #### 2.1. Standard error <ul> <li>We won't go through the theoretical computations together, but let's have a look at the formula:</li> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{se}(\hat{\beta}) = \sqrt{\widehat{\text{Var}(\hat{\beta})}} = \sqrt{\frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{(n-\color{SkyBlue}{\#\text{parameters}})\sum_{i = 1}^n(x_i-\bar{x})^2}}$$` <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Notice that the variance, and thus the standard error of our estimate, decreases as:</li> <ul> <li>The number of observations gets bigger</li> <li>The <span style = "color:#87CEEB;">number of parameters</span> decreases</li> </ul> </ul> --- ### 2. Exact inference #### 2.1. Standard error <ul> <li>We won't go through the theoretical computations together, but let's have a look at the formula:</li> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{se}(\hat{\beta}) = \sqrt{\widehat{\text{Var}(\hat{\beta})}} = \sqrt{\frac{\color{SkyBlue}{\sum_{i = 1}^n\hat{\varepsilon_i}^2}}{(n-\#\text{parameters})\color{SkyBlue}{\sum_{i = 1}^n(x_i-\bar{x})^2}}}$$` <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Notice that the variance, and thus the standard error of our estimate, decreases as:</li> <ul> <li>The number of observations gets bigger</li> <li>The number of parameters decreases</li> <li>The <span style = "color:#87CEEB;">sum of squared errors relative to the variance of \(x\)</span> decreases</li> </ul> </ul> --- ### 2. Exact inference #### 2.1. Standard error <ul> <li>We won't go through the theoretical computations together, but let's have a look at the formula:</li> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{se}(\hat{\beta}) = \sqrt{\widehat{\text{Var}(\hat{\beta})}} = \sqrt{\frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{(n-\#\text{parameters})\sum_{i = 1}^n(x_i-\bar{x})^2}}$$` <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Notice that the variance, and thus the standard error of our estimate, decreases as:</li> <ul> <li>The number of observations gets bigger</li> <li>The number of parameters decreases</li> <li>The sum of squared errors decreases relative to the variance of \(x\)</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul> <li>And as the <b>standard error</b> gets <b>bigger</b>, the <b>confidence interval</b> gets <b>wider</b>:</li> </ul> <p style = "margin-bottom:.75cm;"></p> `$$\hat{\beta}\pm 1.96\times\color{SkyBlue}{\text{se}(\hat{\beta})}$$`
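---

### 2. Exact inference
#### 2.1. Standard error

<ul> <li>A sketch of the first point using the simulated DGP from section 1.1: slope estimates from <b>larger samples</b> are <b>less dispersed</b> around the true value</li> </ul>

```r
# slope estimates over 1,000 samples of size n
beta_n <- function(n) {
  replicate(1000, {
    x <- rnorm(n, 4, 5)
    y <- -2 + (.4 * x) + rnorm(n, 0, 1)
    lm(y ~ x)$coefficients[2]
  })
}

sd(beta_n(100))   # dispersion of the estimates with n = 100
sd(beta_n(1000))  # noticeably smaller with n = 1000
```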
--- ### 2. Exact inference #### 2.2. Student-t distribution <ul> <li>But that's not it, remember that we took the value <b>1.96</b> from the <b>normal distribution</b></li> </ul> -- <p style = "margin-bottom:.75cm;"></p> `$$\hat{\beta}\pm \color{SkyBlue}{1.96}\times\text{se}(\hat{\beta})$$` -- <p style = "margin-bottom:.75cm;"></p> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-33-1.png" width="80%" style="display: block; margin: auto auto auto 0;" /> ] -- .right-column[ <ul style = "margin-top:-.5cm;margin-left:-3.25cm;"> <li style = "margin-bottom:.5cm">But the <b>normal distribution</b> is what \(\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})}\) converges to <i><b>in the limit</b></i></li> <li style = "margin-bottom:.5cm">In the <b>finite</b> world, \(\frac{\hat{\beta}-\beta}{\text{se}(\hat{\beta})}\) follows a slightly flatter distribution</li> <li>The <b>Student \(t\) distribution</b>, whose precise shape depends on the number of observations we have and parameters we estimate</li> </ul> ] --- ### 2. Exact inference #### 2.2. Student-t distribution <ul> <li>The Student \(t\) distribution <b>accounts for</b> the fact that the <b>sample is finite</b></li> <ul> <li>The lower the number of <b>degrees of freedom</b> (#observations - #parameters) the flatter</li> <li>And it <b>tends to a normal</b> distribution as the number of degrees of freedom \(\rightarrow \infty\)</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-34-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 2. Exact inference #### 2.2. Student-t distribution <ul> <li>But <b>we know the Student</b> \(t\) <b>distributions</b> just as well as the standard normal distribution</li> </ul> <p style = "margin-bottom:-.52cm"></p> -- <ul> <ul> <li>With 100 degrees of freedom, 95% of the distribution lies between \(\pm\) 1.98</li> <li></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-35-1.png" width="70%" style="display: block; margin: auto;" /> --- ### 2. Exact inference #### 2.2. Student-t distribution <ul> <li>But <b>we know the Student \(t\) distributions</b> just as well as the standard normal distribution</li> <ul> <li>With 100 degrees of freedom, 95% of the distribution lies between \(\pm\) 1.98</li> <li>With 3,000 degrees of freedom, 90% of the distribution lies between \(\pm\) 1.65</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-36-1.png" width="70%" style="display: block; margin: auto;" /> --- ### 2. Exact inference #### 2.2. Student-t distribution <ul> <li>So <b>instead of 1.96</b>, we must use the <b>value such that:</b></li> <ul> <li>The <b>desired percentage</b> of the distribution lies within \(\pm\) that value...</li> <li>For a Student \(t\) distribution with the relevant number of <b>degrees of freedom</b></li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> -- <ul> <li>We can get these values easily with the <b>qt()</b> function, indicating:</li> <ul> <li></li> <li></li> </ul> </ul> ```r qt( , ) ``` --- ### 2. Exact inference #### 2.2.
Student-t distribution <ul> <li>So <b>instead of 1.96</b>, we must use the <b>value such that:</b></li> <ul> <li>The <b>desired percentage</b> of the distribution lies within \(\pm\) that value...</li> <li>For a Student \(t\) distribution with the relevant number of <b>degrees of freedom</b></li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> <ul> <li>We can get these values easily with the <b>qt()</b> function, indicating:</li> <ul> <li>The share of the distribution below the value we're looking for (e.g., 0.975 for a 95% CI)</li> <li></li> </ul> </ul> ```r qt(.975, ) ``` --- ### 2. Exact inference #### 2.2. Student-t distribution <ul> <li>So <b>instead of 1.96</b>, we must use the <b>value such that:</b></li> <ul> <li>The <b>desired percentage</b> of the distribution lies within \(\pm\) that value...</li> <li>For a Student \(t\) distribution with the relevant number of <b>degrees of freedom</b></li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> <ul> <li>We can get these values easily with the <b>qt()</b> function, indicating:</li> <ul> <li>The share of the distribution below the value we're looking for (e.g., 0.975 for a 95% CI)</li> <li>The number of degrees of freedom of the Student \(t\) distribution (e.g., 88 observations - 2 parameters)</li> </ul> </ul> ```r qt(.975, 86) ``` -- ``` ## [1] 1.987934 ``` -- <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Denote this value \(t(\text{df})_{1-\frac{\alpha}{2}}\)</li> <ul> <li>With \(\alpha\) equal to \(1 -\) the confidence level</li> <li>And \(\text{df}\) the number of degrees of freedom</li> </ul> </ul> --- ### 2. Exact inference #### 2.3. Confidence interval <ul> <li>The formula for the confidence interval in a finite sample is thus:</li> </ul> `$$\hat{\beta}\pm \color{SkyBlue}{t(\text{df})_{1-\frac{\alpha}{2}}}\times\text{se}(\hat{\beta})$$` -- <ul> <li>The confidence interval widens as:</li> <ul> <li>The confidence level increases</li> <li></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-41-1.png" width="62%" style="display: block; margin: auto;" /> --- ### 2. Exact inference #### 2.3. Confidence interval <ul> <li>The formula for the confidence interval in a finite sample is thus:</li> </ul> `$$\hat{\beta}\pm \color{SkyBlue}{t(\text{df})_{1-\frac{\alpha}{2}}}\times\text{se}(\hat{\beta})$$` <ul> <li>The confidence interval widens as:</li> <ul> <li>The confidence level increases</li> <li>The number of degrees of freedom decreases</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-42-1.png" width="62%" style="display: block; margin: auto;" /> --- class: inverse, hide-logo ### Practice #### 1) Import the `ggcurve.csv` dataset -- <p style = "margin-bottom:1cm;"></p> #### 2) Regress the IGE on the Gini coefficient and store the estimated regression parameters -- <p style = "margin-bottom:1cm;"></p> #### 3) Compute the 95% confidence interval of the regression slope <p style = "margin-bottom:1cm;"></p> `$$\hat{\beta}\pm t(\text{df})_{1-\frac{\alpha}{2}}\times\text{se}(\hat{\beta})$$` <p style = "margin-bottom:1cm;"></p> `$$\text{se}(\hat{\beta}) = \sqrt{\frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{(n-\#\text{parameters})\sum_{i = 1}^n(x_i-\bar{x})^2}}$$` -- <p style = "margin-bottom:1.5cm;"></p> <center><h3><i>You've got 10 minutes!</i></h3></center>
--- class: inverse, hide-logo ### Solution #### 1) Import the `ggcurve.csv` dataset -- ```r ggcurve <- read.csv("C:/User/Documents/ggcurve.csv") ``` -- #### 2) Regress the IGE on the Gini coefficient and store the estimated regression parameters -- ```r model <- lm(ige ~ gini, ggcurve) model ``` ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Coefficients: ## (Intercept) gini ## -0.09129 1.01546 ``` -- ```r alpha <- model$coefficients[1] beta <- model$coefficients[2] ``` --- class: inverse, hide-logo ### Solution #### 3) Compute the 95% confidence interval of the regression slope ```r se_dat <- ggcurve %>% mutate(fit = alpha + gini * beta, e = ige - fit) %>% summarise(se = sqrt(sum(e^2)/((n()-2)*sum((gini-mean(gini))^2)))) se_dat$se ``` ``` ## [1] 0.2642477 ``` -- ```r beta - se_dat$se * qt(.975, nrow(ggcurve) - 2) ``` ``` ## gini ## 0.4642511 ``` ```r beta + se_dat$se * qt(.975, nrow(ggcurve) - 2) ``` ``` ## gini ## 1.566673 ``` --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference ✔</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. Confidence interval</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Exact inference ✔</b></li> <ul style = "list-style: none"> <li>2.1. Standard error</li> <li>2.2. Student-t distribution</li> <li>2.3. Confidence interval</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Hypothesis testing</b></li> <ul style = "list-style: none"> <li>3.1. P-value</li> <li>3.2. linearHypothesis()</li> </ul> </ul> <p style = "margin-bottom:1.75cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference ✔</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. Confidence interval</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Exact inference ✔</b></li> <ul style = "list-style: none"> <li>2.1. Standard error</li> <li>2.2. Student-t distribution</li> <li>2.3. Confidence interval</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Hypothesis testing</b></li> <ul style = "list-style: none"> <li>3.1. P-value</li> <li>3.2. linearHypothesis()</li> </ul> </ul> ] --- ### 3. Hypothesis testing #### 3.1.
P-value <ul> <li>We now have the <b>95% confidence interval</b> for our estimate:</li> <ul> <li>Our estimate of \(\beta\) is 1.02</li> <li>And we are 95% sure that \(\beta\) lies between 0.46 and 1.57</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> `$$0.46 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, 1.02 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, 1.57\\ |\underbrace{-----------}_{t(\text{df})_{1-\frac{\alpha}{2}}\times\text{se}(\hat{\beta})}\cdot\underbrace{-----------}_{t(\text{df})_{1-\frac{\alpha}{2}}\times\text{se}(\hat{\beta})}|$$` -- <p style = "margin-bottom:1cm;"></p> <ul> <li>Note that in our <b>confidence interval</b> formula:</li> <ul> <li>The <b>standard error</b> and the relevant Student \(t\) <b>distribution</b> are <b>given</b></li> <li>But the <b>confidence level</b> \(1 - \alpha\) was <b>chosen arbitrarily</b></li> </ul> </ul> -- <p style = "margin-bottom:1cm;"></p> ➜ Setting a <b>higher confidence</b> level would <b>widen the confidence interval</b> ➜ Allowing for a <b>lower confidence</b> level would <b>narrow the confidence interval</b> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>So far we framed the problem as:</li> </ul> <center><i>"What are the values \(\beta\) is likely to take under a given confidence level?"</i></center> <p style = "margin-bottom:1.5cm;"></p> -- <ul> <li>But we could also think of it as:</li> </ul> <center><i>"Under which confidence level is \(\beta\) likely to take a given value?"</i></center> -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>And this is actually a very <b>practical way of framing the question:</b></li> <ul> <li>To (in)validate the predictions from a theoretical model</li> <li>To know under which confidence level \(\beta\) is likely to be \(\neq 0\) at all</li> </ul> </ul> -- <p style = "margin-bottom:1.5cm;"></p> <center><b><i>➜ But how to answer such questions in practice?</i></b></center> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>We can start from the fact that even though we do not know \(\beta\), we know that:</li> </ul> $$\frac{\hat{\beta} - \beta}{\text{se}(\hat{\beta})} \sim t(\text{df}) $$ <img src="slides_files/figure-html/unnamed-chunk-49-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>And that in this distribution some values are quite plausible:</li> </ul> `$$\frac{\hat{\beta} - \color{SkyBlue}{1}}{\text{se}(\hat{\beta})}$$` <img src="slides_files/figure-html/unnamed-chunk-50-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>And some are far less plausible:</li> </ul> `$$\frac{\hat{\beta} - \color{SkyBlue}{2.5}}{\text{se}(\hat{\beta})}$$` <img src="slides_files/figure-html/unnamed-chunk-51-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>But because the distribution is <b>continuous:</b></li> <ul> <li></li> <li></li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-52-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1.
P-value <ul> <li>But because the distribution is <b>continuous:</b></li> <ul> <li>The <b>probability</b> of drawing any <b>exact value</b> would be <b>0</b></li> <li></li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-53-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>But because the distribution is <b>continuous:</b></li> <ul> <li>The <b>probability</b> of drawing any <b>exact value</b> would be <b>0</b></li> <li>We can only compute the <b>probability of falling below</b> that value</li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-54-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>But because the distribution is <b>continuous:</b></li> <ul> <li>The <b>probability</b> of drawing any <b>exact value</b> would be <b>0</b></li> <li><b>Or of falling above</b> that value <b>if</b> it is <b>negative</b></li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-55-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>But <b>generally</b> what makes sense is to know the <b>chances of falling <i>that far</i> from 0:</b></li> <ul> <li></li> <li></li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-56-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>But <b>generally</b> what makes sense is to know the <b>chances of falling <i>that far</i> from 0:</b></li> <ul> <li>So we take 1 - the probability of falling below the absolute value</li> <li></li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-57-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>But <b>generally</b> what makes sense is to know the <b>chances of falling <i>that far</i> from 0:</b></li> <ul> <li>So we take 1 - the probability of falling below the absolute value</li> <li>And we multiply it by 2</li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <img src="slides_files/figure-html/unnamed-chunk-58-1.png" width="65%" style="display: block; margin: auto;" />
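---

### 3. Hypothesis testing
#### 3.1. P-value

<ul> <li>As a preview of the next slides, a sketch of that computation in R for the value 2.5 shown earlier (assuming 20 degrees of freedom, as in our data):</li> </ul>

```r
# area further than 2.5 from 0, in both tails of a t(20) distribution
2 * (1 - pt(abs(2.5), df = 20))  # roughly 0.02
```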
--- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>The <b>resulting area</b> is what we call a <b>p-value</b></li> <ul> <li>It is the probability of finding an estimate at least as far from the hypothesized value as \(\hat{\beta}\) is, if that value were the true \(\beta\)</li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> -- <ul> <li>Consider finding \(\hat{\beta} = 4\) and a hypothesized value of 3 for \(\beta\)</li> <ul> <li>A p-value of 5% indicates that there is only a 5% chance of finding \(\hat{\beta} = 4\) if \(\beta = 3\)</li> <li>Below that threshold we would reject the hypothesis that \(\beta = 3\) at the 95% confidence level</li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> -- <ul> <li>Notice that in this example, the 95% confidence interval of \(\hat{\beta}\) would not include the value 3</li> <ul> <li>With a hypothesized value equal to the bound of a confidence interval the p-value would equal 1 - the corresponding confidence level</li> <li>So a p-value lower than \(\alpha\) means that the hypothesized value is outside the \((1-\alpha)\)% confidence interval</li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> <center><i>➜ Let's go through a formal example with our data</i></center> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>Can we <b>reject</b> at the 95% confidence level that \(\beta = 0\)?</li> </ul> ```r beta ``` ``` ## gini ## 1.015462 ``` -- <ul> <li>We should start by hypothesizing that \(\beta = 0\)</li> <ul> <li>This is what we call the <b><i>"null hypothesis"</i></b> \(H_0\)</li> </ul> </ul> <p style = "margin-bottom:1.25cm"></p> `$$H_0: \beta = 0$$` -- <center>Under \(H_0\):</center> `$$\frac{\hat{\beta} - 0}{\text{se}(\hat{\beta})} \sim t(\text{df})$$` --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>We should find the <b>area below</b> \((\hat{\beta} - 0)/\text{se}(\hat{\beta})\) in a Student \(t\) distribution with the right number of \(\text{df}\)</li> <ul> <li>\((\hat{\beta} - 0)/\text{se}(\hat{\beta})\) is what we call <b>the \(t\)-stat</b></li> </ul> </ul> -- ```r (beta - 0) / se_dat$se ``` ``` ## gini ## 3.842842 ``` <p style = "margin-bottom:1.25cm"></p> -- <ul> <li>While <b>qt()</b> gave us the <b>value</b> for a certain probability, <b>pt()</b> gives the <b>probability</b> for a given value:</li> <ul> <li></li> <li></li> </ul> </ul> ```r pt( , ) ``` --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>We should find the <b>area below</b> \((\hat{\beta} - 0)/\text{se}(\hat{\beta})\) in a Student \(t\) distribution with the right number of \(\text{df}\)</li> <ul> <li>\((\hat{\beta} - 0)/\text{se}(\hat{\beta})\) is what we call <b>the \(t\)-stat</b></li> </ul> </ul> ```r (beta - 0) / se_dat$se ``` ``` ## gini ## 3.842842 ``` <p style = "margin-bottom:1.25cm"></p> <ul> <li>While <b>qt()</b> gave us the <b>value</b> for a certain probability, <b>pt()</b> gives the <b>probability</b> for a given value:</li> <ul> <li>Put in the <b>t-stat</b></li> <li></li> </ul> </ul> ```r pt((beta - 0) / se_dat$se, ) ``` --- ### 3. Hypothesis testing #### 3.1.
P-value <ul> <li>We should find the <b>area below</b> \((\hat{\beta} - 0)/\text{se}(\hat{\beta})\) in a Student \(t\) distribution with the right number of \(\text{df}\)</li> <ul> <li>\((\hat{\beta} - 0)/\text{se}(\hat{\beta})\) is what we call <b>the \(t\)-stat</b></li> </ul> </ul> ```r (beta - 0) / se_dat$se ``` ``` ## gini ## 3.842842 ``` <p style = "margin-bottom:1.25cm"></p> <ul> <li>While <b>qt()</b> gave us the <b>value</b> for a certain probability, <b>pt()</b> gives the <b>probability</b> for a given value:</li> <ul> <li>Put in the <b>t-stat</b></li> <li>And the <b>degrees of freedom</b></li> </ul> </ul> ```r pt((beta - 0) / se_dat$se, nrow(ggcurve) - 2) ``` ``` ## gini ## 0.9994921 ``` --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>We must then:</li> <ul> <li>Take <b>1 - this probability</b> (area above the t-stat)</li> <li></li> </ul> </ul> ```r 1 - pt(abs((beta - 0) / se_dat$se), nrow(ggcurve) - 2) ``` ``` ## gini ## 0.0005078528 ``` --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>We must then:</li> <ul> <li>Take <b>1 - this probability</b> (area above the t-stat)</li> <li>And <b>multiply it by 2</b> (consider the absolute distance and not the signed distance)</li> </ul> </ul> ```r 2 * (1 - pt(abs((beta - 0) / se_dat$se), nrow(ggcurve) - 2)) ``` ``` ## gini ## 0.001015706 ``` -- <p style = "margin-bottom:1cm"></p> <ul> <li>The <b>p-value</b> is <b>lower than 1%:</b></li> <ul> <li>We can <b>reject at the 99% confidence level</b> that \(\beta = 0\)</li> <li>In that case we say that \(\hat{\beta}\) is <b>significantly different from 0</b> at the 1% significance level</li> </ul> </ul> -- <p style = "margin-bottom:1cm"></p> <ul> <li>But the <b>p-value</b> is <b>greater than 0.1%:</b></li> <ul> <li>We <b>cannot reject at the 99.9% confidence level</b> that \(\beta = 0\)</li> <li>In that case we say that \(\hat{\beta}\) is <b>not significantly different from 0</b> at the 0.1% significance level</li> </ul> </ul> --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>By default, the <b>summary()</b> function <b>tests</b> whether or not each coefficient is significantly <b>different from 0</b></li> <ul> <li></li> </ul> </ul> ```r summary(lm(ige ~ gini, ggcurve)) ``` --- ### 3. Hypothesis testing #### 3.1. P-value <ul> <li>By default, the <b>summary()</b> function <b>tests</b> whether or not each coefficient is significantly <b>different from 0</b></li> <ul> <li>You can <b>extract</b> the information from the <b>$coefficients</b> attribute of the output</li> </ul> </ul> ```r summary(lm(ige ~ gini, ggcurve))$coefficients ``` ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.09129311 0.1287045 -0.7093234 0.486311455 ## gini 1.01546204 0.2642477 3.8428420 0.001015706 ``` -- <p style = "margin-bottom:.8cm"></p> <ul> <li>For each coefficient it indicates:</li> <ul> <li>The standard error</li> <li>The \(t\)-stat \((H_0:\beta=0)\)</li> <li>The p-value \((H_0:\beta=0)\)</li> </ul> </ul> -- <ul> <li>The output of the <b>summary()</b> function is great for getting a <b>quick overview</b> of the model:</li> </ul> ```r summary(lm(ige ~ gini, ggcurve)) ``` --- ### 3. Hypothesis testing #### 3.1. P-value <p style = "margin-bottom:1.25cm"></p> .left-column[ <p style = "margin-bottom:-.53cm;"></p> ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.188991 -0.088238 -0.000855 0.047284 0.252310 ## ## Coefficients: ## Estimate Std.
Error t value Pr(>|t|) ## (Intercept) -0.09129 0.12870 -0.709 0.48631 ## gini 1.01546 0.26425 3.843 0.00102 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.1159 on 20 degrees of freedom ## Multiple R-squared: 0.4247, Adjusted R-squared: 0.396 ## F-statistic: 14.77 on 1 and 20 DF, p-value: 0.001016 ``` ] --- ### 3. Hypothesis testing #### 3.1. P-value <p style = "margin-bottom:1.25cm"></p> .left-column[ <p style = "margin-bottom:-.53cm;"></p> ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.188991 -0.088238 -0.000855 0.047284 0.252310 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.09129 0.12870 -0.709 0.48631 ## gini 1.01546 0.26425 3.843 0.00102 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.1159 on 20 degrees of freedom ## Multiple R-squared: 0.4247, Adjusted R-squared: 0.396 ## F-statistic: 14.77 on 1 and 20 DF, p-value: 0.001016 ``` ] .right-column[ <b>🠄</b> Command ] --- ### 3. Hypothesis testing #### 3.1. P-value <p style = "margin-bottom:1.25cm"></p> .left-column[ <p style = "margin-bottom:-.53cm;"></p> ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.188991 -0.088238 -0.000855 0.047284 0.252310 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.09129 0.12870 -0.709 0.48631 ## gini 1.01546 0.26425 3.843 0.00102 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.1159 on 20 degrees of freedom ## Multiple R-squared: 0.4247, Adjusted R-squared: 0.396 ## F-statistic: 14.77 on 1 and 20 DF, p-value: 0.001016 ``` ] .right-column[ <b>🠄</b> Command <p style = "margin-bottom:1.5cm;"></p> <b>🠄</b> Residuals distribution ] --- ### 3. Hypothesis testing #### 3.1. P-value <p style = "margin-bottom:1.25cm"></p> .left-column[ <p style = "margin-bottom:-.53cm;"></p> ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.188991 -0.088238 -0.000855 0.047284 0.252310 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.09129 0.12870 -0.709 0.48631 ## gini 1.01546 0.26425 3.843 0.00102 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.1159 on 20 degrees of freedom ## Multiple R-squared: 0.4247, Adjusted R-squared: 0.396 ## F-statistic: 14.77 on 1 and 20 DF, p-value: 0.001016 ``` ] .right-column[ <b>🠄</b> Command <p style = "margin-bottom:1.5cm;"></p> <b>🠄</b> Residuals distribution <p style = "margin-bottom:1.75cm;"></p> <b>🠄</b> Coefs, s.e., t-/p-values ] --- ### 3. Hypothesis testing #### 3.1. P-value <p style = "margin-bottom:1.25cm"></p> .left-column[ <p style = "margin-bottom:-.53cm;"></p> ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.188991 -0.088238 -0.000855 0.047284 0.252310 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.09129 0.12870 -0.709 0.48631 ## gini 1.01546 0.26425 3.843 0.00102 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 ## ## Residual standard error: 0.1159 on 20 degrees of freedom ## Multiple R-squared: 0.4247, Adjusted R-squared: 0.396 ## F-statistic: 14.77 on 1 and 20 DF, p-value: 0.001016 ``` ] .right-column[ <b>🠄</b> Command <p style = "margin-bottom:1.5cm;"></p> <b>🠄</b> Residuals distribution <p style = "margin-bottom:1.75cm;"></p> <b>🠄</b> Coefs, s.e., t-/p-values <p style = "margin-bottom:1.25cm;"></p> <b>🠄</b> Significance ] --- ### 3. Hypothesis testing #### 3.1. P-value <p style = "margin-bottom:1.25cm"></p> .left-column[ <p style = "margin-bottom:-.53cm;"></p> ``` ## ## Call: ## lm(formula = ige ~ gini, data = ggcurve) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.188991 -0.088238 -0.000855 0.047284 0.252310 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.09129 0.12870 -0.709 0.48631 ## gini 1.01546 0.26425 3.843 0.00102 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.1159 on 20 degrees of freedom ## Multiple R-squared: 0.4247, Adjusted R-squared: 0.396 ## F-statistic: 14.77 on 1 and 20 DF, p-value: 0.001016 ``` ] .right-column[ <b>🠄</b> Command <p style = "margin-bottom:1.5cm;"></p> <b>🠄</b> Residuals distribution <p style = "margin-bottom:1.75cm;"></p> <b>🠄</b> Coefs, s.e., t-/p-values <p style = "margin-bottom:1.25cm;"></p> <b>🠄</b> Significance <p style = "margin-bottom:1cm;"></p> <b>🠄</b> df and advanced stats ] --- ### 3. Hypothesis testing #### 3.2. linearHypothesis() <ul> <li>But the <b>linearHypothesis()</b> function from the <b>car</b> package allows us to <b>easily test</b> other <b>hypotheses:</b></li> <ul> <li></li> <li></li> </ul> </ul> ```r linearHypothesis( , ) ``` --- ### 3. Hypothesis testing #### 3.2. linearHypothesis() <ul> <li>But the <b>linearHypothesis()</b> function from the <b>car</b> package allows us to <b>easily test</b> other <b>hypotheses:</b></li> <ul> <li>You must provide the <b>model</b></li> <li></li> </ul> </ul> ```r linearHypothesis(lm(ige ~ gini, ggcurve), ) ``` --- ### 3. Hypothesis testing #### 3.2. linearHypothesis() <ul> <li>But the <b>linearHypothesis()</b> function from the <b>car</b> package allows us to <b>easily test</b> other <b>hypotheses:</b></li> <ul> <li>You must provide the <b>model</b></li> <li>And the <b>hypothesis</b> (referring to coefficients as in the summary)</li> </ul> </ul> ```r linearHypothesis(lm(ige ~ gini, ggcurve), "gini = 0") ``` -- ``` ## Linear hypothesis test ## ## Hypothesis: ## gini = 0 ## ## Model 1: restricted model ## Model 2: ige ~ gini ## ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 21 0.46733 ## 2 20 0.26883 1 0.1985 14.767 0.001016 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` --- ### 3. Hypothesis testing #### 3.2. linearHypothesis() <ul> <li>You can also test <b>more complex hypotheses</b></li> <ul> <li>Like equality between coefficients</li> </ul> </ul> ```r linearHypothesis(lm(ige ~ gini, ggcurve), "gini = (Intercept)") ``` ``` ## Linear hypothesis test ## ## Hypothesis: ## - (Intercept) + gini = 0 ## ## Model 1: restricted model ## Model 2: ige ~ gini ## ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 21 0.37634 ## 2 20 0.26883 1 0.10751 7.9983 0.01039 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ```
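---

### 3. Hypothesis testing
#### 3.2. linearHypothesis()

<ul> <li>The same syntax works for any <b>hypothesized value</b>, not only 0 (a sketch; the value 1 here is an arbitrary example):</li> </ul>

```r
# can we reject that the true slope equals 1?
linearHypothesis(lm(ige ~ gini, ggcurve), "gini = 1")
```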
--- ### 3. Hypothesis testing #### 3.2. linearHypothesis() <ul> <li>You can also test <b>more complex hypotheses</b></li> <ul> <li>Like equality between coefficients, or joint hypotheses (relying on a generalization of the t-test called the <i>F-test</i>)</li> </ul> </ul> ```r linearHypothesis(lm(ige ~ gini, ggcurve), c("gini = 0", "(Intercept) = 0")) ``` ``` ## Linear hypothesis test ## ## Hypothesis: ## gini = 0 ## (Intercept) = 0 ## ## Model 1: restricted model ## Model 2: ige ~ gini ## ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 22 3.8841 ## 2 20 0.2688 2 3.6153 134.48 2.523e-12 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Asymptotic inference ✔</b></li> <ul style = "list-style: none"> <li>1.1. Data generating process</li> <li>1.2. Standardization</li> <li>1.3. Confidence interval</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Exact inference ✔</b></li> <ul style = "list-style: none"> <li>2.1. Standard error</li> <li>2.2. Student-t distribution</li> <li>2.3. Confidence interval</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Hypothesis testing ✔</b></li> <ul style = "list-style: none"> <li>3.1. P-value</li> <li>3.2. linearHypothesis()</li> </ul> </ul> <p style = "margin-bottom:1.75cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- ### 4. Wrap up! #### Data generating process <ul> <li>In practice we estimate coefficients on a <b>given realization of a data generating process</b></li> <ul> <li>So the <b>true coefficient</b> is <b>unobserved</b></li> <li>But our <b>estimation</b> is <b>informative</b> about the values the true coefficient is likely to take</li> </ul> </ul> .left-column[ <img src="slides_files/figure-html/unnamed-chunk-84-1.png" width="90%" style="display: block; margin: auto auto auto 0;" /> ] .right-column[ <p style = "margin-bottom:3cm"></p> `$$\frac{\hat{\beta}-\beta}{\text{SD}(\hat{\beta})} \sim \mathcal{N}(0, 1)$$` ] --- ### 4. Wrap up! #### Confidence interval <ul> <li>This allows us to infer a <b>confidence interval:</b></li> </ul> `$$\hat{\beta}\pm t(\text{df})_{1-\frac{\alpha}{2}}\times\text{se}(\hat{\beta})$$` <p style = "margin-bottom:1.5cm;"></p> -- <ul> <li>Where \(t(\text{df})_{1-\frac{\alpha}{2}}\) is the value from a <b>Student \(t\) distribution</b></li> <ul> <li>With the relevant number of <b>degrees of freedom</b> \(\text{df}\) (n - #parameters)</li> <li>And the desired <b>confidence level</b> \(1-\alpha\)</li> </ul> </ul> <p style = "margin-bottom:1.5cm;"></p> -- <ul> <li>And where \(\text{se}(\hat{\beta})\) denotes the <b>standard error</b> of \(\hat{\beta}\):</li> </ul> `$$\text{se}(\hat{\beta}) = \sqrt{\widehat{\text{Var}(\hat{\beta})}} = \sqrt{\frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{(n-\#\text{parameters})\sum_{i = 1}^n(x_i-\bar{x})^2}}$$`
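---

### 4. Wrap up!
#### Confidence interval

<ul> <li>In practice you rarely need the manual computation: a sketch with the built-in <b>confint()</b> function, which should match the interval we derived in the practice session</li> </ul>

```r
# 95% confidence interval for each estimated coefficient
confint(lm(ige ~ gini, ggcurve), level = .95)
```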
--- ### 4. Wrap up! #### P-value <ul> <li>It also allows us to <b>test</b> how likely \(\beta\) is to be <b>different from a given value:</b></li> <ul> <li>If the <b>p-value</b> < 5%, we can <b>reject</b> that \(\beta\) equals the <b>hypothesized value</b> at the 95% confidence level</li> <li>This threshold, very common in Economics, implies a 1-in-20 chance of rejecting a hypothesized value that is actually true</li> </ul> </ul> -- ```r linearHypothesis(lm(ige ~ gini, ggcurve), "gini = 0") ``` ``` ## Linear hypothesis test ## ## Hypothesis: ## gini = 0 ## ## Model 1: restricted model ## Model 2: ige ~ gini ## ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 21 0.46733 ## 2 20 0.26883 1 0.1985 14.767 0.001016 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ```
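<p style = "margin-bottom:1cm"></p> <ul> <li>Note that with a <b>single restriction</b>, the \(F\)-statistic above is simply the <b>square of the \(t\)-stat</b> we computed by hand:</li> </ul>

```r
# our t-stat was 3.842842; squaring it recovers the F above
((beta - 0) / se_dat$se)^2  # = 14.767
```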