class: center, middle, inverse, title-slide .title[ # Interpretation ] .subtitle[ ## Lecture 12 ] .author[ ###
Louis SIRUGUE ] .date[ ### CPES 2 - Fall 2023 ] --- <style> .left-column {width: 70%;} .right-column {width: 30%;} </style> ### Quick reminder #### Omitted variable bias <ul> <li>If a third <b>variable</b> is correlated with both \(x\) and \(y\), it would <b>bias the relationship</b></li> <ul> <li>We must then <b>control</b> for such variables</li> <li>And if we can't, we must acknowledge that our estimate cannot be interpreted <i><b>'ceteris paribus'</b></i> and is not causal</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-2-1.png" width="75%" style="display: block; margin: auto;" /> --- ### Quick reminder #### Functional form <ul> <li>Failing to capture the <b>right functional form</b> might also lead to biased estimates:</li> <ul> <li>Polynomial order, interactions, logs, discretization matter</li> <li><b>Visualizing the relationship</b> is key</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-3-1.png" width="75%" style="display: block; margin: auto;" /> --- ### Quick reminder #### Selection bias <ul> <li><b>Self-selection</b> is also a common threat to causality</li> </ul> <p style = "margin-bottom:.75cm;"></p> <ul> <li>What is the impact of moving to a better neighborhood on your children's outcomes?</li> <ul> <li>We cannot just regress children's outcomes on a mobility dummy</li> <li>Individuals who move may be different from those who stay: <b>self-selection issue</b></li> <li>Here it is not that the sample is not representative of the population, but that <b>the outcomes of those who stayed are different from the outcomes those who moved would have had, if they had stayed</b></li> </ul> </ul> -- <p style = "margin-bottom:1.25cm;"></p> #### Simultaneity <ul> <li>Consider the relationship between <b>crime</b> rate and <b>police coverage</b> intensity</li> </ul> <p style = "margin-bottom:.75cm;"></p> <ul> <li>What is the <b>direction of the relationship?</b></li> <ul> <li>We cannot just regress crime rate on police intensity</li> <li>It's likely
that more crime would cause a positive response in police activity</li> <li>And also that police activity would deter crime</li> </ul> </ul> --- ### Quick reminder #### Measurement error <ul> <li><b>Measurement error</b> in the independent variable also induces a bias</li> <ul> <li>The resulting estimate would mechanically be <b>biased toward zero</b> <i>(attenuation bias)</i></li> <li>The <b>noisier</b> the measure, the <b>larger the bias</b></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-4-1.png" width="75%" style="display: block; margin: auto;" /> --- ### Quick reminder #### Randomized Controlled Trials <ul> <li>A Randomized Controlled Trial (RCT) is a type of experiment in which the thing we want to know the impact of (called the treatment) is <b>randomly allocated</b> in the population</li> <ul> <li>The two <b>groups</b> would then have the same characteristics in expectation, and would be <b>comparable</b></li> <li>It is a way to obtain <b>causality</b> from randomness</li> </ul> </ul> -- <p style = "margin-bottom:1cm;"></p> <ul> <li>RCTs are very <b>powerful tools</b> to sort out issues of:</li> <ul> <li>Omitted variables</li> <li>Selection bias</li> <li>Simultaneity</li> </ul> </ul> -- <p style = "margin-bottom:1cm;"></p> <ul> <li>But RCTs are <b>not immune</b> to every problem:</li> <ul> <li>The sample must be representative and large enough</li> <li>Participants should comply with their treatment status</li> <li>Independent variables must not be noisy measures of the variable of interest</li> <li>...</li> </ul> </ul> --- <h3>Today: Interpretation</h3> -- <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2.
Practice interpretation</b></li> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Regression tables</b></li> <ul style = "list-style: none"> <li>3.1. Layout</li> <li>3.2. Reported significance</li> <li>3.3. R squared</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Today: Interpretation</h3> <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> ] --- ### 1. Point estimates #### 1.1. Continuous variables <ul> <li>In this first part, we're going to consider the <b>relationship</b> between:</li> <ul> <li>The <b>income level</b> of young parents</li> <li>The <b>health</b> of their <b>newborn</b></li> </ul> </ul> <p style = "margin-bottom:.8cm;"></p> -- <ul> <li>Consider first the following specification of the two variables:</li> <ul> <li>A continuous measure of <b>annual household income in euros</b></li> <li>A continuous measure of <b>birth weight in grams</b></li> </ul> </ul> <p style = "margin-bottom:.8cm;"></p> `$$\text{Birth weight}_i = \alpha + \beta \times \text{Household income}_i + \varepsilon_i$$` -- ```r lm(birth_weight ~ household_income, data)$coefficients ``` ``` ## (Intercept) household_income ## 3.134528e+03 2.213871e-03 ``` <p style = "margin-bottom:.85cm;"></p> <center><b>➜ How would you interpret \(\hat{\beta}\) here?</b> <i>(Note that</i> e+03 <i>and</i> e-03 <i>mean \(\times 10^3\) and \(\times 10^{-3}\))</i></center> --- ### 1. Point estimates #### 1.1. 
Continuous variables <ul> <li>When both \(x\) and \(y\) are continuous, the <b>general</b> template for the <b>interpretation</b> of \(\hat{\beta}\) is:</li> </ul> <center><i>"Everything else equal, a 1 [unit] increase in [x] is associated with<br> an [in/de]crease of [beta] [units] in [y] on average."</i></center> <p style = "margin-bottom:1.6cm;"></p> -- <ul> <li>So in our case, the <b>adequate interpretation</b> would be:</li> </ul> <center><i>"Everything else equal, a <b>1 euro increase in annual household income</b> is associated with<br> an <b>increase of 0.002 grams in newborn birth weight</b> on average."</i></center> <p style = "margin-bottom:1.6cm;"></p> -- <ul> <li>But it would be even better to <b>interpret</b> the results for a <b>meaningful variation</b> of \(x\)</li> <ul> <li>For an annual household income, a <b>1 euro variation</b> is not really meaningful</li> <li>1 euro increase ➜ 0.002 gram increase \(\Leftrightarrow\) <b>1,000 euro increase</b> ➜ 2 gram increase</li> </ul> </ul> --- ### 1. Point estimates #### 1.1. Continuous variables <ul> <li>A common way to obtain a coefficient for a <b>meaningful variation</b> of \(x\) is to <b>standardize \(x\)</b></li> <ul> <li>If we divide \(x\) by \(\text{SD}(x)\), the 1 <b>unit increase</b> in \(\frac{x}{\text{SD}(x)}\) is equivalent to an <b>\(\text{SD}(x)\) increase</b> in \(x\)</li> <li>An \(\text{SD}(x)\) change in \(x\) is meaningful: it's low if \(x\) is very concentrated and high if \(x\) is highly spread out</li> </ul> </ul> -- <img src="slides_files/figure-html/unnamed-chunk-7-1.png" width="70%" style="display: block; margin: auto;" /> --- ### 1. Point estimates #### 1.1.
Continuous variables <ul> <li>A common way to obtain a coefficient for a <b>meaningful variation</b> of \(x\) is to <b>standardize \(x\)</b></li> <ul> <li>If we divide \(x\) by \(\text{SD}(x)\), the 1 <b>unit increase</b> in \(\frac{x}{\text{SD}(x)}\) is equivalent to an <b>\(\text{SD}(x)\) increase</b> in \(x\)</li> <li>An \(\text{SD}(x)\) change in \(x\) is meaningful: it's low if \(x\) is very concentrated and high if \(x\) is highly spread out</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-8-1.png" width="70%" style="display: block; margin: auto;" /> --- ### 1. Point estimates #### 1.1. Continuous variables <ul> <li>Note that if you <b>standardize both \(x\) and \(y\)</b>, the resulting \(\hat\beta\) equals the <b>correlation</b> between \(x\) and \(y\)</li> <ul> <li>To show that, let's first rewrite the formula of the beta coefficient:</li> </ul> </ul> `$$\hat{\beta} = \frac{\text{Cov}(x, \, y)}{\text{Var}(x)} = \frac{\text{Cov}(x, \, y)}{\text{SD}(x)\times\text{SD}(x)}$$` -- <p style = "margin-bottom:1.1cm;"></p> `$$\hat{\beta} = \frac{\text{Cov}(x, \, y)}{\text{SD}(x)\times\text{SD}(x)} \times \frac{\text{SD}(y)}{\text{SD}(y)}$$` -- <p style = "margin-bottom:1.1cm;"></p> `$$\hat{\beta} = \frac{\text{Cov}(x, \, y)}{\text{SD}(x)\times\text{SD}(y)} \times \frac{\text{SD}(y)}{\text{SD}(x)}$$` -- <p style = "margin-bottom:1.1cm;"></p> `$$\hat{\beta} = \text{Cor}(x, \, y) \times \frac{\text{SD}(y)}{\text{SD}(x)}$$` --- ### 1. Point estimates #### 1.1. 
Continuous variables <ul> <li>Starting with the previous expression, the <b>\(\hat\beta\) coefficient with the standardized variables</b> can be written as:</li> </ul> `$$\hat{\beta} = \frac{\text{Cov}\big(\frac{x}{\text{SD}(x)}, \, \frac{y}{\text{SD}(y)}\big)}{\text{SD}\big(\frac{x}{\text{SD}(x)}\big)\times\text{SD}\big(\frac{y}{\text{SD}(y)}\big)} \times \frac{\text{SD}\big(\frac{y}{\text{SD}(y)}\big)}{\text{SD}\big(\frac{x}{\text{SD}(x)}\big)}$$` -- <p style = "margin-bottom:1.2cm;"></p> <ul> <li>But by construction, the standard deviation of a standardized variable is 1:</li> </ul> .pull-left[ <p style = "margin-bottom:-.25cm;"></p> `$$\hat{\beta} = \frac{\text{Cov}\big(\frac{x}{\text{SD}(x)}, \, \frac{y}{\text{SD}(y)}\big)}{1\times1} \times \frac{1}{1}$$` `$$\hat{\beta} = \text{Cov}\bigg(\frac{x}{\text{SD}(x)}, \, \frac{y}{\text{SD}(y)}\bigg)$$` `$$\hat{\beta} = \frac{\text{Cov}(x, \, y)}{\text{SD}(x)\times\text{SD}(y)} = \text{Cor}(x, y)$$` ] .pull-right[ <ul> <li>See the cheatsheet on the properties of moments:</li> </ul> <center><a href = "https://louissirugue.github.io/metrics_on_R/cheatsheets/moments.pdf"><img src = "moments.png" width = "150"/></a> ] --- ### 1. Point estimates #### 1.2.
Discrete variables <ul> <li>Consider the following specification of the two variables:</li> <ul> <li>A categorical variable for <b>annual household income divided into terciles</b></li> <li>Still a continuous measure of <b>birth weight in grams</b></li> </ul> </ul> -- <p style = "margin-bottom:.8cm;"></p> `$$\text{Birth weight}_i = \alpha + \beta_1 \text{T2}_i + \beta_2 \text{T3}_i + \varepsilon_i$$` <p style = "margin-bottom:1.3cm;"></p> <ul> <li>Recall that when including a <b>categorical variable</b> in a regression, a <b>reference category</b> must be <b>omitted</b></li> </ul> -- ```r lm(birth_weight ~ income_tercile, data)$coefficients ``` ``` ## (Intercept) income_tercileT2 income_tercileT3 ## 3112.83162 88.24778 222.65414 ``` <p style = "margin-bottom:1.3cm;"></p> <center><b>➜ How would you interpret \(\hat{\beta}_1\) and \(\hat{\beta}_2\) here?</b></center> --- ### 1. Point estimates #### 1.2. Discrete variables <ul> <li>With a discrete \(x\), the interpretation of the coefficient must be <b>relative to the reference category:</b></li> </ul> <center><i>"Everything else equal, belonging to the [x category] is associated with<br> a [beta] [unit] [higher/lower] average [y] relative to the [reference category]."</i></center> <p style = "margin-bottom:1.5cm;"></p> -- <ul> <li>So in our case, the <b>adequate interpretations</b> would be:</li> </ul> <center><i>"Everything else equal, belonging to the <b>second income tercile</b> is associated with<br> an <b>88 gram higher average birth weight</b> relative to the <b>first income tercile.</b>"</i></center> <p style = "margin-bottom:.75cm;"></p> <center><i>"Everything else equal, belonging to the <b>third income tercile</b> is associated with<br> a <b>223 gram higher average birth weight</b> relative to the <b>first income tercile.</b>"</i></center> <p style = "margin-bottom:1.5cm;"></p> -- <ul> <li>And the <b>intercept</b> is the <b>average birth weight</b> for newborns to parents in the <b>first income
tercile</b></li> </ul> --- ### 1. Point estimates #### 1.2. Discrete variables <ul> <li>Consider now the following specification of the two variables:</li> <ul> <li>A continuous measure of <b>annual household income in euros</b></li> <li>A <b>binary variable</b> taking the value 1 if <b>the newborn is underweight</b> and 0 otherwise</li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> -- `$$\text{Underweight}_i = \alpha + \beta \times \text{Household income}_i + \varepsilon_i$$` <p style = "margin-bottom:1.25cm;"></p> ```r lm(underweight ~ household_income, data)$coefficients ``` ``` ## (Intercept) household_income ## 5.214013e-02 -4.084787e-07 ``` -- <p style = "margin-bottom:1.25cm;"></p> <center><b>➜ How would you interpret \(\hat{\beta}\) here?</b></center> <p style = "margin-bottom:.5cm;"></p> <center><b>➜ And would you consider its magnitude high?</b></center> --- ### 1. Point estimates #### 1.2. Discrete variables <ul> <li>With a <b>binary \(y\) variable</b>, the coefficient must be interpreted in <b>percentage points:</b></li> </ul> <center><i>"Everything else equal, a 1 [unit] increase in [x] is associated with<br> a [beta \(\times\) 100] percentage point [in/de]crease in the probability that [y equals 1] on average."</i></center> <p style = "margin-bottom:1.75cm;"></p> -- <ul> <li>So in our case, the <b>adequate interpretation</b> would be:</li> </ul> <center><i>"Everything else equal, a <b>1 euro increase in annual household income</b> is associated with<br> a <b>0.00004 percentage point decrease in the probability that the newborn is underweight</b> on average."</i></center> <p style = "margin-bottom:1.75cm;"></p> -- <ul> <li>Here the <b>interpretation</b> would be more <b>meaningful:</b></li> <ul> <li>For a <b>1,000 euro</b> increase ➜ 0.04 percentage point decrease</li> <li>Compared to the <b>typical probability</b> to have an underweight newborn</li> </ul> </ul> --- ### 1. Point estimates #### 1.2. 
Discrete variables <ul> <li>The mean of a dummy variable corresponds to the share of 1s:</li> </ul> ```r mean(data$underweight) ``` ``` ## [1] 0.037 ``` -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>We can also compute the probability that \(y = 1\) for the average \(x\) with our estimated coefficients:</li> </ul> ```r 5.214013e-02 + mean(data$household_income) * -4.084787e-07 # alpha + beta * mean(x) ``` ``` ## [1] 0.03700001 ``` -- <p style = "margin-bottom:1.5cm;"></p> <center>For the <b>average household</b>, a <b>1,000 euro increase</b> in annual income would be associated<br> with a <b>0.0004 / 0.037 \(\approx\) 1% decrease in the probability</b> that the newborn is <b>underweight</b></center> --- ### 1. Point estimates #### 1.2. Discrete variables <ul> <li>Finally, if <b>both</b> the \(y\) and the \(x\) variables are <b>discrete</b>, the coefficient must be interpreted:</li> <ul> <li>In <b>percentage points</b></li> <li><b>Relative</b> to the reference category</li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> `$$\text{Underweight}_i = \alpha + \beta_1 \text{T2}_i + \beta_2 \text{T3}_i + \varepsilon_i$$` <p style = "margin-bottom:1.25cm;"></p> -- ```r lm(underweight ~ income_tercile, data)$coefficients ``` ``` ## (Intercept) income_tercileT2 income_tercileT3 ## 0.07207207 -0.03903904 -0.06608405 ``` -- <p style = "margin-bottom:1.25cm;"></p> <center><i>"Everything else equal, belonging to the <b>second income tercile</b> is associated with<br> a <b>3.9 percentage point lower probability that the newborn is underweight</b><br> relative to the <b>first income tercile.</b>"</i></center> --- ### 1. Point estimates #### 1.3. Log vs.
level <ul> <li>Consider now the following hypothetical relationship:</li> </ul> .pull-left[ <p style = "margin-bottom:2.94cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-16-1.png" width="100%" style="display: block; margin: auto;" /> ] -- .pull-right[ <p style = "margin-bottom:1.1cm;"></p> <ul> <li>The slope tells us by how many <b>units</b> the \(y\) variable would increase for a <b>1 unit</b> increase in \(x\)</li> </ul> <p style = "margin-bottom:1.25cm;"></p> <ul> <li>But oftentimes in economics we're interested in the elasticity between the two variables:</li> <ul> <li>What is the expected <b>percentage change</b> in \(y\) for a <b>one percent increase</b> in \(x\)?</li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> <center><i>➜ The log transformation can be used<br> to easily get an approximation of that</i></center> ] --- ### 1. Point estimates #### 1.3. Log vs. level .pull-left[ * Instead of considering <p style = "margin-bottom:1cm;"></p> `$$y_i = \alpha_{lvl} + \beta_{lvl}x_i + \varepsilon_i$$` <p style = "margin-bottom:1.25cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-17-1.png" width="100%" style="display: block; margin: auto;" /> ] -- .pull-right[ * We consider <p style = "margin-bottom:1cm;"></p> `$$\log(y_i) = \alpha_{log} + \beta_{log}\log(x_i) + \varepsilon_i$$` <p style = "margin-bottom:1.25cm;"></p> <img src="slides_files/figure-html/unnamed-chunk-18-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ### 1. Point estimates #### 1.3. Log vs.
level .pull-left[ `\(\widehat{\beta_{lvl}} = 1.0121933\)` <img src="slides_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" /> `$$\begin{align}(15\div100) \times \widehat{\beta_{lvl}} &\approx (15\div100) \times 1.0121933 \\ &\approx 0.151829 \end{align}$$` ] -- .pull-right[ `\(\widehat{\beta_{log}} = 0.16875\)` <img src="slides_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" /> `$$\begin{align}0.151829 \div 90 &= 0.0016870\\ & \approx \beta_{log}\%\end{align}$$` ] --- ### 1. Point estimates #### 1.3. Log vs. level <ul> <li>Thus the interpretation differs depending on whether variables are in log or in level:</li> <ul> <li>When variables are in <b>level</b> we should interpret the coefficients in terms of <b>unit</b> increase</li> <li>When variables are in <b>log</b> we should interpret the coefficients in terms of <b>percentage</b> increase</li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> -- <table class="table table-hover table-condensed" style="width: auto !important; margin-left: auto; margin-right: auto;font-size: 20px;"> <caption>Interpretation of the regression coefficient</caption> <thead> <tr style = "background-color: #CCD5D9;"> <th style="text-align:center;"> </th> <th style="text-align:center;"> y </th> <th style="text-align:center;"> log(y) </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;font-weight: bold;"> x </td> <td style="text-align:center;width: 10em; "> \(\hat{\beta}\) is the unit increase in \(y\) associated with a 1 unit increase in \(x\) </td> <td style="text-align:center;width: 10em; "> \(\hat{\beta}\times 100\) is the % increase in \(y\) associated with a 1 unit increase in \(x\) </td> </tr> <tr style = "background-color: #CCD5D9;"> <td style="text-align:center;font-weight: bold;"> log(x) </td> <td style="text-align:center;width: 10em;"> \(\hat{\beta}\div 100\) is the unit increase in \(y\) associated with a 1% increase in \(x\) </td> <td style="text-align:center;width: 10em;"> \(\hat{\beta}\) is the % increase in \(y\) associated with a 1% increase in \(x\) </td> </tr> </tbody> </table> --- ### 1. Point estimates #### 1.3. Log vs. level <ul> <li>Let's give it a try with our example on household income and birth weight</li> <ul> <li>We've already seen that because income is log-normally distributed, it should be included in log</li> </ul> </ul> -- <img src="slides_files/figure-html/unnamed-chunk-24-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 1. Point estimates #### 1.3. Log vs. level <ul> <li>So what would be your interpretation of the slope estimated from the following regression?</li> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{Birth weight}_i = \alpha + \beta \log(\text{Household income}_i) + \varepsilon_i$$` ```r lm(birth_weight ~ log(household_income), data)$coefficients ``` ``` ## (Intercept) log(household_income) ## 2091.2323 112.3234 ``` <p style = "margin-bottom:1cm;"></p> -- <ul> <li>With a continuous <b>\(y\) in level</b> and a <b>logged \(x\)</b> variable, the template would be:</li> </ul> <center><i>"Everything else equal, a 1 percent increase in [x] is associated with a [beta/100] [unit] [in/de]crease in [y] on average."</i></center> -- <ul> <li>So in our case, the <b>adequate interpretation</b> would be:</li> </ul> <center><i>"Everything else equal, a <b>1 percent increase in annual household income</b> is associated with<br> a <b>1.12 gram increase in the birth weight of the newborn</b> on average."</i></center> --- <h3>Overview</h3> <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates ✔</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2.
Practice interpretation</b></li> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Regression tables</b></li> <ul style = "list-style: none"> <li>3.1. Layout</li> <li>3.2. Reported significance</li> <li>3.3. R squared</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates ✔</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Practice interpretation</b></li> </ul> ] --- class: inverse, hide-logo ### 2. Practice interpretation #### ➜ Let's practice coefficient interpretation with randomly generated relationships: <center><a href = "https://sirugue.shinyapps.io/lecture15/"><img src = "practice_interpretation.png" width = 900/></a></center> --- <h3>Overview</h3> <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates ✔</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Practice interpretation ✔</b></li> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Regression tables</b></li> <ul style = "list-style: none"> <li>3.1. Layout</li> <li>3.2. Reported significance</li> <li>3.3. R squared</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4.
Wrap up!</b></li></ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates ✔</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Practice interpretation ✔</b></li> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Regression tables</b></li> <ul style = "list-style: none"> <li>3.1. Layout</li> <li>3.2. Reported significance</li> <li>3.3. R squared</li> </ul> </ul> ] --- ### 3. Regression tables #### 3.1. Layout <ul> <li><b>So far</b> we've been <b>used to</b> regression results <b>displayed this way:</b></li> </ul> ```r lm(birth_weight ~ household_income, data)$coefficients ``` ``` ## (Intercept) household_income ## 3.134528e+03 2.213871e-03 ``` <p style = "margin-bottom:1cm;"></p> -- <ul> <li>Or with the more exhaustive <b>summary()</b> coefficients output:</li> </ul> ```r summary(lm(birth_weight ~ household_income, data))$coefficients ``` ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 3.134528e+03 1.656840e+01 189.187165 0.000000e+00 ## household_income 2.213871e-03 2.808507e-04 7.882732 8.355367e-15 ``` <p style = "margin-bottom:1cm;"></p> -- <center><i>➜ But in <b>formal</b> reports and academic papers, the <b>layout</b> of regression tables is <b>a bit different</b></i></center> --- ### 3. Regression tables #### 3.1. 
Layout <style type="text/css"> .remark-slide table{ font-size: 15px; margin: auto; border-top: 0px; border-bottom: 0px; } tr th td{ font-size: 15px; margin: auto; border-top: 0px; border-bottom: 0px; } .remark-slide thead, .remark-slide tfoot, .remark-slide tr:nth-child(even) { background: var(--background-color); } </style> .pull-left[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr> <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr> <tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(0.0003)</td><td>(0.0003)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-135.218<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>(34.838)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,134.528<sup>***</sup></td><td>3,246.365<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(16.568)</td><td>(34.257)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] -- .pull-right[ Regression tables often contain 
multiple regressions: <p style = "margin-bottom:1cm;"></p> <ul> <li>With <b>one regression in each column</b></li> <ul> <li>Regression models are numbered</li> <li>Dependent variable mentioned above</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul> <li>And one variable in <b>each row</b></li> <ul> <li>With the <b>point estimate</b></li> <li>And a <b>precision measure</b> below</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul> <li><b>General info</b> on each model <b>at the bottom</b></li> </ul> <p style = "margin-bottom:1cm;"></p> <ul> <li>A <b>symbology</b> for the <b>p-value</b> testing whether the coefficient is significantly different from 0 or not</li> </ul> ] --- ### 3. Regression tables #### 3.1. Layout .pull-left[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr> <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr> <tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(0.0003)</td><td>(0.0003)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-135.218<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>(34.838)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,134.528<sup>***</sup></td><td>3,246.365<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(16.568)</td><td>(34.257)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" 
style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] .pull-right[ It makes it easy to compare the different models: <p style = "margin-bottom:1cm;"></p> <ul> <li>We can <b>add controls progressively</b></li> <ul> <li>Check the <b>stability</b> of the main <b>coefficient</b></li> </ul> </ul> <p style = "margin-bottom:1.5cm;"></p> <center><i>➜ If it gets significantly closer to 0 it might indicate that the raw relationship was fallaciously driven by a confounding factor</i></center> <p style = "margin-bottom:1.5cm;"></p> <ul> <li>And <b>compare general statistics</b></li> <ul> <li>N is lower in the second regression</li> <li>It means that there are missing values</li> <li>Could this induce a selection bias?</li> </ul> </ul> ] --- ### 3. Regression tables #### 3.2. 
Reported significance .pull-left[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr> <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr> <tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(0.0003)</td><td>(0.0003)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-135.218<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>(34.838)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,134.528<sup>***</sup></td><td>3,246.365<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(16.568)</td><td>(34.257)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] .pull-right[ It makes it easy to compare the different models: <p style = "margin-bottom:1cm;"></p> <ul> <li>The <b>evolution</b> of the <b>significance</b> matters as well</li> <ul> <li>The main coefficient should stay significant</li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> <ul> <li>But don't rely too much on the <b>symbology</b></li> 
<ul> <li>Thresholds are <b>not always the same</b></li> <li><b>Sometimes</b> there are <b>none</b></li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Instead, keep in mind this <b>rule of thumb:</b></li> </ul> <p style = "margin-bottom:1.25cm;"></p> <center><i><b>➜ A coefficient \(\approx\) twice as large as its standard error has a p-value of \(\approx\) 5%</b></i></center> ] --- ### 3. Regression tables #### 3.2. Reported significance <ul> <li>Remember the formula for the <b>confidence interval:</b></li> <ul> <li>We can <b>fix</b> the <b>confidence level</b> \(1 - \alpha\) to 95% and check <b>how \(t\) varies with \(\text{df}\)</b></li> </ul> </ul> `$$\hat{\beta}\pm t(\text{df})_{1-\frac{\alpha}{2}}\times\text{se}(\hat{\beta})$$` <p style = "margin-bottom:-1cm;"></p> -- <img src="slides_files/figure-html/unnamed-chunk-32-1.png" width="67%" style="display: block; margin: auto;" /> --- ### 3. Regression tables #### 3.2. Reported significance <ul> <li><b>As soon as</b> you have <b>about 20 observations more than</b> you have <b>parameters</b> to estimate:</li> <ul> <li>The <b>\(t\) value</b> gets very <b>close to 2</b></li> <li>And as \(\text{df}\) increases it quickly converges to \(\approx\) 2</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> -- <ul> <li>The coefficient is statistically significant if the lower bound of its (absolute) confidence interval is larger than 0</li> <ul> <li>Which is an easy calculation if we <b>approximate the \(t\) value by 2</b></li> <li>A reasonable approximation for a back-of-the-envelope calculation unless there are very few observations</li> </ul> </ul> -- .pull-left[ <ul> <li>The <i>(absolute)</i> lower bound of the CI is:</li> </ul> `$$|\hat{\beta}| - t(\text{df})_{1-\frac{\alpha}{2}}\times\text{se}(\hat{\beta})$$` `$$|\hat{\beta}| - 2\times\text{se}(\hat{\beta}) > 0$$` `$$|\hat{\beta}| > 2\times\text{se}(\hat{\beta})$$` ] -- .pull-right[ <p style = "margin-bottom:1cm;"></p> <center><i>So if the
<b>coefficient</b> is clearly more than <b>twice as large</b> as its <b>standard error</b>, it must be <b>statistically significant</b> at the <b>5% significance</b> level</i></center> <p style = "margin-bottom:1cm;"></p> <center>➜ But sometimes the p-value or the confidence interval is reported instead of the standard error</center> ] --- ### 3. Regression tables #### 3.2. Reported significance .pull-left[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr> <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr> <tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>p = 0.000</td><td>p = 0.000</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-135.218<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>p = 0.0002</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,134.528<sup>***</sup></td><td>3,246.365<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>p = 0.000</td><td>p = 0.000</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05;
<sup>***</sup>p<0.01</td></tr> </table> ] .pull-right[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr> <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr> <tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(0.002, 0.003)</td><td>(0.002, 0.003)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-135.218<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>(-203.500, -66.936)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,134.528<sup>***</sup></td><td>3,246.365<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(3,102.055, 3,167.002)</td><td>(3,179.223, 3,313.507)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] --- ### 3. Regression tables #### 3.3. 
R squared <ul> <li>In <b>regression tables</b>, the <b>R<sup>2</sup></b> of the model is <b>always reported</b> below the number of observations</li> <ul> <li>The R<sup>2</sup> captures how well the <b>model fits the data</b></li> <li></li> </ul> </ul> --- ### 3. Regression tables #### 3.3. R squared <ul> <li>In <b>regression tables</b>, the <b>R<sup>2</sup></b> of the model is <b>always reported</b> below the number of observations</li> <ul> <li>The R<sup>2</sup> captures how well the <b>model fits the data</b></li> <li>The model has a <b>good fit (high R<sup>2</sup>)</b> on dataset A but a <b>poor fit (low R<sup>2</sup>)</b> on dataset B</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-35-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 3. Regression tables #### 3.3. R squared <ul> <li>The <b>standard error</b> already gives an idea of the goodness of fit, but it is expressed in the <b>same unit as \(y\)</b></li> <ul> <li>So we <b>cannot compare</b> two different models based on that statistic</li> <li></li> </ul> </ul> --- ### 3. Regression tables #### 3.3. R squared <ul> <li>The <b>standard error</b> already gives an idea of the goodness of fit, but it is expressed in the <b>same unit as \(y\)</b></li> <ul> <li>So we <b>cannot compare</b> two different models based on that statistic</li> <li>The standard error of the slope would be larger on dataset A than on dataset B</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-36-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 3. Regression tables #### 3.3. R squared <ul> <li>The <b>R<sup>2</sup></b> captures the <b>goodness of fit</b> as the <b>percentage</b> of the \(y\) variation captured by the model, from:</li> <ul> <li></li> <li></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-37-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 3. Regression tables #### 3.3.
R squared <ul> <li>The <b>R<sup>2</sup></b> captures the <b>goodness of fit</b> as the <b>percentage</b> of the \(y\) variation captured by the model, from:</li> <ul> <li>The <b>total variation</b> of the y variable (proportional to its variance: \(\sum_{i = 1}^n(y_i-\bar{y})^2\))</li> <li></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-38-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 3. Regression tables #### 3.3. R squared <ul> <li>The <b>R<sup>2</sup></b> captures the <b>goodness of fit</b> as the <b>percentage</b> of the \(y\) variation captured by the model, from:</li> <ul> <li>The <b>total variation</b> of the y variable (proportional to its variance: \(\sum_{i = 1}^n(y_i-\bar{y})^2\))</li> <li>The <b>remaining variation</b> of the y variable once it's modeled (the sum of squared residuals \(\sum_{i = 1}^n\hat{\varepsilon_i}^2\))</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-39-1.png" width="75%" style="display: block; margin: auto;" /> --- ### 3. Regression tables #### 3.3. R squared <ul> <li>We can then obtain a proper formula from the following reasoning</li> </ul> <p style = "margin-bottom:1cm;"></p> $$\text{Total variation} = \text{Explained variation} + \text{Remaining variation} $$ -- <p style = "margin-bottom:1cm;"></p> `$$\frac{\text{Explained variation}}{\text{Total variation}} = 1 - \frac{\text{Remaining variation}}{\text{Total variation}}$$` -- <p style = "margin-bottom:1cm;"></p> `$$\frac{\text{Explained variation}}{\text{Total variation}} = 1 - \frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{\sum_{i = 1}^n(y_i-\bar{y})^2} \equiv \text{R}^2$$` <p style = "margin-bottom:1.25cm;"></p> <ul> <li>Because all the terms are sums of squares, we usually talk about:</li> <ul> <li><b>Total Sum of Squares</b> (TSS)</li> <li><b>Explained Sum of Squares</b> (ESS)</li> <li><b>Residual Sum of Squares</b> (RSS)</li> </ul> </ul> --- ### 3. Regression tables #### 3.3.
R squared <ul> <li>Note that the <b>TSS</b> is proportional to the <b>variance of \(y\):</b></li> <ul> <li>So the <b>R<sup>2</sup></b> is interpreted as the <b>share of the variance of \(y\)</b> which is <b>explained</b> by the model</li> <li>And as such, the R<sup>2</sup> is always <b>between 0 and 1</b></li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> `$$\text{R}^2 = 1 - \frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{\sum_{i = 1}^n(y_i-\bar{y})^2} = \frac{\text{Explained variation}}{\text{Total variation}}$$` -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>An undesirable property of the <b>R<sup>2</sup></b> is that it <b>mechanically increases</b> with the number of <b>explanatory variables</b></li> <ul> <li>Such that with many variables the R<sup>2</sup> tends to overestimate the goodness of fit</li> <li>This is why you will sometimes see some <b>Adjusted R<sup>2</sup></b></li> </ul> </ul> -- <p style = "margin-bottom:1cm;"></p> `$$\text{Adjusted R}^2 = 1 - \frac{(1 - \text{R}^2)(n-1)}{n - \#\text{parameters}}$$` --- <h3>Overview</h3> <p style = "margin-bottom:4cm;"></p> .pull-left[ <ul style = "margin-left:1.5cm;list-style: none"> <li><b>1. Point estimates ✔</b></li> <ul style = "list-style: none"> <li>1.1. Continuous variables</li> <li>1.2. Discrete variables</li> <li>1.3. Log vs. level</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:1.5cm;list-style: none"> <li><b>2. Practice interpretation ✔</b></li> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Regression tables ✔</b></li> <ul style = "list-style: none"> <li>3.1. Layout</li> <li>3.2. Reported significance</li> <li>3.3. R squared</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- ### 4. Wrap up!
#### Standard interpretations <ul> <li>When both \(x\) and \(y\) are continuous, the <b>general</b> template for the <b>interpretation</b> of \(\hat{\beta}\) is:</li> </ul> <center><i>"Everything else equal, a 1 [unit] increase in [x] is associated with<br> an [in/de]crease of [beta] [units] in [y] on average."</i></center> -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>With a discrete \(x\), the interpretation of the coefficient must be <b>relative to the reference category:</b></li> </ul> <center><i>"Everything else equal, belonging to the [x category] is associated with<br> a [beta] [unit] [higher/lower] average [y] relative to the [reference category]."</i></center> -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>With a <b>binary \(y\) variable</b>, the coefficient must be interpreted in <b>percentage points:</b></li> </ul> <center><i>"Everything else equal, a 1 [unit] increase in [x] is associated with<br> a [beta \(\times\) 100] percentage point [in/de]crease in the probability that [y equals 1] on average."</i></center> --- ### 4. Wrap up! 
#### Interpretations with variable transformation <p style = "margin-bottom:1.25cm;"></p> .pull-left[ <center><b>Standardization</b></center> <ul> <li>To standardize a variable is to <b>subtract its mean and divide it by its SD</b></li> <ul> <li>The variation of a standardized variable should not be <b>interpreted</b> in units but <b>in SD</b></li> <li>For instance if \(x\) and \(y\) are continuous and \(x\) is standardized, the interpretation becomes:</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <center><i>"Everything else equal, a 1 <b>standard deviation</b> increase in [x] is associated with an [in/de]crease of [beta] [units] in [y] on average."</i></center> <p style = "margin-bottom:1cm;"></p> <ul> <li>If both \(x\) and \(y\) are standardized, the slope is the correlation coefficient between \(x\) and \(y\)</li> </ul> ] -- .pull-right[ <center><b>Log-transformation</b></center> <ul> <li>The log transformation allows the coefficient to be interpreted in percentages:</li> </ul> <p style = "margin-bottom:1.25cm;"></p> <table class="table table-hover table-condensed" style="width: auto !important; margin-left: auto; margin-right: auto;font-size: 20px;"> <caption>Interpretation of the regression coefficient</caption> <thead> <tr style = "background-color: #CCD5D9;"> <th style="text-align:center;"> </th> <th style="text-align:center;"> y </th> <th style="text-align:center;"> log(y) </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;font-weight: bold;"> x </td> <td style="text-align:center;width: 10em; "> \(\hat{\beta}\) is the unit increase in \(y\) due to a 1 unit increase in \(x\) </td> <td style="text-align:center;width: 10em; "> \(\hat{\beta}\times 100\) is the % increase in \(y\) due to a 1 unit increase in \(x\) </td> </tr> <tr style = "background-color: #CCD5D9;"> <td style="text-align:center;font-weight: bold;"> log(x) </td> <td style="text-align:center;width: 10em;"> \(\hat{\beta}\div 100\) is the unit increase in \(y\) due to a 1% increase in \(x\) </td> <td 
style="text-align:center;width: 10em;"> \(\hat{\beta}\) is the % increase in \(y\) due to a 1% increase in \(x\) </td> </tr> </tbody> </table> ] --- ### 4. Wrap up! #### Regression table layout .pull-left[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(0.0003)</td><td>(0.0003)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-135.218<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>(34.838)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,134.528<sup>***</sup></td><td>3,246.365<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(16.568)</td><td>(34.257)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td style="text-align:left">R<sup>2</sup></td><td>0.059</td><td>0.074</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] .pull-right[ Regression tables often contain multiple regressions: <p style = "margin-bottom:.8cm;"></p> <ul> <li>With <b>one regression in each column</b></li> </ul> <p style = "margin-bottom:.8cm;"></p> <ul> <li>And one variable in <b>each row</b></li> <ul> <li>With 
the <b>point estimate</b></li> <li>And a <b>precision measure</b> below</li> </ul> </ul> <p style = "margin-bottom:.8cm;"></p> <ul> <li><b>General info</b> on each model <b>at the bottom</b></li> <ul> <li>Number of observations</li> <li>\(\text{R}^2 = 1 - \frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{\sum_{i = 1}^n(y_i-\bar{y})^2}\)</li> </ul> </ul> <p style = "margin-bottom:.8cm;"></p> <ul> <li>A <b>symbology</b> for the <b>p-value</b> of the test of whether the coefficient is significantly different from 0</li> </ul> ]
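
The quantities a regression table reports (the point estimate, its standard error, the R<sup>2</sup> and adjusted R<sup>2</sup>, and the twice-the-standard-error rule of thumb) can all be computed by hand. Below is a minimal sketch for a simple regression with one explanatory variable, written in Python rather than the R toolchain behind these slides; the `ols_summary` helper and the toy data are hypothetical, purely for illustration:

```python
import math

def ols_summary(x, y):
    """Simple OLS of y on x: slope, its standard error, R2 and adjusted R2."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in residuals)        # Residual Sum of Squares
    tss = sum((yi - ybar) ** 2 for yi in y)     # Total Sum of Squares
    r2 = 1 - rss / tss                          # share of the variation of y explained
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)   # 2 estimated parameters here
    se_slope = math.sqrt(rss / (n - 2) / sxx)   # conventional SE of the slope
    return {"slope": slope, "se": se_slope, "r2": r2, "adj_r2": adj_r2}

# Hypothetical toy data with a clear linear trend plus a little noise
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.0]
res = ols_summary(x, y)

# Rule of thumb: a coefficient clearly more than twice its standard error
# is statistically significant at (roughly) the 5% level
print(res["slope"] > 2 * res["se"])
```

With many observations, as in the table above (n = 1,000), the degrees-of-freedom correction barely matters and the critical t value is essentially 2, which is what makes this back-of-the-envelope check reliable.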