class: center, middle, inverse, title-slide # Applications in academic research ## Lecture 13 ###
Louis SIRUGUE ### CPES 2 - Fall 2022 --- <style> .left-column {width: 65%;} .right-column {width: 35%;} </style> ### Quick reminder #### Standard interpretations <ul> <li>When both \(x\) and \(y\) are continuous, the <b>general</b> template for the <b>interpretation</b> of \(\hat{\beta}\) is:</li> </ul> <center><i>"Everything else equal, a 1 [unit] increase in [x] is associated with<br> an [in/de]crease of [beta] [units] in [y] on average."</i></center> -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>With a discrete \(x\), the interpretation of the coefficient must be <b>relative to the reference category:</b></li> </ul> <center><i>"Everything else equal, belonging to the [x category] is associated with<br> a [beta] [unit] [higher/lower] average [y] relative to the [reference category]."</i></center> -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>With a <b>binary \(y\) variable</b>, the coefficient must be interpreted in <b>percentage points:</b></li> </ul> <center><i>"Everything else equal, a 1 [unit] increase in [x] is associated with<br> a [beta \(\times\) 100] percentage point [in/de]crease in the probability that [y equals 1] on average."</i></center> --- ### Quick reminder #### Interpretations with variable transformation <p style = "margin-bottom:1.25cm;"></p> .pull-left[ <center><b>Standardization</b></center> <ul> <li>To standardize a variable is to <b>divide it by its SD</b></li> <ul> <li>The variation of a standardized variable should not be <b>interpreted</b> in units but <b>in SD</b></li> <li>For instance if \(x\) and \(y\) are continuous and \(x\) is standardized, the interpretation becomes:</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <center><i>"Everything else equal, a 1 <b>standard deviation</b> increase in [x] is associated with an [in/de]crease of [beta] [units] in [y] on average."</i></center> <p style = "margin-bottom:1cm;"></p> <ul> <li>If both \(x\) and \(y\) are standardized, the slope is the correlation coefficient between 
\(x\) and \(y\)</li> </ul> ] -- .pull-right[ <center><b>Log-transformation</b></center> <ul> <li>The log transformation allows to interpret the coefficient in percentage terms:</li> </ul> <p style = "margin-bottom:1.25cm;"></p> <table class="table table-hover table-condensed" style="width: auto !important; margin-left: auto; margin-right: auto;font-size: 20px;"> <caption>Interpretation of the regression coefficient</caption> <thead> <tr style = "background-color: #CCD5D9;"> <th style="text-align:center;"> </th> <th style="text-align:center;"> y </th> <th style="text-align:center;"> log(y) </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;font-weight: bold;"> x </td> <td style="text-align:center;width: 10em; "> \(\hat{\beta}\) is the unit increase in \(y\) due to a 1 unit increase in \(x\) </td> <td style="text-align:center;width: 10em; "> \(\hat{\beta}\times 100\) is the % increase in \(y\) due to a 1 unit increase in \(x\) </td> </tr> <tr style = "background-color: #CCD5D9;"> <td style="text-align:center;font-weight: bold;"> log(x) </td> <td style="text-align:center;width: 10em;"> \(\hat{\beta}\div 100\) is the unit increase in \(y\) due to a 1% increase in \(x\) </td> <td style="text-align:center;width: 10em;"> \(\hat{\beta}\) is the % increase in \(y\) due to a 1% increase in \(x\) </td> </tr> </tbody> </table> ] --- ### Quick reminder #### Regression table layout <style type="text/css"> .remark-slide table{ font-size: 15px; margin: auto; border-top: 0px; border-bottom: 0px; } tr th td{ font-size: 15px; margin: auto; border-top: 0px; border-bottom: 0px; } .remark-slide thead, .remark-slide tfoot, .remark-slide tr:nth-child(even) { background: var(--background-color); } </style> .pull-left[ <table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2">Birth weight</td></tr> <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr> <tr><td colspan="3" 
style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Household income</td><td>0.002<sup>***</sup></td><td>0.002<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(0.0003)</td><td>(0.0003)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Girl (ref: Boy)</td><td></td><td>-141.943<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td></td><td>(34.878)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td style="text-align:left">Constant</td><td>3,127.146<sup>***</sup></td><td>3,247.126<sup>***</sup></td></tr> <tr><td style="text-align:left"></td><td>(16.188)</td><td>(33.520)</td></tr> <tr><td style="text-align:left"></td><td></td><td></td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,000</td><td>963</td></tr> <tr><td style="text-align:left">R<sup>2</sup></td><td>0.047</td><td>0.063</td></tr> <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] .pull-right[ Regression tables often contain multiple regressions: <p style = "margin-bottom:.8cm;"></p> <ul> <li>With <b>one regression in each column</b></li> </ul> <p style = "margin-bottom:.8cm;"></p> <ul> <li>And one variable in <b>each row</b></li> <ul> <li>With the <b>point estimate</b></li> <li>And a <b>precision measure</b> below</li> </ul> </ul> <p style = "margin-bottom:.8cm;"></p> <ul> <li><b>General info</b> on each model <b>at the bottom</b></li> <ul> <li>Number of observations</li> <li>\(\text{R}^2 = 1 - \frac{\sum_{i = 1}^n\hat{\varepsilon_i}^2}{\sum_{i = 1}^n(y_i-\bar{y})^2}\)</li> </ul> </ul> <p style = "margin-bottom:.8cm;"></p> <ul> <li>A <b>symbology</b> for the <b>p-value</b> testing whether the coefficient 
is significantly different from 0 or not</li> </ul> ] --- <h3>Today: Applications in academic research</h3> -- <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015)</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-.5cm;list-style: none"> <li><b>2. Correlational approach (Chetty et al., 2014)</b></li> <ul style = "list-style: none"> <li>2.1. Empirical approach</li> <li>2.2. National results</li> <li>2.3. Spatial variations</li> <li>2.4. Correlational analysis</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Structural approach (Nerlove, 1963)</b></li> <ul style = "list-style: none"> <li>3.1. Motivation</li> <li>3.2. Theoretical modeling</li> <li>3.3. Regression expression</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Today: Applications in academic research</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015)</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> ] --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.1. Structure * Research papers always start with an <b>abstract</b> that briefly <b>describes the study:</b> -- <p style = "margin-bottom:.75cm;"></p> <center><img src = "behaghel_abstract.png" width = "650"/></center> --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.1. 
Structure <p style = "margin-bottom:-.5em;"></p> .pull-left[ <b>Typical structure</b> of an empirical research paper: <p style = "margin-bottom:2em;"></p> <ul> <li>Introduction/literature</li> <p style = "margin-bottom:.5cm;"></p> <li>Data/Descriptive statistics</li> <p style = "margin-bottom:.5cm;"></p> <li>Empirical framework</li> <p style = "margin-bottom:.5cm;"></p> <li>Results</li> <p style = "margin-bottom:.5cm;"></p> <li>(Heterogeneity)</li> <p style = "margin-bottom:.5cm;"></p> <li>Robustness checks</li> <p style = "margin-bottom:.5cm;"></p> <li>Conclusion</li> </ul> ] -- .pull-right[ <b>Structure of Behaghel et al. (2015)</b> is as follows: <ul> <li>Introduction</li> <li>Institutional Background</li> <li>Experiment and Data Collection</li> <ul> <li>Program and Experimental Design</li> <li>Data Collection</li> </ul> <li>Impact of Anonymous Résumés</li> <ul> <li>Interview Rates</li> <li>Hiring Rates</li> <li>Recruitment Success</li> <li>Robustness Checks</li> </ul> <li>Mechanisms</li> <ul> <li>Firms’ Participation Decision</li> <li>Résumé Valuation by Participating Firms</li> </ul> <li>Conclusion</li> </ul> ] --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.1. Structure <p style = "margin-bottom:2em;"></p> <center><h4>Program and Experimental Design</h4></center> <p style = "margin-bottom:2em;"></p> <ol> <li><b>Firm entry in the program:</b> Firms with more than 50 employees posting vacancies lasting at least three months at the public employment service (PES) were invited to enter the program, which gives them a 50% chance of receiving anonymized rather than standard resumes for that vacancy.</li> <p style = "margin-bottom:.5cm;"></p> <li><b>Matching of resumes with vacancies:</b> The PES posts the vacancy on a variety of media, including a public website asking interested job seekers to apply through the PES branch. The PES agent selects resumes from these applicants and from internal databases of job seekers.</li> <p style = "margin-bottom:.5cm;"></p> <li><b>Randomization and anonymization:</b> Resumes are randomly anonymized or not with a 50% probability and sent to the employer.</li> <p style = "margin-bottom:.5cm;"></p> <li><b>Selection of resumes by the employer:</b> The employer selects the resumes of applicants she would like to interview and contacts them (through the PES if resumes are anonymized).</li> </ol> --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.2. Data <p style = "margin-bottom:2em;"></p> <center><h4>Data sources</h4></center> <p style = "margin-bottom:2em;"></p> <ol> <li><b>Administrative data</b></li> <ul> <li><b>Coverage:</b> All firms and all job seekers who used the public employment services in the experimental areas during (and after) the program</li> <li><b>Content:</b> information on the firm (size, sector), on the job position offered (occupation level, type of contract) and limited information on candidates (unless the candidate is filed as unemployed)</li> </ul> <p style = "margin-bottom:.5cm;"></p> <li><b>Telephone interviews:</b></li> <ul> <li><b>Coverage:</b> All firms entering the program, a subsample of firms that declined to participate, subsamples of applicants to vacancies posted by these two groups of firms both during and after the experiment</li> <li><b>Content:</b> additional characteristics of the vacancy and of the recruiter (characteristics that could be associated with a differential treatment of candidates), questions on the result of the recruitment (time to hiring and match quality)</li> </ul> </ol> --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.2.
Data <p style = "margin-bottom:2em;"></p> <center><b>Sample description</b></center> <p style = "margin-bottom:1em;"></p> .left-column[ <ul> <li><b>1,005 firms entered the program (608 declined):</b></li> <ul> <li>385 firms in the control group</li> <li>366 firms in the treatment group</li> <li>254 firms not allocated because canceled or job filled too early</li> </ul> <p style = "margin-bottom:.5cm;"></p> <li><b>Sample of 1,268 applicants:</b></li> <ul> <li>660 to vacancies from the control group</li> <li>608 to vacancies from the treatment group</li> <li>203 to vacancies from firms that withdrew before randomization</li> </ul> <p style = "margin-bottom:.5cm;"></p> <li><b>Main variables:</b></li> <ul> <li>Whether the candidate is from the minority or the majority</li> <li>Whether the resume was anonymized</li> <li>Whether the employer called back for an interview</li> </ul> </ul> ] .right-column[ <p style = "margin-bottom:2.5em;"></p> <ul> <li><b>Authors use sampling weights:</b></li> <ul> <li>Representativeness of the sample</li> <li>Non-response bias correction</li> <li>The weight associated with an individual can be viewed as the number of individuals she represents</li> </ul> </ul> ] --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.2. Data * Import the data

```r
library(haven)
data_rct <- read_dta("data_candidates_mainsample.dta")
View(data_rct)
```

<p style = "margin-bottom:1.5em;"></p> -- <center><img src = "data_rct.png" width = "1100"/></center> --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.2.
Data * Subset the data

```r
data_rct <- data_rct %>%
  filter(!is.na(CVA)) %>%        # Keep participating firms
  select(treatment = CVA,        # Select and rename variables
         minority = ZouI,        # of interest
         interview = ENTRETIEN,
         weight = POIDS_SEL)

head(data_rct, 5)
```

-- .pull-left[

```
## # A tibble: 5 x 4
##   treatment minority interview weight
##       <dbl>    <dbl>     <dbl>  <dbl>
## 1         1        0         0   5.35
## 2         1        1         0   5.35
## 3         0        0         0   2.68
## 4         0        1         0   2.68
## 5         0        0         0   5.35
```

] -- .pull-right[ <p style = "margin-bottom:3.5em;"></p> <center><i>➜ We want to know whether anonymizing resumes helped to reduce labor market discrimination toward the minority group</i></center> ] --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.3. Analysis <p style = "margin-bottom:1.5em;"></p> <ul> <li>The authors use the following <b>notation</b></li> <ul> <li>\(An\) indicates whether the resume is <b>anonymous</b></li> <li>\(D\) indicates whether the candidate is from the <b>minority</b></li> <li>\(Y\) indicates whether the candidate obtained an <b>interview</b></li> </ul> </ul> -- <p style = "margin-bottom:2.75em;"></p> <ul> <li>The <b>parameter of interest</b> is then written:</li> </ul> `$$\delta = \underbrace{(\overline{Y}^{An = 1, D = 1} - \overline{Y}^{An = 1, D = 0})}_{\substack{\text{Difference in interview rates}\\ \text{between the majority and the minority}\\ \text{when resumes are anonymized}}} - \underbrace{(\overline{Y}^{An = 0, D = 1} - \overline{Y}^{An = 0, D = 0})}_{\substack{\text{Difference in interview rates}\\ \text{between the majority and the minority}\\ \text{when resumes are } \underline{\text{not}} \text{ anonymized}}}$$` -- <p style = "margin-bottom:2.75em;"></p> <center><b>➜ What sign do you expect for \(\delta\)?</b></center> --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.3.
Analysis ```r means <- data_rct %>% group_by(treatment, minority) %>% summarise(means = weighted.mean(interview, weight)) ``` -- <p style = "margin-bottom:-.75em;"></p> <table class="table table-hover table-condensed" style="width: auto !important; margin-left: auto; margin-right: auto;"> <caption></caption> <thead> <tr> <th style="text-align:right;"> treatment </th> <th style="text-align:right;"> minority </th> <th style="text-align:right;"> means </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0.12 </td> </tr> <tr> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.09 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0.18 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.05 </td> </tr> </tbody> </table> -- <p style = "margin-bottom:.8em;"></p> .pull-left[ ```r means <- means %>% group_by(treatment) %>% summarise(discrim = means[2] - means[1]) ``` <p style = "margin-bottom:-.75em;"></p> <table class="table table-hover table-condensed" style="width: auto !important; margin-left: auto; margin-right: auto;"> <caption></caption> <thead> <tr> <th style="text-align:right;"> treatment </th> <th style="text-align:right;"> discrim </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> -0.02 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.13 </td> </tr> </tbody> </table> ] -- .pull-right[ ```r means$discrim[2] - means$discrim[1] ``` ``` ## [1] -0.1067092 ``` <p style = "margin-bottom:1.25em;"></p> <center><i>The interview rate of the minority is even lower than the majority in the treatment group</i></center> ] --- class: inverse, hide-logo ### Practice #### 1) Estimate this 
parameter of interest using a regression *Hint: To apply weights in a regression you can indicate the weighting variable in the* `weights` *argument*

```r
lm(y ~ x1 + x2 + ..., data, weights = )
```

-- * Reminder:

```r
library(tidyverse)
library(haven)

data_rct <- read_dta("data_candidates_mainsample.dta") %>%  # read .dta data
  filter(!is.na(CVA)) %>%                          # Keep participating firms
  rename(treatment = CVA, minority = ZouI,         # Rename variables of interest
         interview = ENTRETIEN, weight = POIDS_SEL) %>%
  select(treatment, minority, interview, weight)   # Select variables of interest
```

<p style = "margin-bottom:1em;"></p> `$$\delta = \underbrace{(\overline{Y}^{An = 1, D = 1} - \overline{Y}^{An = 1, D = 0})}_{\substack{\text{Difference in interview rates}\\ \text{between the majority and the minority}\\ \text{when resumes are anonymized}}} - \underbrace{(\overline{Y}^{An = 0, D = 1} - \overline{Y}^{An = 0, D = 0})}_{\substack{\text{Difference in interview rates}\\ \text{between the majority and the minority}\\ \text{when resumes are } \underline{\text{not}} \text{ anonymized}}}$$` --
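Before turning to the solution, here is a minimal sketch on simulated data of why a weighted regression with an interaction term recovers this parameter (all names below are invented for illustration, not taken from the paper's dataset): in a saturated weighted regression, the coefficient on the interaction equals the difference-in-differences of the weighted group means.

```r
# Minimal sketch on simulated data (illustrative names only): the interaction
# coefficient of a saturated weighted regression equals the
# difference-in-differences of weighted group means
set.seed(1)
n   <- 1000
sim <- data.frame(treatment = rbinom(n, 1, .5),
                  minority  = rbinom(n, 1, .5),
                  weight    = runif(n, 1, 5))
sim$interview <- rbinom(n, 1, .12 + .05 * sim$treatment - .02 * sim$minority -
                              .08 * sim$treatment * sim$minority)

# Difference-in-differences of weighted interview rates
wm  <- function(d) weighted.mean(d$interview, d$weight)
did <- (wm(subset(sim, treatment == 1 & minority == 1)) -
        wm(subset(sim, treatment == 1 & minority == 0))) -
       (wm(subset(sim, treatment == 0 & minority == 1)) -
        wm(subset(sim, treatment == 0 & minority == 0)))

# The same quantity from a weighted regression with an interaction
fit <- lm(interview ~ minority * treatment, data = sim, weights = weight)
all.equal(unname(coef(fit)["minority:treatment"]), did)  # TRUE
```

The `weights` argument makes `lm()` minimize the weighted sum of squared residuals, so each weighted cell mean is fitted exactly when the model is saturated.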
--- class: inverse, hide-logo ### Solution <ul> <li>We want to see how the effect of the minority variable varies with the treatment variable</li> <ul> <li>In the regression framework, this is what interactions allow us to capture</li> </ul> </ul> -- <p style = "margin-bottom:1cm;"></p> `$$Y_i = \alpha + \beta D_i +\gamma An_i + \delta D_i\times An_i + \varepsilon_i$$` --

```r
summary(lm(interview ~ minority + treatment + minority*treatment,
           data_rct, weights = weight))$coefficients
```

```
##                       Estimate  Std. Error    t value     Pr(>|t|)
## (Intercept)         0.11638530  0.01630149  7.1395491 1.575140e-12
## minority           -0.02365790  0.02368346 -0.9989208 3.180243e-01
## treatment           0.06101349  0.02419977  2.5212424 1.181630e-02
## minority:treatment -0.10670915  0.03479712 -3.0666092 2.210982e-03
```

-- <p style = "margin-bottom:1cm;"></p> <ul> <ul> <li>\(\alpha\) is the interview rate for individuals in both reference groups (majority/control)</li> <li>\(\beta\) is the difference in means between the minority and the majority in the control group</li> <li>\(\gamma\) is the difference in means between the treatment and the control group for the majority group</li> <li>\(\delta\) is how this difference in means between the minority and the majority differs between the treatment and the control group</li> </ul> </ul> --- ### 1. Causal approach (Behaghel et al., 2015) #### 1.3.
Analysis <ul> <li>Why is the effect negative?</li> </ul> -- <center><img src = "table7.png" width = "600"/></center> <p style = "margin-bottom:1cm;"></p> <ul> <li>Compare the <b>interview rates</b> of the <b>control group</b> to those of <b>non-participating firms</b></li> <ul> <li><b>Non-participating firms interview the minority far less</b> than control-group firms do</li> <li>Only firms that interview minority candidates about as often as majority candidates entered the program</li> </ul> </ul> -- <p style = "margin-bottom:.75cm;"></p> <center><i><b>➜ Selection bias</b></i></center> --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015) ✔</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-.5cm;list-style: none"> <li><b>2. Correlational approach (Chetty et al., 2014)</b></li> <ul style = "list-style: none"> <li>2.1. Empirical approach</li> <li>2.2. National results</li> <li>2.3. Spatial variations</li> <li>2.4. Correlational analysis</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Structural approach (Nerlove, 1963)</b></li> <ul style = "list-style: none"> <li>3.1. Motivation</li> <li>3.2. Theoretical modeling</li> <li>3.3. Regression expression</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015) ✔</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-.5cm;list-style: none"> <li><b>2.
Correlational approach (Chetty et al., 2014)</b></li> <ul style = "list-style: none"> <li>2.1. Empirical approach</li> <li>2.2. National results</li> <li>2.3. Spatial variations</li> <li>2.4. Correlational analysis</li> </ul> </ul> ] --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.1. Empirical approach <p style = "margin-bottom:3em;"></p> <center><img src = "chetty_abstract.png" width = "850"/></center> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.1. Empirical approach * How to characterize the <b>joint distribution</b> of <b>parent and child income?</b> -- <p style = "margin-bottom:3em;"></p> **The intergenerational elasticity:** `$$\log(y^c_i) = \alpha + \beta_{IGE}\log(y^p_i)+\varepsilon_i$$` ➜ `\(\hat{\beta}\)` would be the expected percentage increase in child income for a 1% increase in parent income <p style = "margin-bottom:3em;"></p> -- **The rank-rank correlation:** `$$\text{percentile}(y^c_i) = \alpha + \beta_{RRC}\text{percentile}(y^p_i)+\varepsilon_i$$` * In this particular case, because the dependent and the independent variables have the same variance, the regression coefficient equals the correlation coefficient --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.1.
Empirical approach <p style = "margin-bottom:1.5cm;"></p> .pull-left[ `$$\begin{align}\beta & = \frac{\text{Cov}(x, y)}{\text{Var}(x)}\\[1em] & = \frac{\text{Cov}(x, y)}{\text{SD}(x)\times\text{SD}(x)} \times \frac{\text{SD}(y)}{\text{SD}(y)}\\[1em] & = \frac{\text{Cov}(x, y)}{\text{SD}(x)\times\text{SD}(y)} \times \frac{\text{SD}(y)}{\text{SD}(x)}\\[1em] & = \text{Cor}(x, y) \times \frac{\text{SD}(y)}{\text{SD}(x)}\end{align}$$` ] -- .pull-right[ <p style = "margin-bottom:2em;"></p> <ul> <li>\(\text{SD}(\log(y^c_i)) \lesseqgtr \text{SD}(\log(y^p_i))\)</li> <ul> <li>The standard deviation of log income can be viewed as a measure of inequality</li> <li>The IGE is sensitive to relative inequality across generations</li> </ul> </ul> <p style = "margin-bottom:2em;"></p> <ul> <li>\(\text{SD}(\text{percentile}(y^c_i)) = \text{SD}(\text{percentile}(y^p_i))\)</li> <ul> <li>The RRC is <i>not</i> sensitive to relative inequality across generations</li> <li>And the regression coefficient indeed equals the correlation coefficient</li> </ul> </ul> ] --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.1. Empirical approach <ul> <li>Here is what the fit of the relationship between parent and child income ranks looks like</li> <ul> <li>We can't see much, even at 1% opacity</li> <li>We can't even tell whether or not a linear fit is appropriate</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-16-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.1. Empirical approach <ul> <li>So the authors compute the average child rank for each parent percentile group</li> <ul> <li>The resulting visual representation is much clearer</li> <li>And it allows us to see whether or not a linear specification is appropriate</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-17-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 2.
Correlational approach (Chetty et al., 2014) #### 2.2. National results <center><img src = "rank_cef.png" width = "650"/></center> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.2. National results .left-column[ <center><img src = "log_cef.png" width = "650"/></center> ] .right-column[ <p style = "margin-bottom:6em;"></p> <ul> <li>The authors do the same for the IGE</li> </ul> <p style = "margin-bottom:2em;"></p> <ul> <li>For each parent income percentile:</li> <ul> <li>\(x\): Mean parent log income</li> <li>\(y\): Mean child log income</li> </ul> </ul> <p style = "margin-bottom:2em;"></p> <ul> <li>The relationship is non-linear</li> </ul> ] --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.3. Spatial variations <ul> <li>The authors then estimate the rank-rank regression separately for each commuting zone</li> </ul> `$$\text{percentile}(y^c_i) = \alpha + \beta_{RRC}\text{percentile}(y^p_i)+\varepsilon_i$$` <p style = "margin-bottom:2cm;"></p> <ul> <li>From these local estimations they derive <b>two statistics:</b></li> </ul> .pull-left[ <p style = "margin-bottom:1cm;"></p> <center><b>Relative mobility:</b> \(\hat\beta_{RRC}\)</center> <p style = "margin-bottom:.75cm;"></p> <ul> <li>The slope of the rank-rank relationship</li> <ul> <li><b>Expected rank increase</b> for a child had their parents been ranked 1 percentile higher</li> <li>The estimated increase indicates where the children would locate in <b>relative terms</b></li> </ul> </ul> ] .pull-right[ <p style = "margin-bottom:1cm;"></p> <center><b>Absolute mobility:</b> \(\widehat{\alpha} + 25\times\hat\beta_{RRC}\)</center> <p style = "margin-bottom:.75cm;"></p> <ul> <li>The fitted value at \(x = 25\)</li> <ul> <li><b>Expected percentile rank</b> for children whose parents locate at the 25<sup>th</sup> percentile</li> <li>The estimated percentile indicates where the children would locate in <b>absolute terms</b></li> </ul> </ul> ] --- ### 2.
Correlational approach (Chetty et al., 2014) #### 2.3. Spatial variations <ul> <li>Here is an illustration on the national-level relationship:</li> <ul> <li></li> <li></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-18-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.3. Spatial variations <ul> <li>Here is an illustration on the national-level relationship:</li> <ul> <li>The <b>relative mobility</b> is the slope - the rank-rank correlation</li> <li></li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-19-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.3. Spatial variations <ul> <li>Here is an illustration on the national-level relationship:</li> <ul> <li>The <b>relative mobility</b> is the slope - the rank-rank correlation</li> <li>The <b>absolute mobility</b> is the fitted value for x = 25</li> </ul> </ul> <img src="slides_files/figure-html/unnamed-chunk-20-1.png" width="65%" style="display: block; margin: auto;" /> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.3. Spatial variations .left-column[ <center><img src = "local_cef.png" width = "600"/></center> ] .right-column[ <p style = "margin-bottom:6em;"></p> <ul> <li>Authors compute intergenerational persistence in each commuting zone separately</li> </ul> <p style = "margin-bottom:2em;"></p> <ul> <li>And plot the results on a map</li> </ul> ] --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.3. Spatial variations <center><img src = "map.png" width = "700"/></center> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.4. 
Correlational analysis <ul> <li>Then they investigate whether local <b>characteristics of commuting zones</b> are related to mobility</li> <li>But directly regressing upward mobility on these characteristics would give:</li> <ul> <li><b>Lower coefficients</b> for variables measured on <b>larger scales</b> (test scores)</li> <li><b>Higher coefficients</b> for variables measured on <b>smaller scales</b> (fraction of single mothers)</li> </ul> </ul> -- <p style = "margin-bottom:1em;"></p> <ul> <li>So the authors <b>standardize</b> their <b>variables</b> to make their estimates comparable</li> </ul> <p style = "margin-bottom:2.5em;"></p> -- `$$\beta = \frac{\text{Cov}(\frac{x}{\text{SD}(x)}, \frac{y}{\text{SD}(y)})}{\text{Var}(\frac{x}{\text{SD}(x)})}$$` -- <p style = "margin-bottom:1cm;"></p> .pull-left[ <ul> <li>To simplify this equation, you need to know that:</li> <ul> <li>\(\text{Var}(aX) = a^2\text{Var}(X)\)</li> <li>\(\text{Cov}(aX, bY) = ab\text{Cov}(X, Y)\)</li> </ul> </ul> ] .pull-right[ <center><a href = "https://louissirugue.github.io/metrics_on_R/cheatsheets/moments.pdf"><img style = "margin-bottom:-.5cm;" src = "moments.png" width = "150"/></a></center> ] --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.4. Correlational analysis <p style = "margin-bottom:2em;"></p> `$$\begin{align}\beta & = \frac{\text{Cov}(\frac{x}{\text{SD}(x)}, \frac{y}{\text{SD}(y)})}{\text{Var}(\frac{x}{\text{SD}(x)})}\\[1em] & = \frac{\frac{1}{\text{SD}(x)\text{SD}(y)}\text{Cov}(x, y)}{\frac{1}{\text{SD}(x)^2}\text{Var}(x)}\\[1em] & = \frac{\text{Cov}(x, y)}{\text{SD}(x)\text{SD}(y)}\times\frac{\text{SD}(x)^2}{\text{Var}(x)}\\[1em] & = \text{Corr}(x, y)\end{align}$$` <p style = "margin-bottom:2em;"></p> -- <center><h4><i>➜ Standardizing the variables allows one to obtain a correlation coefficient from a regression</i></h4></center> --- ### 2. Correlational approach (Chetty et al., 2014) #### 2.4.
Correlational analysis .left-column[ <center><img src = "correlates.png" width = "600"/></center> ] -- .right-column[ <p style = "margin-bottom:6em;"></p> Note that these coefficients combine: <ul> <li>A neighborhood effect</li> <li>A selection effect</li> </ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015) ✔</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-.5cm;list-style: none"> <li><b>2. Correlational approach (Chetty et al., 2014) ✔</b></li> <ul style = "list-style: none"> <li>2.1. Empirical approach</li> <li>2.2. National results</li> <li>2.3. Spatial variations</li> <li>2.4. Correlational analysis</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Structural approach (Nerlove, 1963)</b></li> <ul style = "list-style: none"> <li>3.1. Motivation</li> <li>3.2. Theoretical modeling</li> <li>3.3. Regression expression</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015) ✔</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-.5cm;list-style: none"> <li><b>2. Correlational approach (Chetty et al., 2014) ✔</b></li> <ul style = "list-style: none"> <li>2.1. Empirical approach</li> <li>2.2. National results</li> <li>2.3. Spatial variations</li> <li>2.4. Correlational analysis</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. 
Structural approach (Nerlove, 1963)</b></li> <ul style = "list-style: none"> <li>3.1. Motivation</li> <li>3.2. Theoretical modeling</li> <li>3.3. Regression expression</li> </ul> </ul> ] --- ### 3. Structural approach (Nerlove, 1963) #### 3.1. Motivation <ul> <li>The <b>structural approach</b> refers to the following methodology:</li> <ol> <li><b>Theoretical modeling</b> of the phenomenon of interest</li> <li><b>Expressing the model</b> parameters <b>as</b> the coefficients of <b>a regression</b></li> <li>Running the corresponding regressions on data to <b>estimate the parameters</b> of the model</li> </ol> </ul> <p style = "margin-bottom:1.75cm;"></p> -- <ul> <li>Structural papers are <b>increasingly complex</b> on the theoretical side</li> <ul> <li>The current standards in this literature are beyond the scope of this course</li> <li>So we are going to explore a rather <b>old study</b> for this section</li> </ul> </ul> <p style = "margin-bottom:1.75cm;"></p> -- <ul> <li>Nerlove (1963) studies the <b>returns to scale in the electricity supply industry</b></li> <ul> <li>What is the output elasticity of each input?</li> <li>Are the returns to scale <b>increasing or decreasing?</b></li> </ul> </ul> --- ### 3. Structural approach (Nerlove, 1963) #### 3.1.
Motivation * Nerlove (1963) assumes the following <b>production function:</b> `$$Y = A L^\lambda K^\kappa F^\varphi u$$` <p style = "margin-bottom:1cm;"></p> -- * And the following <b>cost function:</b> `$$C = p_LL + p_KK + p_FF$$` <p style = "margin-bottom:1.5cm;"></p> -- <center>With:</center> <p style = "margin-bottom:.5cm;"></p> .pull-left[ .pull-left[ <center><b>Inputs</b></center> `$$\begin{align} L:& \text{ Labor input}\\ K:& \text{ Capital input}\\ F:& \text{ Fuel input} \end{align}$$` ] .pull-right[ <center><b>Output elasticities</b></center> `$$\begin{align} \lambda:& \text{ OE of labor}\\ \kappa:& \text{ OE of capital}\\ \varphi:& \text{ OE of fuel} \end{align}$$` ] ] .pull-right[ .pull-left[ <center><b>Prices</b></center> `$$\begin{align} p_L:& \text{ Wage rate}\\ p_K:& \text{ Price of capital}\\ p_F:& \text{ Price of fuel} \end{align}$$` ] .pull-right[ <center><b>Other parameters</b></center> `$$\begin{align} A:& \text{ Total factor}\\ & \text{ productivity}\\ u:& \text{ Efficiency residual} \end{align}$$` ] ] --- ### 3. Structural approach (Nerlove, 1963) #### 3.1.
Motivation <ul> <li><b>Theoretically</b> we could <b>estimate</b> the output elasticities <b>directly from the production function</b></li> <ul> <li>The trick is to <b>take the log of everything</b> such that the exponents become the parameters of the equation</li> <li>We call this transformation a <b>log-linearization</b></li> </ul> </ul> <p style = "margin-bottom:1.25cm;"></p> `$$\begin{align} \log(Y) & = \log\big(A L^\lambda K^\kappa F^\varphi u\big)\\ &= \log(A) + \log\big(L^\lambda\big) + \log\big(K^\kappa\big) + \log\big(F^\varphi\big) + \log(u)\\ &= \underbrace{\log(A)}_{\text{Constant}} + \lambda\log(L) + \kappa\log(K) + \varphi\log(F) + \underbrace{\log(u)}_{\text{Residuals}} \end{align}$$` <p style = "margin-bottom:1.25cm;"></p> -- <ul> <li>Regressing log output on the log inputs directly gives the <b>elasticities</b></li> <ul> <li>But Nerlove (1963) does not have access to data on firms' inputs</li> <li>Still, he has <b>data on the price</b> of each input</li> <li>His solution is to derive an expression that allows him to <b>estimate the elasticities from input prices</b></li> </ul> </ul> --- ### 3. Structural approach (Nerlove, 1963) #### 3.2. Theoretical modeling <ul> <li>To <b>simplify</b> algebra, we're going to consider <b>capital and labor only</b></li> <ul> <li>But the principle remains the same</li> <li>We need to <b>solve the model</b> by minimizing cost subject to the production function</li> </ul> </ul> `$$\begin{cases} \text{min } & C = p_LL + p_KK\\ \text{s.t. 
} & Y = A L^\lambda K^\kappa u \end{cases} \,\, \Longleftrightarrow \,\, \text{min }\,\mathcal{L} = p_LL + p_KK + \mu(Y - A L^\lambda K^\kappa u)$$` -- .pull-left[ * Set each partial derivative to 0 `$$\frac{\partial \mathcal{L}}{\partial L} = 0 \Leftrightarrow p_L = \mu A \lambda L^{\lambda-1}K^\kappa u$$` `$$\frac{\partial \mathcal{L}}{\partial K} = 0 \Leftrightarrow p_K = \mu A \kappa L^\lambda K^{\kappa-1} u$$` `$$\frac{\partial \mathcal{L}}{\partial \mu} = 0 \Leftrightarrow Y = A L^\lambda K^\kappa u$$` ] -- .pull-right[ * Take the ratio of the first two conditions to express `\(K\)` and `\(L\)` as functions of each other `$$\begin{align} \frac{p_L}{p_K} & = \frac{\mu A \lambda L^{\lambda-1}K^\kappa u}{\mu A \kappa L^\lambda K^{\kappa-1} u}\\ & = \frac{\lambda L^{\lambda-1}K^\kappa}{\kappa L^\lambda K^{\kappa-1}} = \frac{\lambda K}{\kappa L} \end{align}$$` <p style = "margin-bottom:-.5cm;"></p> .pull-left[ `$$L = K \frac{p_K}{p_L}\frac{\lambda}{\kappa}$$` ] .pull-right[ `$$K = L \frac{p_L}{p_K}\frac{\kappa}{\lambda}$$` ] ] --- ### 3. Structural approach (Nerlove, 1963) #### 3.2.
Theoretical modeling .pull-left[ * Express `\(Y\)` as a function of `\(L\)` only and solve for `\(L\)` `$$Y = A L^\lambda \bigg(L \frac{p_L}{p_K}\frac{\kappa}{\lambda}\bigg)^\kappa u$$` `$$L^{\lambda+\kappa} =\frac{Y}{Au}\frac{1}{\bigg(\frac{p_L}{p_K}\frac{\kappa}{\lambda}\bigg)^\kappa}$$` $$ L= \left[\frac{Y}{Au}\frac{1}{\bigg(\frac{p_L}{p_K}\frac{\kappa}{\lambda}\bigg)^\kappa}\right]^{\frac{1}{\lambda + \kappa}} = \bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\bigg(\frac{p_K}{p_L}\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa}$$ ] -- .pull-right[ * Same with `\(K\)` `$$Y = A \bigg(K \frac{p_K}{p_L}\frac{\lambda}{\kappa}\bigg)^\lambda K^\kappa u$$` `$$K^{\lambda+\kappa} =\frac{Y}{Au}\frac{1}{\bigg(\frac{p_K}{p_L}\frac{\lambda}{\kappa}\bigg)^\lambda}$$` `$$K = \left[\frac{Y}{Au}\frac{1}{\bigg(\frac{p_K}{p_L}\frac{\lambda}{\kappa}\bigg)^\lambda}\right]^{\frac{1}{\lambda + \kappa}} = \bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\bigg(\frac{p_L}{p_K}\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}$$` ] --- ### 3. Structural approach (Nerlove, 1963) #### 3.2. 
Theoretical modeling * Inject `\(K\)` and `\(L\)` back in the cost function and factorize `$$C =p_L\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\bigg(\frac{p_K}{p_L}\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa} + p_K\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\bigg(\frac{p_L}{p_K}\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}$$` -- <p style = "margin-bottom:.8cm;"></p> `$$C =\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\left[p_L\bigg(\frac{p_K}{p_L}\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa} + p_K\bigg(\frac{p_L}{p_K}\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}\right]$$` -- <p style = "margin-bottom:.8cm;"></p> `$$C =\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\left[p_L^{1 -\frac{\kappa}{\lambda+\kappa}}p_K^{\frac{\kappa}{\lambda+\kappa} }\bigg(\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa} + p_K^{1 - \frac{\lambda}{\lambda+\kappa}}p_L^{\frac{\lambda}{\lambda+\kappa} }\bigg(\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}\right]$$` -- <p style = "margin-bottom:.8cm;"></p> `$$C =\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\left[p_L^{\frac{\lambda}{\lambda+\kappa}}p_K^{\frac{\kappa}{\lambda+\kappa} }\bigg(\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa} + p_K^{\frac{\kappa}{\lambda+\kappa}}p_L^{\frac{\lambda}{\lambda+\kappa} }\bigg(\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}\right]$$` --- ### 3. Structural approach (Nerlove, 1963) #### 3.2. 
Theoretical modeling `$$C =\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}\left[p_L^{\frac{\lambda}{\lambda+\kappa}}p_K^{\frac{\kappa}{\lambda+\kappa} }\bigg(\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa} + p_K^{\frac{\kappa}{\lambda+\kappa}}p_L^{\frac{\lambda}{\lambda+\kappa} }\bigg(\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}\right]$$` -- <p style = "margin-bottom:1cm;"></p> `$$C =\bigg(\frac{Y}{Au}\bigg)^\frac{1}{\lambda + \kappa}p_L^{\frac{\lambda}{\lambda+\kappa}}p_K^{\frac{\kappa}{\lambda+\kappa} }\left[\bigg(\frac{\lambda}{\kappa}\bigg)^\frac{\kappa}{\lambda+\kappa} +\bigg(\frac{\kappa}{\lambda}\bigg)^\frac{\lambda}{\lambda+\kappa}\right]$$` -- <p style = "margin-bottom:1.5cm;"></p> * Isolate what's constant, each variable, and the residual term: <p style = "margin-bottom:1cm;"></p> `$$C =\underbrace{\frac{\big(\frac{\lambda}{\kappa}\big)^\frac{\kappa}{\lambda+\kappa} +\big(\frac{\kappa}{\lambda}\big)^\frac{\lambda}{\lambda+\kappa}}{A^\frac{1}{\lambda + \kappa}}}_{\text{Constant}} \times\underbrace{Y^\frac{1}{\lambda + \kappa}}_{\text{Output}} \times\underbrace{p_L^{\frac{\lambda}{\lambda+\kappa}}}_{\text{Wage}}\times\underbrace{p_K^{\frac{\kappa}{\lambda+\kappa} }}_{\substack{\text{Price of}\\\text{Capital}}}\times\underbrace{u^{-\frac{1}{\lambda+\kappa}}}_{\substack{\text{Residual}\\\text{term}}}$$` --- ### 3. Structural approach (Nerlove, 1963) #### 3.3.
Regression expression <ul> <li>At this stage we can <b>log-linearize the equation:</b></li> </ul> `$$C =\underbrace{\frac{\big(\frac{\lambda}{\kappa}\big)^\frac{\kappa}{\lambda+\kappa} +\big(\frac{\kappa}{\lambda}\big)^\frac{\lambda}{\lambda+\kappa}}{A^\frac{1}{\lambda + \kappa}}}_{\text{Constant}} \times\underbrace{Y^\frac{1}{\lambda + \kappa}}_{\text{Output}} \times\underbrace{p_L^{\frac{\lambda}{\lambda+\kappa}}}_{\text{Wage}}\times\underbrace{p_K^{\frac{\kappa}{\lambda+\kappa} }}_{\substack{\text{Price of}\\\text{Capital}}}\times\underbrace{u^{-\frac{1}{\lambda+\kappa}}}_{\substack{\text{Residual}\\\text{term}}}$$` -- <p style = "margin-bottom:.8cm;"></p> `$$\log (C) =\log \Bigg(\frac{\big(\frac{\lambda}{\kappa}\big)^\frac{\kappa}{\lambda+\kappa} +\big(\frac{\kappa}{\lambda}\big)^\frac{\lambda}{\lambda+\kappa}}{A^\frac{1}{\lambda + \kappa}} \Bigg) + \log \bigg(Y^\frac{1}{\lambda + \kappa}\bigg) + \log \bigg(p_L^{\frac{\lambda}{\lambda+\kappa}}\bigg) + \log \bigg(p_K^{\frac{\kappa}{\lambda+\kappa} }\bigg) + \log \bigg(u^{-\frac{1}{\lambda+\kappa}}\bigg)$$` -- <p style = "margin-bottom:.8cm;"></p> `$$\log (C) = \underbrace{\log \Bigg(\frac{\big(\frac{\lambda}{\kappa}\big)^\frac{\kappa}{\lambda+\kappa} +\big(\frac{\kappa}{\lambda}\big)^\frac{\lambda}{\lambda+\kappa}}{A^\frac{1}{\lambda + \kappa}} \Bigg)}_{\alpha} + \underbrace{\frac{1}{\lambda + \kappa}}_{\beta}\log(Y) + \underbrace{\frac{\lambda}{\lambda+\kappa}}_{\gamma}\log(p_L) + \underbrace{\frac{\kappa}{\lambda+\kappa}}_{\delta}\log(p_K) + \underbrace{\log \bigg(u^{-\frac{1}{\lambda+\kappa}}\bigg)}_{\varepsilon}$$` --- ### 3. Structural approach (Nerlove, 1963) #### 3.3.
Regression expression <ul> <li>Finally, we end up with this <b>regression model:</b></li> <ul> <li>Where <b>coefficients</b> are <b>composite</b> objects of the <b>parameters</b> of the structural model</li> </ul> </ul> `$$\log(C)= \alpha + \beta\log(Y) + \gamma\log(p_L) + \delta\log(p_K) + \varepsilon$$` -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>But note that to <b>test for CRS</b>, we don't even need to derive \(\kappa\) and \(\lambda\) explicitly</li> </ul> `$$\gamma = \frac{\lambda}{\lambda + \kappa} \,\,\,\, ; \,\,\,\, \delta = \frac{\kappa}{\lambda + \kappa}$$` -- <p style = "margin-bottom:1.5cm;"></p> <ul> <li>Indeed, the <b>null hypothesis</b> for constant returns to scale writes</li> </ul> $$H_0: \lambda + \kappa = 1 \,\,\,\, \Leftrightarrow \,\,\,\, \frac{\lambda + \kappa}{\lambda + \kappa} = \frac{1}{1} \,\,\,\, \Leftrightarrow \,\,\,\, \frac{\lambda}{\lambda + \kappa} + \frac{\kappa}{\lambda + \kappa} = 1 \,\,\,\, \Leftrightarrow \,\,\,\, \gamma + \delta = 1 $$ --- class: inverse, hide-logo ### Practice #### 1) Import the dataset from Nerlove (1963) ```r library(haven) nerlove <- read_dta("nerlove63.dta") str(nerlove, give.attr = F) ``` ``` ## tibble [145 x 5] (S3: tbl_df/tbl/data.frame) ## $ totcost: num [1:145] 0.082 0.661 0.99 0.315 0.197 ... ## $ output : num [1:145] 2 3 4 4 5 9 11 13 13 22 ... ## $ plabor : num [1:145] 2.09 2.05 2.05 1.83 2.12 ... ## $ pfuel : num [1:145] 17.9 35.1 35.1 32.2 28.6 ... ## $ pkap : num [1:145] 183 174 171 166 233 195 206 150 155 188 ... ``` -- <p style = "margin-bottom:1cm;"></p> #### 2) Estimate the parameters of this regression: `$$\log(C)= \alpha + \beta\log(Y) + \gamma\log(p_L) + \delta\log(p_K) + \varepsilon$$` <p style = "margin-bottom:1.25cm;"></p> -- #### 3) Use `linearHypothesis()` from the `car` package to test for CRS --
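<p style = "margin-bottom:1cm;"></p> As an optional sanity check on the log-linearization from section 3.1 (a simulation sketch, not part of Nerlove's paper — all names and parameter values below are invented), one can generate output from the Cobb-Douglas technology and verify that OLS on the logged variables recovers the elasticities:

```r
# Simulate Y = A * L^lambda * K^kappa * F^phi * u, then log-linearize
set.seed(1)
n <- 1e4
lambda <- .6; kappa <- .3; phi <- .1; A <- 2   # invented "true" parameters
lab  <- exp(rnorm(n))                          # labor input
cap  <- exp(rnorm(n))                          # capital input
fuel <- exp(rnorm(n))                          # fuel input
u    <- exp(rnorm(n, sd = .05))                # efficiency residual
Y    <- A * lab^lambda * cap^kappa * fuel^phi * u
coef(lm(log(Y) ~ log(lab) + log(cap) + log(fuel)))
```

The slope estimates should be close to `\(\lambda\)`, `\(\kappa\)` and `\(\varphi\)`, and the intercept close to `\(\log(A)\)`.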
--- class: inverse, hide-logo ### Solution #### Estimate the parameters of this regression: `$$\log(C)= \alpha + \beta\log(Y) + \gamma\log(p_L) + \delta\log(p_K) + \varepsilon$$` -- ```r summary(lm(log(totcost) ~ log(output) + log(plabor) + log(pkap), nerlove))$coefficients ``` ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -3.82905582 1.87712121 -2.0398554 4.323183e-02 ## log(output) 0.70706725 0.01819229 38.8663200 3.187712e-77 ## log(plabor) 0.89957689 0.28572006 3.1484555 2.003726e-03 ## log(pkap) 0.06079561 0.35250457 0.1724676 8.633172e-01 ``` -- <p style = "margin-bottom:1.25em;"></p> #### Use `linearHypothesis()` from the `car` package to test for CRS -- ```r library(car) linearHypothesis(lm(log(totcost) ~ log(output) + log(plabor) + log(pkap), nerlove), "log(plabor) + log(pkap) = 1") ``` --- class: inverse, hide-logo ### Solution ```r linearHypothesis(lm(log(totcost) ~ log(output) + log(plabor) + log(pkap), nerlove), "log(plabor) + log(pkap) = 1") ``` ``` ## Linear hypothesis test ## ## Hypothesis: ## log(plabor) + log(pkap) = 1 ## ## Model 1: restricted model ## Model 2: log(totcost) ~ log(output) + log(plabor) + log(pkap) ## ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 142 24.333 ## 2 141 24.332 1 0.0011159 0.0065 0.936 ``` -- <p style = "margin-bottom:1cm;"></p> <ul> <li>The p-value is equal to 93.6%</li> <ul> <li>We cannot reject the hypothesis of constant returns to scale</li> <li>\(\hat\gamma + \hat\delta = .96\) is not sufficiently far from \(1\) to reject that \(\gamma + \delta = 1\)</li> </ul> </ul> --- <h3>Overview</h3> <p style = "margin-bottom:3cm;"></p> .pull-left[ <ul style = "margin-left:-.5cm;list-style: none"> <li><b>1. Causal approach (Behaghel et al., 2015) ✔</b></li> <ul style = "list-style: none"> <li>1.1. Structure</li> <li>1.2. Data</li> <li>1.3. Analysis</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-.5cm;list-style: none"> <li><b>2. 
Correlational approach (Chetty et al., 2014) ✔</b></li> <ul style = "list-style: none"> <li>2.1. Empirical approach</li> <li>2.2. National results</li> <li>2.3. Spatial variations</li> <li>2.4. Correlational analysis</li> </ul> </ul> ] .pull-right[ <ul style = "margin-left:-1cm;list-style: none"> <li><b>3. Structural approach (Nerlove, 1963) ✔</b></li> <ul style = "list-style: none"> <li>3.1. Motivation</li> <li>3.2. Theoretical modeling</li> <li>3.3. Regression expression</li> </ul> </ul> <p style = "margin-bottom:1cm;"></p> <ul style = "margin-left:-1cm;list-style: none"><li><b>4. Wrap up!</b></li></ul> ] --- ### 4. Wrap up! #### Causal approach (Behaghel et al., 2015) * Applicants' resumes randomly anonymized or not before being sent to employers -- `$$Y_i = \alpha + \beta D_i +\gamma An_i + \delta D_i\times An_i + \varepsilon_i$$` * `\(\hat{\delta}\)` captures how the difference in interview rates between the minority and the majority differs between the treated and the control employers -- ```r summary(lm(interview ~ minority + treatment + minority*treatment, data_rct, weights = weight))$coefficients ``` ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 0.11638530 0.01630149 7.1395491 1.575140e-12 ## minority -0.02365790 0.02368346 -0.9989208 3.180243e-01 ## treatment 0.06101349 0.02419977 2.5212424 1.181630e-02 ## minority:treatment -0.10670915 0.03479712 -3.0666092 2.210982e-03 ``` <p style = "margin-bottom:1cm;"></p> <center><h4> ➜ Self-selection issue: discriminatory employers did not enter the program </h4></center> --- ### 4. Wrap up!
#### Correlational approach (Chetty et al., 2014) `$$\text{percentile}(y^c_i) = \alpha + \beta_{RRC}\text{percentile}(y^p_i)+\varepsilon_i$$` -- .left-column[ <center><img src = "local_cef.png" width = "520"/></center> ] -- .right-column[ **Relative mobility:** `\(\widehat{\beta_{RRC}}\)` **Absolute mobility:** `\(\widehat{\alpha} + 25\times\widehat{\beta_{RRC}}\)` <p style = "margin-bottom:1cm;"></p> <ul> <li>Strong persistence in the United States</li> <li>Large variations across commuting zones</li> <li>Intergenerational mobility correlated with characteristics of childhood environment</li> </ul> ] --- ### 4. Wrap up! #### Structural approach (Nerlove, 1963) * **Theoretical modeling** `$$\begin{cases} \text{min } & C = p_LL + p_KK\\ \text{s.t. } & Y = A L^\lambda K^\kappa u \end{cases} \,\, \Longleftrightarrow \,\, \text{min }\,\mathcal{L} = p_LL + p_KK + \mu(Y - A L^\lambda K^\kappa u)$$` -- <p style = "margin-bottom:1cm;"></p> * **Regression expression** `$$\log (C) = \underbrace{\log \Bigg(\frac{\big(\frac{\lambda}{\kappa}\big)^\frac{\kappa}{\lambda+\kappa} +\big(\frac{\kappa}{\lambda}\big)^\frac{\lambda}{\lambda+\kappa}}{A^\frac{1}{\lambda + \kappa}} \Bigg)}_{\alpha} + \underbrace{\frac{1}{\lambda + \kappa}}_{\beta}\log(Y) + \underbrace{\frac{\lambda}{\lambda+\kappa}}_{\gamma}\log(p_L) + \underbrace{\frac{\kappa}{\lambda+\kappa}}_{\delta}\log(p_K) + \underbrace{\log \bigg(u^{-\frac{1}{\lambda+\kappa}}\bigg)}_{\varepsilon}$$` -- <p style = "margin-bottom:1cm;"></p> * **Estimation** `$$\log(C)= \alpha + \beta\log(Y) + \gamma\log(p_L) + \delta\log(p_K) + \varepsilon \,\,\,\, \Rightarrow \,\,\,\, H_0: \gamma + \delta = 1$$`
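-- <p style = "margin-bottom:1cm;"></p> * **Back to the parameters** (a back-of-the-envelope sketch, not in the original slides): in the two-input simplification, `\(\beta = \frac{1}{\lambda+\kappa}\)`, `\(\gamma = \frac{\lambda}{\lambda+\kappa}\)` and `\(\delta = \frac{\kappa}{\lambda+\kappa}\)` imply `\(\lambda = \gamma/\beta\)` and `\(\kappa = \delta/\beta\)`, so the point estimates from the practice section can be mapped back to the structural parameters:

```r
# Rounded point estimates from the practice regression
beta_hat  <- 0.7071   # coefficient on log(output): 1 / (lambda + kappa)
gamma_hat <- 0.8996   # coefficient on log(plabor): lambda / (lambda + kappa)
delta_hat <- 0.0608   # coefficient on log(pkap):   kappa / (lambda + kappa)

c(returns_to_scale = 1 / beta_hat,         # lambda + kappa
  lambda_hat       = gamma_hat / beta_hat, # implied labor elasticity
  kappa_hat        = delta_hat / beta_hat) # implied capital elasticity
```

Note that the two implied values of `\(\lambda + \kappa\)` (`\(1/\hat\beta \approx 1.41\)` versus `\(\hat\lambda + \hat\kappa \approx 1.36\)`) differ slightly, because `\(\hat\gamma + \hat\delta = .96 \ne 1\)` in the unrestricted regression.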