Chapter 3 Sample Selection Correction
Lesson: Similar to tobit, when we do not account for the incidental truncation of the data, that is, why certain parts of the population are not sampled, we commit a type of omitted variable bias. The omitted variable is the inverse Mills ratio \(\hat{\lambda}(\mathbf{z}\hat{\gamma})\). We can use the Heckit method for sample selection correction.
We want to see if there is sample selection bias due to unobservable wage offers for non-working women.
We need to estimate a probit for labor force participation to test and correct for sample selection bias due to the unobserved wage offers of nonworking women. The selection variables are spousal income, education, experience, experience squared, age, number of kids less than 6, and number of kids aged 6 to 18. The wage equation is
\[ ln(wages_{i})=\beta_{0}+ \mathbf{x'\beta} + u_{i} \]
Where \(\mathbf{x}\) is a vector that includes education, experience, and experience squared.
We use Stata's heckman command with the twostep option to implement the Heckman (Heckit) method for sample selection correction.
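The first step is a probit for labor force participation estimated on the full sample:
\[ P(inlf_{i}=1|\mathbf{z})=\Phi(\gamma_{0}+\mathbf{z'\gamma}) \]
Where \(\mathbf{z}\) is a vector that includes spousal income, education, experience, experience squared, age, number of kids less than 6, and number of kids aged 6 to 18. The second step estimates the wage equation on working women only, adding the inverse Mills ratio \(\hat{\lambda}_{i}=\phi(\hat{\gamma}_{0}+\mathbf{z'\hat{\gamma}})/\Phi(\hat{\gamma}_{0}+\mathbf{z'\hat{\gamma}})\) as a regressor:
\[ ln(wages_{i})=\beta_{0}+ \mathbf{x'\beta} + \rho\sigma\hat{\lambda}_{i} + e_{i} \]
The coefficient on \(\hat{\lambda}_{i}\) is \(\rho\sigma\), so testing whether it is zero is a test of \(H_0: \rho=0\), that is, no sample selection bias.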
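The two steps can also be carried out by hand, which makes clear what the heckman command does internally. The following is a minimal sketch, assuming mroz.dta is already in memory; the variable names zg_hat and imr_hat are hypothetical names chosen for illustration.

```stata
* Assumes mroz.dta is already loaded (for example with -use-).
* Step 1: probit for labor force participation on all exogenous variables
probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6

* Joint test of the variables that are excluded from the wage equation
test nwifeinc age kidslt6 kidsge6

* Linear index z'gamma-hat and inverse Mills ratio phi(.)/Phi(.)
* (zg_hat and imr_hat are hypothetical variable names)
predict zg_hat, xb
generate imr_hat = normalden(zg_hat) / normal(zg_hat)

* Step 2: OLS on working women with the inverse Mills ratio added
regress lwage educ exper expersq imr_hat if inlf == 1
```

The point estimates from the second step should line up with the two-step output of heckman, but the standard errors reported by regress do not account for the fact that imr_hat is itself estimated. The usual t test on imr_hat is still a valid test of no selection under the null; for corrected standard errors, heckman with the twostep option is the safer choice.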
3.1 OLS
reg lwage educ exper expersq
Source | SS df MS Number of obs = 428
-------------+---------------------------------- F(3, 424) = 26.29
Model | 35.0222967 3 11.6740989 Prob > F = 0.0000
Residual | 188.305144 424 .444115906 R-squared = 0.1568
-------------+---------------------------------- Adj R-squared = 0.1509
Total | 223.327441 427 .523015084 Root MSE = .66642
------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .1074896 .0141465 7.60 0.000 .0796837 .1352956
exper | .0415665 .0131752 3.15 0.002 .0156697 .0674633
expersq | -.0008112 .0003932 -2.06 0.040 -.0015841 -.0000382
_cons | -.5220406 .1986321 -2.63 0.009 -.9124667 -.1316144
------------------------------------------------------------------------------
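Notice: the wage equation excludes spousal income, age, number of kids less than 6, and number of kids aged 6 to 18. This exclusion restriction is an assumption: these variables are taken to affect labor force participation but not the wage offer.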
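The exact percentage return to one more year of education implied by the OLS coefficient is \((e^{0.1075}-1)*100 \approx 11.35\) percent (computed value: 11.347933).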
3.2 Heckman Method - Heckit
The wage equation uses a subset of the exogenous variables that appear in the selection equation. We ask two questions:
1. What are the factors correlated with being in the labor force?
2. What is the impact of education and experience on wages?
use "/Users/Sam/Desktop/Econ 645/Data/Wooldridge/mroz.dta", clear
heckman lwage educ exper expersq, select(inlf=nwifeinc educ exper expersq age kidslt6 kidsge6) twostep
Heckman selection model -- two-step estimates Number of obs = 753
(regression model with sample selection) Censored obs = 325
Uncensored obs = 428
Wald chi2(3) = 51.53
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage |
educ | .1090655 .015523 7.03 0.000 .0786411 .13949
exper | .0438873 .0162611 2.70 0.007 .0120163 .0757584
expersq | -.0008591 .0004389 -1.96 0.050 -.0017194 1.15e-06
_cons | -.5781032 .3050062 -1.90 0.058 -1.175904 .019698
-------------+----------------------------------------------------------------
inlf |
nwifeinc | -.0120237 .0048398 -2.48 0.013 -.0215096 -.0025378
educ | .1309047 .0252542 5.18 0.000 .0814074 .180402
exper | .1233476 .0187164 6.59 0.000 .0866641 .1600311
expersq | -.0018871 .0006 -3.15 0.002 -.003063 -.0007111
age | -.0528527 .0084772 -6.23 0.000 -.0694678 -.0362376
kidslt6 | -.8683285 .1185223 -7.33 0.000 -1.100628 -.636029
kidsge6 | .036005 .0434768 0.83 0.408 -.049208 .1212179
_cons | .2700768 .508593 0.53 0.595 -.7267473 1.266901
-------------+----------------------------------------------------------------
mills |
lambda | .0322619 .1336246 0.24 0.809 -.2296376 .2941613
-------------+----------------------------------------------------------------
rho | 0.04861
sigma | .66362875
------------------------------------------------------------------------------
est clear
eststo OLS: reg lwage educ exper expersq
eststo Heckman: heckman lwage educ exper expersq, select(inlf=nwifeinc educ exper expersq age kidslt6 kidsge6) twostep
esttab, mtitle
(1) (2)
OLS Heckman
--------------------------------------------
main
educ 0.107*** 0.109***
(7.60) (7.03)
exper 0.0416** 0.0439**
(3.15) (2.70)
expersq -0.000811* -0.000859
(-2.06) (-1.96)
_cons -0.522** -0.578
(-2.63) (-1.90)
--------------------------------------------
inlf
nwifeinc -0.0120*
(-2.48)
educ 0.131***
(5.18)
exper 0.123***
(6.59)
expersq -0.00189**
(-3.15)
age -0.0529***
(-6.23)
kidslt6 -0.868***
(-7.33)
kidsge6 0.0360
(0.83)
_cons 0.270
(0.53)
--------------------------------------------
mills
lambda 0.0323
(0.24)
--------------------------------------------
N 428 753
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
There is no real evidence of sample selection bias in the wage offer equation. The coefficient on \(\hat{\lambda}\) is not statistically significant (p = 0.809), so we fail to reject \(H_0: \rho=0\). The OLS and Heckman estimates are also very close: the coefficient on education is 0.107 under OLS and 0.109 under Heckit.
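As a quick check on the reported two-step output, the coefficient on lambda should equal the product of the reported rho and sigma, and the test statistic is the coefficient divided by its standard error. Using the rounded values from the Heckman table:

\[ \hat{\rho}\hat{\sigma} = 0.04861 \times 0.66362875 \approx 0.0323, \qquad z = \frac{0.0322619}{0.1336246} \approx 0.24 \]

Since \(|z|\) is far below 1.96, the null of no selection is not rejected at the 5 percent level.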
Interpretation: the coefficients are interpreted as in OLS. Because the model is log-linear, the percentage effect of a one-unit change in a regressor is \((e^\beta-1)*100\), which is close to \(100*\beta\) when \(\beta\) is small. For education, the Heckit estimate implies \((e^{0.109}-1)*100 \approx 11.5\) percent per additional year. Omitting \(\hat{\lambda}\) would be a potential source of omitted variable bias, but in our example we fail to reject the null hypothesis, so selection bias does not appear to be problematic.
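The exact percentage return can be computed directly from the stored coefficients after each model. This is a small sketch, assuming mroz.dta is already in memory; after heckman the wage-equation coefficient is addressed through its equation name, which is lwage here.

```stata
* Assumes mroz.dta is already loaded.
* Exact percentage return to education from OLS
regress lwage educ exper expersq
display (exp(_b[educ]) - 1) * 100

* Same calculation after the two-step Heckman model;
* the coefficient is referenced by equation name (lwage)
heckman lwage educ exper expersq, select(inlf = nwifeinc educ exper expersq age kidslt6 kidsge6) twostep
display (exp(_b[lwage:educ]) - 1) * 100
```

Comparing the two displayed values gives a direct sense of how little the selection correction changes the estimated return to education in this sample.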