Chapter 3 Random Effects

We will implement our Random Effects estimator in Stata. The syntax is similar to fixed effects: we use xtreg, but with the re option instead of fe.

Remember that we need to test our main random effects assumption to see if a random effects estimate is appropriate:

\[ Cov(x_{i,j},a_i)=0 \]

3.1 Returns to Marriage for Men

Lesson: We can test to see if random effects is an appropriate assumption.

We will use the wagepan data again to estimate the returns to marriage for men with three methods: Pooled OLS, Fixed Effects (Within), and Random Effects. Under Fixed Effects we cannot estimate coefficients on time-invariant variables such as education and the Black and Hispanic indicators, while Pooled OLS and Random Effects can.

Set Panel

cd "/Users/Sam/Desktop/Econ 645/Data/Wooldridge"
use "wagepan.dta", clear
xtset nr year

/Users/Sam/Desktop/Econ 645/Data/Wooldridge

       panel variable:  nr (strongly balanced)
        time variable:  year, 1980 to 1987
                delta:  1 unit

Pooled OLS

reg lwage educ i.black i.hisp exper expersq married union i.d8*
eststo m1: quietly reg lwage educ i.black i.hisp exper expersq married union i.d8*

      Source |       SS           df       MS      Number of obs   =     4,360
-------------+----------------------------------   F(14, 4345)     =     72.46
       Model |  234.048277        14  16.7177341   Prob > F        =    0.0000
    Residual |  1002.48136     4,345  .230720682   R-squared       =    0.1893
-------------+----------------------------------   Adj R-squared   =    0.1867
       Total |  1236.52964     4,359  .283672779   Root MSE        =    .48033

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0913498   .0052374    17.44   0.000     .0810819    .1016177
     1.black |  -.1392342   .0235796    -5.90   0.000    -.1854622   -.0930062
      1.hisp |   .0160195   .0207971     0.77   0.441    -.0247535    .0567925
       exper |   .0672345   .0136948     4.91   0.000     .0403856    .0940834
     expersq |  -.0024117     .00082    -2.94   0.003    -.0040192   -.0008042
     married |   .1082529   .0156894     6.90   0.000     .0774937    .1390122
       union |   .1824613   .0171568    10.63   0.000     .1488253    .2160973
       1.d81 |     .05832   .0303536     1.92   0.055    -.0011886    .1178286
       1.d82 |   .0627744   .0332141     1.89   0.059    -.0023421    .1278909
       1.d83 |   .0620117   .0366601     1.69   0.091    -.0098608    .1338843
       1.d84 |   .0904672   .0400907     2.26   0.024      .011869    .1690654
       1.d85 |   .1092463   .0433525     2.52   0.012     .0242533    .1942393
       1.d86 |   .1419596    .046423     3.06   0.002     .0509469    .2329723
       1.d87 |   .1738334    .049433     3.52   0.000     .0769194    .2707474
       _cons |   .0920558   .0782701     1.18   0.240    -.0613935    .2455051
------------------------------------------------------------------------------


Pooled OLS shows a marriage premium of:

11.432957
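The figure above is a percentage: in a log-wage model, the exact percent change implied by a coefficient is \(100 \cdot (e^{\hat{\beta}} - 1)\). A minimal sketch of that calculation, using the marriage coefficient copied from the pooled OLS table (the scalar names are illustrative, not from the original notes):

```stata
* Exact percent change implied by a coefficient in a log-wage model.
* b_married is copied from the pooled OLS output above (hypothetical scalar name).
scalar b_married = .1082529
scalar premium = (exp(b_married) - 1)*100
* Displays roughly 11.43, in line with the figure reported above
display premium
```

Right after the regression itself, `display (exp(_b[married]) - 1)*100` should give the same number without retyping the coefficient.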

The Pooled OLS estimate is likely biased upward because of self-selection into marriage. The errors will also be positively serially correlated within individuals, so we really should cluster our standard errors by the group id (nr).
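A minimal sketch of the clustered version, assuming wagepan.dta is in the current working directory as in the setup above. The coefficients are the same as in the unclustered regression; only the standard errors change.

```stata
* Assumes wagepan.dta is in the working directory (see the cd command above).
use "wagepan.dta", clear
* Same pooled OLS specification, with standard errors clustered on the person id
reg lwage educ i.black i.hisp exper expersq married union i.d8*, vce(cluster nr)
```

Clustering on nr allows the errors for the same man to be correlated across years in an arbitrary way.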

Fixed Effects (Within)

xtreg lwage educ i.black i.hisp exper expersq married union i.d8*, fe

note: educ omitted because of collinearity
note: 1.black omitted because of collinearity
note: 1.hisp omitted because of collinearity
note: 1.d87 omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      4,360
Group variable: nr                              Number of groups  =        545

R-sq:                                           Obs per group:
     within  = 0.1806                                         min =          8
     between = 0.0005                                         avg =        8.0
     overall = 0.0635                                         max =          8

                                                F(10,3805)        =      83.85
corr(u_i, Xb)  = -0.1212                        Prob > F          =     0.0000

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |          0  (omitted)
     1.black |          0  (omitted)
      1.hisp |          0  (omitted)
       exper |   .1321464   .0098247    13.45   0.000     .1128842    .1514087
     expersq |  -.0051855   .0007044    -7.36   0.000    -.0065666   -.0038044
     married |   .0466804   .0183104     2.55   0.011     .0107811    .0825796
       union |   .0800019   .0193103     4.14   0.000     .0421423    .1178614
       1.d81 |   .0190448   .0203626     0.94   0.350    -.0208779    .0589674
       1.d82 |   -.011322   .0202275    -0.56   0.576    -.0509798    .0283359
       1.d83 |  -.0419955   .0203205    -2.07   0.039    -.0818357   -.0021553
       1.d84 |  -.0384709   .0203144    -1.89   0.058    -.0782991    .0013573
       1.d85 |  -.0432498   .0202458    -2.14   0.033    -.0829434   -.0035562
       1.d86 |  -.0273819   .0203863    -1.34   0.179    -.0673511    .0125872
       1.d87 |          0  (omitted)
       _cons |    1.02764   .0299499    34.31   0.000     .9689201    1.086359
-------------+----------------------------------------------------------------
     sigma_u |   .4009279
     sigma_e |  .35099001
         rho |  .56612236   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(544, 3805) = 9.64                   Prob > F = 0.0000

Fixed Effects model shows a marriage premium of:

4.7787041

We use estimates store to save our FE (Within) estimates so we can compare them with the Random Effects estimates in the Hausman test below.

estimates store femodel
eststo m2: quietly xtreg lwage educ i.black i.hisp exper expersq married union i.d8*, fe

Random Effects

We can use the theta option to report the lambda-hat used in the GLS transformation (see https://www.stata.com/manuals/xtxtreg.pdf).

xtreg lwage educ i.black i.hisp exper expersq married union i.d8*, re theta
eststo m3: quietly xtreg lwage educ i.black i.hisp exper expersq married union i.d8*, re

Random-effects GLS regression                   Number of obs     =      4,360
Group variable: nr                              Number of groups  =        545

R-sq:                                           Obs per group:
     within  = 0.1799                                         min =          8
     between = 0.1860                                         avg =        8.0
     overall = 0.1830                                         max =          8

                                                Wald chi2(14)     =     957.77
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
theta          = .64291089

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0918763   .0106597     8.62   0.000     .0709836    .1127689
     1.black |  -.1393767   .0477228    -2.92   0.003    -.2329117   -.0458417
      1.hisp |   .0217317   .0426063     0.51   0.610    -.0617751    .1052385
       exper |   .1057545   .0153668     6.88   0.000     .0756361    .1358729
     expersq |  -.0047239   .0006895    -6.85   0.000    -.0060753   -.0033726
     married |    .063986   .0167742     3.81   0.000     .0311091    .0968629
       union |   .1061344   .0178539     5.94   0.000     .0711415    .1411273
       1.d81 |    .040462   .0246946     1.64   0.101    -.0079385    .0888626
       1.d82 |   .0309212   .0323416     0.96   0.339    -.0324672    .0943096
       1.d83 |   .0202806    .041582     0.49   0.626    -.0612186    .1017798
       1.d84 |   .0431187   .0513163     0.84   0.401    -.0574595    .1436969
       1.d85 |   .0578155   .0612323     0.94   0.345    -.0621977    .1778286
       1.d86 |   .0919476   .0712293     1.29   0.197    -.0476592    .2315544
       1.d87 |   .1349289   .0813135     1.66   0.097    -.0244427    .2943005
       _cons |   .0235864   .1506683     0.16   0.876     -.271718    .3188907
-------------+----------------------------------------------------------------
     sigma_u |  .32460315
     sigma_e |  .35099001
         rho |  .46100216   (fraction of variance due to u_i)
------------------------------------------------------------------------------


The Random Effects model shows a marriage premium of:

6.6077497

Our lambda-hat is 0.643. A lambda-hat of 0 corresponds to Pooled OLS and a lambda-hat of 1 corresponds to FE, so the RE estimator here is closer to the FE estimator than to the Pooled OLS estimator.
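For a balanced panel with \(T\) periods, the GLS weight is \(\hat{\lambda} = 1 - \sqrt{\hat{\sigma}_e^2/(\hat{\sigma}_e^2 + T\hat{\sigma}_u^2)}\). As a rough check, the reported theta can be reproduced from the sigma_u and sigma_e values in the Random Effects output above; the scalar names below are illustrative.

```stata
* Values copied from the Random Effects output above (hypothetical scalar names)
scalar s_u = .32460315
scalar s_e = .35099001
* Number of years per person in the balanced panel
scalar nyears = 8
scalar theta = 1 - sqrt(s_e^2/(s_e^2 + nyears*s_u^2))
* Displays roughly .6429, in line with the theta line in the output
display theta
```

A larger sigma_u relative to sigma_e, or a longer panel, pushes theta toward 1 and the RE estimates toward the FE estimates.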

Hausman Test

We will use the hausman command to test the main random effects assumption \(Cov(x_{i,j},a_i)=0\). We give hausman two sets of estimates: the consistent estimator first (the fixed effects estimates we stored as femodel) and the efficient estimator second (our latest estimates, from the random effects model, referenced with “.”). We will also use the option sigmamore, which bases both covariance matrices on the disturbance variance estimated from the efficient (random effects) model.

hausman femodel ., sigmamore

Note: the rank of the differenced variance matrix (5) does not equal the number of
        coefficients being tested (10); be sure this is what you expect, or there may be
        problems computing the test.  Examine the output of your estimators for anything
        unexpected and possibly consider scaling your variables so that the coefficients are
        on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |    femodel         .          Difference          S.E.
-------------+----------------------------------------------------------------
       exper |    .1321464     .1057545        .0263919               .
     expersq |   -.0051855    -.0047239       -.0004616        .0001533
     married |    .0466804      .063986       -.0173057        .0074632
       union |    .0800019     .1061344       -.0261326        .0074922
       1.d81 |    .0190448      .040462       -.0214172               .
       1.d82 |    -.011322     .0309212       -.0422431               .
       1.d83 |   -.0419955     .0202806       -.0622762               .
       1.d84 |   -.0384709     .0431187       -.0815896               .
       1.d85 |   -.0432498     .0578155       -.1010653               .
       1.d86 |   -.0273819     .0919476       -.1193295               .
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       26.22
                Prob>chi2 =      0.0001
                (V_b-V_B is not positive definite)

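We reject the null hypothesis that the differences between the FE and RE coefficients are not systematic (chi2(5) = 26.22, p = 0.0001). This indicates that \(a_i\) is correlated with the explanatory variables, so the random effects model is likely not appropriate.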
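As a small sanity check, the p-value can be recomputed from the reported test statistic and its degrees of freedom. This sketch uses the rounded chi2 value from the output, so it will only agree approximately with the printed Prob>chi2.

```stata
* chi2tail(df, x) is the upper-tail probability of a chi-squared(df) variable at x.
* 5 and 26.22 are copied from the hausman output above.
scalar p_hausman = chi2tail(5, 26.22)
display p_hausman
```

Immediately after running hausman, the same quantities should also be available as r(chi2), r(df), and r(p).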

esttab m1 m2 m3, drop(0.black 0.hisp 0.d8*) mtitles("Pooled OLS" "Within Model" "RE Model")

display (exp(.108)-1)*100
display (exp(.0467)-1)*100
display (exp(.064)-1)*100

                      (1)             (2)             (3)   
               Pooled OLS    Within Model        RE Model   
------------------------------------------------------------
educ               0.0913***            0          0.0919***
                  (17.44)             (.)          (8.62)   

1.black            -0.139***            0          -0.139** 
                  (-5.90)             (.)         (-2.92)   

1.hisp             0.0160               0          0.0217   
                   (0.77)             (.)          (0.51)   

exper              0.0672***        0.132***        0.106***
                   (4.91)         (13.45)          (6.88)   

expersq          -0.00241**      -0.00519***     -0.00472***
                  (-2.94)         (-7.36)         (-6.85)   

married             0.108***       0.0467*         0.0640***
                   (6.90)          (2.55)          (3.81)   

union               0.182***       0.0800***        0.106***
                  (10.63)          (4.14)          (5.94)   

1.d81              0.0583          0.0190          0.0405   
                   (1.92)          (0.94)          (1.64)   

1.d82              0.0628         -0.0113          0.0309   
                   (1.89)         (-0.56)          (0.96)   

1.d83              0.0620         -0.0420*         0.0203   
                   (1.69)         (-2.07)          (0.49)   

1.d84              0.0905*        -0.0385          0.0431   
                   (2.26)         (-1.89)          (0.84)   

1.d85               0.109*        -0.0432*         0.0578   
                   (2.52)         (-2.14)          (0.94)   

1.d86               0.142**       -0.0274          0.0919   
                   (3.06)         (-1.34)          (1.29)   

1.d87               0.174***            0           0.135   
                   (3.52)             (.)          (1.66)   

_cons              0.0921           1.028***       0.0236   
                   (1.18)         (34.31)          (0.16)   
------------------------------------------------------------
N                    4360            4360            4360   
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

11.404775

4.780762

6.6092399

We can see that the marriage premium falls from 11.4% in Pooled OLS to 4.8% in Fixed Effects. Had we not rejected the RE assumption, the RE estimate would have been 6.6%.

The difference between the 11.4% Pooled OLS estimate and the 4.8% Within estimate might come from self-selection into marriage (these men would have made more money even if they weren’t married) and from employers paying married men more if marriage is a sign of stability. However, we cannot distinguish between these two hypotheses with this research design.

Plot the Coefficients

quietly reg lwage educ i.black i.hisp exper expersq married union i.d8*
estimates store pooled
quietly xtreg lwage educ i.black i.hisp exper expersq married union i.d8*, fe
estimates store fe
quietly xtreg lwage educ i.black i.hisp exper expersq married union i.d8*, re theta
estimates store re
coefplot ///
    (pooled, label("{bf:Pooled OLS}") mcolor(midblue) mlcolor(cyan) ///
      ciopts(lcolor(magenta midblue))) /// options for first group
    (fe, label("{bf: Within}") mcolor(green) mlcolor(lime) ///
      ciopts(lcolor(lime green))) /// options for second group
    (re, label("{bf: Random Effects}") mcolor(yellow) mlcolor(gold) ///
      ciopts(lcolor(gold yellow))), /// options for third group
      title("Returns to Marriage for Men") ///
      keep(married) ///
      xline(0, lcolor(red) lpattern(dash) lwidth(medium)) scheme(jet_white) ///
      xtitle("{bf: Coefficients}") ///
      graphregion(margin(small)) ///
      coeflabels(married="Married") ///
      msize(large) mcolor(%85) mlwidth(medium) msymbol(circle) /// marker options
      levels(95 90) ciopts(lwidth(medthick thick) recast(rspike rcap)) ///ci options for all groups
      legend(ring(1) col(1) pos(3) size(medsmall))
    graph export "/Users/Sam/Desktop/Econ 645/Stata/week4_married_returns.png", replace

Plot the Coefficients of Interest

3.2 Airline prices and market concentration

We will assess the effect of airline market concentration on airfares. Our model:

\[ ln(fare_{i,t})=\beta_0 + \beta_1 concen_{i,t} + \beta_2 ln(dist_{i,t}) + \beta_3 (ln(dist_{i,t}))^2 + a_i + a_t + \varepsilon_{i,t} \]

Estimate the Pooled OLS with time binaries

cd "/Users/Sam/Desktop/Econ 645/Data/Wooldridge"
use "airfare.dta", clear
reg lfare concen ldist ldistsq i.y99 i.y00

/Users/Sam/Desktop/Econ 645/Data/Wooldridge

      Source |       SS           df       MS      Number of obs   =     4,596
-------------+----------------------------------   F(5, 4590)      =    627.18
       Model |  355.197587         5  71.0395174   Prob > F        =    0.0000
    Residual |  519.896787     4,590  .113267274   R-squared       =    0.4059
-------------+----------------------------------   Adj R-squared   =    0.4052
       Total |  875.094374     4,595  .190444913   Root MSE        =    .33655

------------------------------------------------------------------------------
       lfare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      concen |   .3609856   .0300677    12.01   0.000     .3020384    .4199327
       ldist |  -.9018838   .1282905    -7.03   0.000    -1.153395   -.6503727
     ldistsq |   .1030514   .0097268    10.59   0.000     .0839822    .1221207
       1.y99 |   .0272979   .0121656     2.24   0.025     .0034475    .0511483
       1.y00 |   .0893211    .012169     7.34   0.000     .0654641    .1131782
       _cons |   6.219743   .4206248    14.79   0.000     5.395116     7.04437
------------------------------------------------------------------------------

What is the associated change in airfare with a 10-percentage point increase in market concentration?

sum concen

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      concen |      4,596    .6101149     .196435      .1605          1
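One way to answer the question above: concen is a share (it runs from about 0.16 to 1 in the summary), so a 10-percentage-point increase is a change of 0.10, roughly half a standard deviation. With a log dependent variable, the approximate percent change in fare is \(100 \cdot \hat{\beta}_1 \cdot 0.10\). A minimal sketch using the pooled OLS coefficient from the table above (scalar names are illustrative):

```stata
* Pooled OLS coefficient on concen, copied from the output above
scalar b_concen = .3609856
* Approximate percent change in fare for a 0.10 increase in concen
scalar approx_pct = 100*b_concen*0.10
* Exact percent change implied by the log specification
scalar exact_pct = 100*(exp(b_concen*0.10) - 1)
display approx_pct
display exact_pct
```

Both versions point to a fare increase of a little over 3.6 percent under pooled OLS.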

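What does the quadratic in log distance mean? With a negative coefficient on ldist and a positive coefficient on ldistsq, the relationship between ln(fare) and ln(dist) is U-shaped: the elasticity of fare with respect to distance, \(\beta_2 + 2\beta_3 ln(dist)\), starts out negative and increases with distance. Setting this elasticity to zero gives the turning point, \(ln(dist) = -\beta_2/(2\beta_3)\).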
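The turning point of the quadratic is where the derivative of ln(fare) with respect to ln(dist) equals zero, that is, where \(ln(dist) = -\hat{\beta}_2/(2\hat{\beta}_3)\). A minimal sketch using the pooled OLS coefficients from the table above (scalar names are illustrative):

```stata
* Pooled OLS coefficients on ldist and ldistsq, copied from the output above
scalar b_ldist = -.9018838
scalar b_ldistsq = .1030514
* Log distance at which the fare elasticity with respect to distance is zero
scalar ldist_star = -b_ldist/(2*b_ldistsq)
* Convert back to the units of dist
scalar dist_star = exp(ldist_star)
display ldist_star
display dist_star
```

This gives a turning point of roughly 79.5 in the units of dist. Comparing it with the output of `sum dist` shows whether it falls inside the range of the data. If every route in the sample is longer than this, the estimated elasticity is positive and increasing over the whole sample.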

Set Panel

xtset id year

       panel variable:  id (strongly balanced)
        time variable:  year, 1997 to 2000
                delta:  1 unit

Estimate a Pooled OLS

est clear
eststo m1: reg lfare concen ldist ldistsq i.y99 i.y00

      Source |       SS           df       MS      Number of obs   =     4,596
-------------+----------------------------------   F(5, 4590)      =    627.18
       Model |  355.197587         5  71.0395174   Prob > F        =    0.0000
    Residual |  519.896787     4,590  .113267274   R-squared       =    0.4059
-------------+----------------------------------   Adj R-squared   =    0.4052
       Total |  875.094374     4,595  .190444913   Root MSE        =    .33655

------------------------------------------------------------------------------
       lfare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      concen |   .3609856   .0300677    12.01   0.000     .3020384    .4199327
       ldist |  -.9018838   .1282905    -7.03   0.000    -1.153395   -.6503727
     ldistsq |   .1030514   .0097268    10.59   0.000     .0839822    .1221207
       1.y99 |   .0272979   .0121656     2.24   0.025     .0034475    .0511483
       1.y00 |   .0893211    .012169     7.34   0.000     .0654641    .1131782
       _cons |   6.219743   .4206248    14.79   0.000     5.395116     7.04437
------------------------------------------------------------------------------

Estimate a Random Effects model

eststo m2: xtreg lfare concen ldist ldistsq i.y99 i.y00, re theta

Random-effects GLS regression                   Number of obs     =      4,596
Group variable: id                              Number of groups  =      1,149

R-sq:                                           Obs per group:
     within  = 0.1282                                         min =          4
     between = 0.4179                                         avg =        4.0
     overall = 0.4030                                         max =          4

                                                Wald chi2(5)      =    1331.49
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
theta          = .83489895

------------------------------------------------------------------------------
       lfare |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      concen |    .216343   .0265686     8.14   0.000     .1642694    .2684166
       ldist |  -.8544998   .2464622    -3.47   0.001    -1.337557   -.3714428
     ldistsq |   .0977308   .0186343     5.24   0.000     .0612083    .1342532
       1.y99 |   .0255419   .0038794     6.58   0.000     .0179385    .0331454
       1.y00 |   .0870883   .0038876    22.40   0.000     .0794686    .0947079
       _cons |   6.232589   .8098919     7.70   0.000      4.64523    7.819948
-------------+----------------------------------------------------------------
     sigma_u |  .31930578
     sigma_e |   .1069025
         rho |  .89920879   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Estimate a FE model

eststo m3: xtreg lfare concen ldist ldistsq i.y99 i.y00, fe

note: ldist omitted because of collinearity
note: ldistsq omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      4,596
Group variable: id                              Number of groups  =      1,149

R-sq:                                           Obs per group:
     within  = 0.1286                                         min =          4
     between = 0.0576                                         avg =        4.0
     overall = 0.0102                                         max =          4

                                                F(3,3444)         =     169.47
corr(u_i, Xb)  = -0.2143                        Prob > F          =     0.0000

------------------------------------------------------------------------------
       lfare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      concen |    .177761   .0294665     6.03   0.000     .1199873    .2355346
       ldist |          0  (omitted)
     ldistsq |          0  (omitted)
       1.y99 |   .0250736   .0038791     6.46   0.000      .017468    .0326791
       1.y00 |   .0864927   .0038892    22.24   0.000     .0788673    .0941181
       _cons |   4.959254   .0183174   270.74   0.000      4.92334    4.995168
-------------+----------------------------------------------------------------
     sigma_u |  .43441394
     sigma_e |   .1069025
         rho |  .94290028   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(1148, 3444) = 60.09                 Prob > F = 0.0000

Compare

esttab m1 m2 m3, mtitle("OLS" "RE" "FE")

                      (1)             (2)             (3)   
                      OLS              RE              FE   
------------------------------------------------------------
concen              0.361***        0.216***        0.178***
                  (12.01)          (8.14)          (6.03)   

ldist              -0.902***       -0.854***            0   
                  (-7.03)         (-3.47)             (.)   

ldistsq             0.103***       0.0977***            0   
                  (10.59)          (5.24)             (.)   

1.y99              0.0273*         0.0255***       0.0251***
                   (2.24)          (6.58)          (6.46)   

1.y00              0.0893***       0.0871***       0.0865***
                   (7.34)         (22.40)         (22.24)   

_cons               6.220***        6.233***        4.959***
                  (14.79)          (7.70)        (270.74)   
------------------------------------------------------------
N                    4596            4596            4596   
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
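The Hausman test from Section 3.1 could be applied to the airfare models in the same way. The sketch below assumes airfare.dta is in the working directory; the stored names fe_air and re_air are hypothetical, and no result is shown because the test was not run in the notes above.

```stata
* Assumes airfare.dta is in the working directory (see the cd command above).
use "airfare.dta", clear
xtset id year
* Store the consistent (FE) and efficient (RE) estimates under illustrative names
quietly xtreg lfare concen ldist ldistsq i.y99 i.y00, fe
estimates store fe_air
quietly xtreg lfare concen ldist ldistsq i.y99 i.y00, re
estimates store re_air
* Consistent estimator first, efficient estimator second
hausman fe_air re_air, sigmamore
```

A small p-value would, as in the wage example, favor the FE estimate of the concentration effect over the RE estimate.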