Chapter 2 Poisson Regression

Lesson: Use a count model such as Poisson regression when the outcome of interest is a nonnegative count. The coefficients are changes in the log of the expected count, so we need to transform them for interpretation.

We want to model the number of times a man was arrested in 1986 (narr86). Of the 2,725 men in the sample, 1,970 have zero arrests and only eight have more than 5. OLS ignores that the outcome is a nonnegative count concentrated at zero and can produce negative fitted values, whereas a Poisson regression models the count directly.
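Before fitting anything, it can help to look at the distribution of the outcome. A minimal sketch, assuming crime1.dta sits in the current working directory (the path is an assumption; adjust it to your setup):

```stata
* Load the data (assumes the file is in the working directory)
use "crime1.dta", clear

* Frequency table of the arrest count
tabulate narr86

* Count the zeros and the observations above 5 described in the text
count if narr86 == 0
count if narr86 > 5

* Mean and variance of the count (detail adds percentiles and the variance)
summarize narr86, detail
```

The two count commands are a quick way to check the 1,970 zeros and the eight observations above 5 cited above.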

We have the following explanatory variables:

1. pcnv - proportion of prior arrests that led to a conviction
2. avgsen - the average sentence served from prior convictions (in months)
3. tottime - months spent in prison since age 18 prior to 1986
4. ptime86 - months spent in prison in 1986
5. qemp86 - number of quarters that the person was legally employed in 1986
6. inc86 - legal income in 1986 (in hundreds of dollars)
7. hispan - Hispanic/Latino binary
8. black - Black binary
9. born60 - binary equal to 1 if born in 1960

\[ narr86_{t}=\alpha + \beta_1 pcnv_t + \beta_2 avgsen_t + \beta_3 tottime_t + \beta_4 ptime86_t + \beta_5 qemp86_t + \beta_6 inc86_t + x'_t\delta + \varepsilon_t \]

OLS

cd "/Users/Sam/Desktop/Econ 645/Data/Wooldridge"
use "crime1.dta", clear
reg narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60 

      Source |       SS           df       MS      Number of obs   =     2,725
-------------+----------------------------------   F(9, 2715)      =     23.57
       Model |  145.702778         9  16.1891976   Prob > F        =    0.0000
    Residual |  1864.64438     2,715  .686793509   R-squared       =    0.0725
-------------+----------------------------------   Adj R-squared   =    0.0694
       Total |  2010.34716     2,724  .738012906   Root MSE        =    .82873

------------------------------------------------------------------------------
      narr86 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |   -.131886   .0404037    -3.26   0.001    -.2111112   -.0526609
      avgsen |  -.0113316   .0122413    -0.93   0.355    -.0353348    .0126717
     tottime |   .0120693   .0094364     1.28   0.201     -.006434    .0305725
     ptime86 |  -.0408735    .008813    -4.64   0.000    -.0581544   -.0235925
      qemp86 |  -.0513099   .0144862    -3.54   0.000     -.079715   -.0229047
       inc86 |  -.0014617    .000343    -4.26   0.000    -.0021343   -.0007891
       black |   .3270097   .0454264     7.20   0.000     .2379359    .4160835
      hispan |   .1938094   .0397156     4.88   0.000     .1159335    .2716853
      born60 |   -.022465   .0332945    -0.67   0.500    -.0877502    .0428202
       _cons |    .576566   .0378945    15.22   0.000      .502261    .6508711
------------------------------------------------------------------------------

Poisson

We use the poisson command to estimate a Poisson regression. Each coefficient is the change in the log of the expected count for a one-unit change in the regressor, so we need a transformation to interpret the results.
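To make the needed transformation explicit, here is the standard Poisson conditional mean written out. The shorthand \(x\beta\), collecting the intercept and all slope terms, is notation introduced here for illustration.

\[ E(narr86 \mid x) = \exp(x\beta) \quad \Longleftrightarrow \quad \log E(narr86 \mid x) = x\beta \]

\[ \%\Delta E(narr86 \mid x) = 100\left[\exp(\beta_j \Delta x_j) - 1\right] \approx 100\,\beta_j \Delta x_j \quad \text{when } \beta_j \Delta x_j \text{ is small} \]

The exact expression on the left is the one to use for large changes, such as switching a binary variable from 0 to 1.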

poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60 
Iteration 0:   log likelihood = -2249.0104  
Iteration 1:   log likelihood = -2248.7614  
Iteration 2:   log likelihood = -2248.7611  
Iteration 3:   log likelihood = -2248.7611  

Poisson regression                              Number of obs     =      2,725
                                                LR chi2(9)        =     386.32
                                                Prob > chi2       =     0.0000
Log likelihood = -2248.7611                     Pseudo R2         =     0.0791

------------------------------------------------------------------------------
      narr86 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |  -.4015713   .0849712    -4.73   0.000    -.5681117   -.2350308
      avgsen |  -.0237723    .019946    -1.19   0.233    -.0628658    .0153212
     tottime |   .0244904   .0147504     1.66   0.097    -.0044199    .0534006
     ptime86 |  -.0985584   .0206946    -4.76   0.000    -.1391192   -.0579977
      qemp86 |  -.0380187   .0290242    -1.31   0.190    -.0949051    .0188677
       inc86 |  -.0080807    .001041    -7.76   0.000     -.010121   -.0060404
       black |   .6608376   .0738342     8.95   0.000     .5161252      .80555
      hispan |   .4998133   .0739267     6.76   0.000     .3549196     .644707
      born60 |  -.0510286   .0640518    -0.80   0.426    -.1765678    .0745106
       _cons |  -.5995888   .0672501    -8.92   0.000    -.7313966    -.467781
------------------------------------------------------------------------------

Test \(E(y|x)=Var(y|x)\)

estat gof
         Deviance goodness-of-fit =  2822.185
         Prob > chi2(2715)        =    0.0742

         Pearson goodness-of-fit  =   4118.08
         Prob > chi2(2715)        =    0.0000
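The Pearson statistic rejects the Poisson fit (p = 0.0000), while the deviance statistic does not reject at the 5% level (p = 0.0742). One common follow-up, which is not part of the original notes, is to gauge the amount of overdispersion with the Pearson statistic divided by its degrees of freedom and to refit with robust standard errors. A minimal sketch using the numbers reported above:

```stata
* Pearson dispersion estimate: values above 1 point to overdispersion
* (variance larger than the mean); here it is roughly 1.5
display 4118.08/2715

* Its square root is the factor by which the usual Poisson
* standard errors would be scaled up under this correction
display sqrt(4118.08/2715)

* Same point estimates, with standard errors that do not rely on
* the assumption that the variance equals the mean
poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60, vce(robust)
```

The coefficients are unchanged by vce(robust); only the standard errors, z statistics, and confidence intervals differ.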

Factor Change Interpretation

For a one-percentage-point change, \(\Delta pcnv = .01\), so the percentage change in the expected count is

display (exp(-.402*.01)-1)*100
-.40119306

A 1 percentage point increase in the proportion of prior arrests that led to a conviction decreases the expected number of arrests by about 0.4%.

A discrete change in a binary variable: the coefficient on black

display (exp(0.6608)-1)*100 
93.634079

This means that the expected number of arrests for a Black man is about 93.6% higher than the expected number of arrests for a White man with the same values of the other variables.
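The same factor-change arithmetic can be run on the stored estimates instead of hand-typed, rounded coefficients. A minimal sketch, assuming the poisson model above is the most recent estimation command, so that _b[] holds its coefficients:

```stata
* Percentage change in expected arrests for a 1- and a 10-percentage-point
* increase in pcnv
display (exp(_b[pcnv]*.01) - 1)*100
display (exp(_b[pcnv]*.10) - 1)*100

* Percentage difference in expected arrests when black switches from 0 to 1
display (exp(_b[black]) - 1)*100
```

Using _b[] avoids rounding the coefficient by hand, so the results can differ slightly in the later decimals from the calculations shown above.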

We can also use the listcoef command, which reports \(e^{\beta}\) by default or \((e^{\beta}-1)*100\) when the percent option is specified.

ssc install listcoef
* listcoef is part of the SPost package; if the line above fails, search listcoef locates it
listcoef, help
listcoef, percent help

Marginal Effects

Interpretation: The marginal effect of a continuous variable depends on the expected value of \(y\) given \(x\), so we must either evaluate it at specified values of x (for example, their means) or compute average marginal effects.
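To see why the marginal effect depends on \(E(y|x)\), differentiate the Poisson conditional mean. The shorthand \(x\beta\) for the full linear index is notation introduced here, and the formula applies to regressors treated as continuous.

\[ \frac{\partial E(y|x)}{\partial x_j} = \beta_j \exp(x\beta) = \beta_j \, E(y|x) \]

\[ AME_j = \hat{\beta}_j \cdot \frac{1}{n} \sum_{i=1}^{n} \exp(x_i \hat{\beta}) \]

When the model includes a constant, the Poisson fitted values average to the sample mean of \(y\), so each average marginal effect is the coefficient times that mean. The reported estimates are consistent with this: for pcnv, \(-.1623969 / -.4015713 \approx 0.404\), and the same ratio holds for the other regressors.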

Note: Be careful with the units of pcnv. It is a proportion, so a change of 1 is a change of 100 percentage points. We want the effect of a 1 or 10 percentage point change, which is the reported dy/dx multiplied by .01 or .10.

margins, dydx(*)
Average marginal effects                        Number of obs     =      2,725
Model VCE    : OIM

Expression   : Predicted number of events, predict()
dy/dx w.r.t. : pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |  -.1623969   .0347091    -4.68   0.000    -.2304256   -.0943682
      avgsen |  -.0096136   .0080714    -1.19   0.234    -.0254333    .0062061
     tottime |    .009904   .0059726     1.66   0.097     -.001802      .02161
     ptime86 |  -.0398574   .0084547    -4.71   0.000    -.0564283   -.0232865
      qemp86 |  -.0153749   .0117466    -1.31   0.191    -.0383979    .0076481
       inc86 |  -.0032679   .0004323    -7.56   0.000    -.0041152   -.0024205
       black |   .2672451   .0309251     8.64   0.000     .2066331    .3278571
      hispan |   .2021263     .03051     6.62   0.000     .1423279    .2619248
      born60 |  -.0206361   .0259102    -0.80   0.426    -.0714193     .030147
------------------------------------------------------------------------------
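The table above can be rescaled and varied in a few ways. A minimal sketch, assuming the poisson estimates are still in memory; the factor-variable refit at the end is an alternative specification, not what the notes originally ran:

```stata
* Change in expected arrests for a 1- and a 10-percentage-point
* increase in pcnv, using the average marginal effect reported above
display -.1623969*.01
display -.1623969*.10

* Marginal effects evaluated at the sample means of the regressors
* instead of averaged over the sample
margins, dydx(*) atmeans

* Declaring the binaries as factor variables makes margins report
* discrete 0-to-1 changes for them rather than derivatives
poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 i.black i.hispan i.born60
margins, dydx(black hispan born60)
```

Without the i. prefix, margins treats black, hispan, and born60 as continuous, which is how the table above was computed.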

Incidence-Rate Ratios

poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60, irr
Iteration 0:   log likelihood = -2249.0104  
Iteration 1:   log likelihood = -2248.7614  
Iteration 2:   log likelihood = -2248.7611  
Iteration 3:   log likelihood = -2248.7611  

Poisson regression                              Number of obs     =      2,725
                                                LR chi2(9)        =     386.32
                                                Prob > chi2       =     0.0000
Log likelihood = -2248.7611                     Pseudo R2         =     0.0791

------------------------------------------------------------------------------
      narr86 |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        pcnv |   .6692676   .0568685    -4.73   0.000     .5665943    .7905465
      avgsen |    .976508   .0194775    -1.19   0.233     .9390695    1.015439
     tottime |   1.024793   .0151161     1.66   0.097     .9955899    1.054852
     ptime86 |   .9061427   .0187523    -4.76   0.000     .8701243    .9436521
      qemp86 |   .9626949   .0279415    -1.31   0.190     .9094592    1.019047
       inc86 |   .9919519   .0010326    -7.76   0.000       .98993    .9939778
       black |   1.936414   .1429736     8.95   0.000     1.675523    2.237927
      hispan |   1.648413   .1218618     6.76   0.000     1.426066    1.905429
      born60 |   .9502515   .0608653    -0.80   0.426     .8381419    1.077357
       _cons |   .5490374   .0369228    -8.92   0.000     .4812364    .6263907
------------------------------------------------------------------------------
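Each entry in the IRR column is the exponentiated coefficient from the earlier Poisson table, so the two tables describe the same model. A minimal sketch of how to check this by hand, assuming the poisson estimates are still in memory:

```stata
* Exponentiate the reported coefficients; these should reproduce
* the IRR entries for pcnv and black
display exp(-.4015713)
display exp(.6608376)

* The stored coefficients stay on the log scale even after
* displaying IRRs, so exponentiate them explicitly
display exp(_b[pcnv])

* Redisplay the last poisson results as IRRs without re-estimating
poisson, irr
```

An IRR below 1 corresponds to a negative coefficient, and (IRR - 1)*100 gives the percentage change used in the factor change interpretation.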