Chapter 2 Poisson Regression
Lesson: We should use a count model when the outcome of interest is a count, such as one that follows a Poisson distribution. The coefficients are changes in the log of the expected count, so we need to transform them for interpretation.
We want to model the number of times a man was arrested in 1986. There are 1,970 zeros out of 2,725 men in the sample, and only eight observations are greater than 5. OLS does not account for the count nature of the outcome (non-negative integers with many zeros), but a Poisson regression does.
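A quick way to see why a count model is appropriate is to tabulate the outcome. The sketch below assumes crime1.dta sits in the current working directory; the tabulate, count, and summarize calls are illustrative additions, not part of the original do-file.

```stata
* Assumption: crime1.dta is in the current working directory
use "crime1.dta", clear

* Frequency table of the number of arrests in 1986
tabulate narr86

* Number of men with zero arrests and with more than five arrests
count if narr86 == 0
count if narr86 > 5 & !missing(narr86)

* Mean and variance of the outcome (the detail option reports both)
summarize narr86, detail
```

Comparing the mean and variance reported by summarize gives a first, unconditional look at whether the Poisson equal-variance property is plausible.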
The explanatory variables are:
1. pcnv - proportion of prior arrests that led to a conviction
2. avgsen - average sentence served from prior convictions (in months)
3. tottime - months spent in prison since age 18 prior to 1986
4. ptime86 - months spent in prison in 1986
5. qemp86 - number of quarters that the person was legally employed in 1986
6. inc86 - income in 1986
7. hispan - Hispanic/Latino binary
8. black - Black binary
9. born60 - binary for being born in 1960
\[ narr86_{t}=\alpha + \beta_1 pcnv_t + \beta_2 avgsen_t + \beta_3 tottime_t + \beta_4 ptime86_t + \beta_5 qemp86_t + \beta_6 inc86_t + x'_t\delta + \varepsilon_t \]
OLS
cd "/Users/Sam/Desktop/Econ 645/Data/Wooldridge"
use "crime1.dta", clear
reg narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60
Source | SS df MS Number of obs = 2,725
-------------+---------------------------------- F(9, 2715) = 23.57
Model | 145.702778 9 16.1891976 Prob > F = 0.0000
Residual | 1864.64438 2,715 .686793509 R-squared = 0.0725
-------------+---------------------------------- Adj R-squared = 0.0694
Total | 2010.34716 2,724 .738012906 Root MSE = .82873
------------------------------------------------------------------------------
narr86 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.131886 .0404037 -3.26 0.001 -.2111112 -.0526609
avgsen | -.0113316 .0122413 -0.93 0.355 -.0353348 .0126717
tottime | .0120693 .0094364 1.28 0.201 -.006434 .0305725
ptime86 | -.0408735 .008813 -4.64 0.000 -.0581544 -.0235925
qemp86 | -.0513099 .0144862 -3.54 0.000 -.079715 -.0229047
inc86 | -.0014617 .000343 -4.26 0.000 -.0021343 -.0007891
black | .3270097 .0454264 7.20 0.000 .2379359 .4160835
hispan | .1938094 .0397156 4.88 0.000 .1159335 .2716853
born60 | -.022465 .0332945 -0.67 0.500 -.0877502 .0428202
_cons | .576566 .0378945 15.22 0.000 .502261 .6508711
------------------------------------------------------------------------------
Poisson
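We use the poisson command to estimate a Poisson regression. The model specifies the conditional mean as \(E(y|x)=\exp(x'\beta)\), so each coefficient is the change in the log of the expected count for a one-unit change in the regressor. We will need a transformation to interpret the results.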
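The estimation output below is consistent with a call such as the following. This is a minimal sketch, assuming crime1.dta is still in memory from the use command in the OLS section and that the regressor list matches the OLS specification.

```stata
* Assumption: crime1.dta is already in memory
* Poisson regression of 1986 arrests on the same regressors used for OLS
poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60
```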
Iteration 0: log likelihood = -2249.0104
Iteration 1: log likelihood = -2248.7614
Iteration 2: log likelihood = -2248.7611
Iteration 3: log likelihood = -2248.7611
Poisson regression Number of obs = 2,725
LR chi2(9) = 386.32
Prob > chi2 = 0.0000
Log likelihood = -2248.7611 Pseudo R2 = 0.0791
------------------------------------------------------------------------------
narr86 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.4015713 .0849712 -4.73 0.000 -.5681117 -.2350308
avgsen | -.0237723 .019946 -1.19 0.233 -.0628658 .0153212
tottime | .0244904 .0147504 1.66 0.097 -.0044199 .0534006
ptime86 | -.0985584 .0206946 -4.76 0.000 -.1391192 -.0579977
qemp86 | -.0380187 .0290242 -1.31 0.190 -.0949051 .0188677
inc86 | -.0080807 .001041 -7.76 0.000 -.010121 -.0060404
black | .6608376 .0738342 8.95 0.000 .5161252 .80555
hispan | .4998133 .0739267 6.76 0.000 .3549196 .644707
born60 | -.0510286 .0640518 -0.80 0.426 -.1765678 .0745106
_cons | -.5995888 .0672501 -8.92 0.000 -.7313966 -.467781
------------------------------------------------------------------------------
Test \(E(y|x)=Var(y|x)\)
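The goodness-of-fit statistics that follow look like the output of Stata's estat gof postestimation command. A minimal sketch, assuming crime1.dta is in memory and the Poisson model is re-estimated first:

```stata
* Re-estimate quietly so that the postestimation command has results to work with
quietly poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60

* Deviance and Pearson goodness-of-fit tests
estat gof
```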
Deviance goodness-of-fit = 2822.185
Prob > chi2(2715) = 0.0742
Pearson goodness-of-fit = 4118.08
Prob > chi2(2715) = 0.0000
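One hedged way to read these statistics: the Pearson test rejects the Poisson variance assumption, while the deviance test does not at the 5% level, so the evidence on \(E(y|x)=Var(y|x)\) is mixed. A common follow-up, sketched here as an assumption rather than something the original output shows, is to compute the Pearson dispersion ratio by hand and re-estimate with robust standard errors, which leaves the coefficients unchanged.

```stata
* Pearson chi-squared divided by its degrees of freedom, using the values
* reported above; a ratio above 1 points toward overdispersion
display 4118.08/2715

* Same point estimates, with standard errors that do not rely on
* the conditional variance equaling the conditional mean
poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60, vce(robust)
```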
Factor Change Interpretation
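For a change of \(\Delta pcnv = .01\), the percentage change in the expected count is \((e^{\beta_1 \Delta pcnv}-1)\times 100\):
-.40119306
A 1 percentage point increase in the proportion of prior arrests that led to a conviction decreases the expected number of arrests by about 0.4%.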
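The percentage above can be reproduced with display. The printed value appears to come from the coefficient rounded to -.402; using the stored coefficient _b[pcnv] instead gives roughly -.4008. The sketch assumes crime1.dta is in memory.

```stata
quietly poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60

* Percent change in expected arrests for a .01 increase in pcnv,
* using the rounded coefficient
display (exp(-.402*.01) - 1)*100

* Same calculation with the full-precision stored coefficient
display (exp(_b[pcnv]*.01) - 1)*100
```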
Discrete change in a binary regressor: for the coefficient on black, compute \((e^{\beta}-1)\times 100\):
93.634079
This means that the expected number of arrests for a Black man is about 93.6% higher than the expected number of arrests for a White man, holding the other regressors fixed.
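The figure for black can be reproduced the same way. The printed value appears to use the coefficient rounded to .6608; the second display uses the stored coefficient. The sketch assumes crime1.dta is in memory.

```stata
quietly poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60

* Percent difference in expected arrests, using the rounded coefficient
display (exp(.6608) - 1)*100

* Same calculation with the full-precision stored coefficient
display (exp(_b[black]) - 1)*100
```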
We can also use the listcoef command, which reports \(e^{\beta}\) by default or \((e^{\beta}-1)\times 100\) when the percent option is specified.
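A minimal sketch of both forms follows. It assumes that listcoef, which is part of the user-written SPost package by Long and Freese and does not ship with Stata, has already been installed, and that crime1.dta is in memory.

```stata
* Assumption: the user-written listcoef command is installed
quietly poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60

* Factor changes in the expected count, exp(b)
listcoef

* Percent changes, (exp(b) - 1)*100, with a legend describing each column
listcoef, percent help
```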
Marginal Effects
Interpretation: The marginal change in a continuous variable depends on the expected value of \(y\) given \(x\), so we have to calculate marginal effects either by setting the regressors at their means or by averaging the effects over the sample (average marginal effects).
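To make this dependence explicit, here is a short derivation under the exponential-mean assumption \(E(y|x)=\exp(x'\beta)\). The notation \(\hat\beta_j\), \(n\), and \(\bar{y}\) is introduced here for illustration.

\[ \frac{\partial E(y|x)}{\partial x_j} = \beta_j \exp(x'\beta) = \beta_j E(y|x) \]

\[ \widehat{AME}_j = \hat\beta_j \cdot \frac{1}{n}\sum_{i=1}^{n} \exp(x_i'\hat\beta) = \hat\beta_j \bar{y} \]

The last equality holds because a Poisson model with a constant term reproduces the sample mean of the outcome. The average marginal effect is therefore the coefficient scaled by the average number of arrests, whereas the marginal effect at the means scales by \(\exp(\bar{x}'\hat\beta)\) instead. For pcnv, the ratio of the average marginal effect to the Poisson coefficient reported in this chapter is \(-.1623969 / -.4015713 \approx .404\).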
Note: be careful with the units of pcnv. It is a proportion, so a one-unit change is a change of 100 percentage points. We want a change of 1 or 10 percentage points, so scale the marginal effect by .01 or .1.
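The average marginal effects table below has the header that margins produces with the dydx(*) option. A minimal sketch, assuming crime1.dta is in memory; the atmeans variant and the rescaling lines are illustrative additions.

```stata
quietly poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60

* Average marginal effects for every regressor
margins, dydx(*)

* Marginal effects evaluated at the sample means of the regressors
margins, dydx(*) atmeans

* Rescale the average marginal effect of pcnv (copied from the table below)
* to a 1 and a 10 percentage point change
display -.1623969*.01
display -.1623969*.1
```

Because black, hispan, and born60 enter the model without factor-variable notation, margins treats them as continuous here rather than computing discrete 0-to-1 changes.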
Average marginal effects Number of obs = 2,725
Model VCE : OIM
Expression : Predicted number of events, predict()
dy/dx w.r.t. : pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.1623969 .0347091 -4.68 0.000 -.2304256 -.0943682
avgsen | -.0096136 .0080714 -1.19 0.234 -.0254333 .0062061
tottime | .009904 .0059726 1.66 0.097 -.001802 .02161
ptime86 | -.0398574 .0084547 -4.71 0.000 -.0564283 -.0232865
qemp86 | -.0153749 .0117466 -1.31 0.191 -.0383979 .0076481
inc86 | -.0032679 .0004323 -7.56 0.000 -.0041152 -.0024205
black | .2672451 .0309251 8.64 0.000 .2066331 .3278571
hispan | .2021263 .03051 6.62 0.000 .1423279 .2619248
born60 | -.0206361 .0259102 -0.80 0.426 -.0714193 .030147
------------------------------------------------------------------------------
Incidence-Rate Ratios
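Incidence-rate ratios are the exponentiated coefficients, \(e^{\beta}\): a value above 1 raises the expected count and a value below 1 lowers it. The table in this section is consistent with adding the irr option to poisson. A minimal sketch, assuming crime1.dta is in memory:

```stata
* Report exponentiated coefficients (incidence-rate ratios) instead of log-count coefficients
poisson narr86 pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60, irr
```

The ratios map directly to the factor change interpretation. For example, the ratio of 1.936414 on black corresponds to \((1.936414-1)\times 100 \approx 93.6\) percent more expected arrests.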
Iteration 0: log likelihood = -2249.0104
Iteration 1: log likelihood = -2248.7614
Iteration 2: log likelihood = -2248.7611
Iteration 3: log likelihood = -2248.7611
Poisson regression Number of obs = 2,725
LR chi2(9) = 386.32
Prob > chi2 = 0.0000
Log likelihood = -2248.7611 Pseudo R2 = 0.0791
------------------------------------------------------------------------------
narr86 | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | .6692676 .0568685 -4.73 0.000 .5665943 .7905465
avgsen | .976508 .0194775 -1.19 0.233 .9390695 1.015439
tottime | 1.024793 .0151161 1.66 0.097 .9955899 1.054852
ptime86 | .9061427 .0187523 -4.76 0.000 .8701243 .9436521
qemp86 | .9626949 .0279415 -1.31 0.190 .9094592 1.019047
inc86 | .9919519 .0010326 -7.76 0.000 .98993 .9939778
black | 1.936414 .1429736 8.95 0.000 1.675523 2.237927
hispan | 1.648413 .1218618 6.76 0.000 1.426066 1.905429
born60 | .9502515 .0608653 -0.80 0.426 .8381419 1.077357
_cons | .5490374 .0369228 -8.92 0.000 .4812364 .6263907
------------------------------------------------------------------------------