Chapter 3 Negative Binomial Regression Model
Lesson: When overdispersion occurs, we could estimate a Negative Binomial Regression model. We’ll see that our standard errors might be too large if our main Poisson assumption mean=variance fails.
Negative Binomial
We use the nbreg command to estimate a negative binomial regression.
Fitting Poisson model:
Iteration 0: log likelihood = -2249.0104
Iteration 1: log likelihood = -2248.7614
Iteration 2: log likelihood = -2248.7611
Iteration 3: log likelihood = -2248.7611
Fitting constant-only model:
Iteration 0: log likelihood = -2297.3847
Iteration 1: log likelihood = -2291.4767
Iteration 2: log likelihood = -2290.6926
Iteration 3: log likelihood = -2290.6903
Iteration 4: log likelihood = -2290.6903
Fitting full model:
Iteration 0: log likelihood = -2181.2355
Iteration 1: log likelihood = -2158.2716
Iteration 2: log likelihood = -2157.6284
Iteration 3: log likelihood = -2157.628
Iteration 4: log likelihood = -2157.628
Negative binomial regression Number of obs = 2,725
LR chi2(9) = 266.12
Dispersion = mean Prob > chi2 = 0.0000
Log likelihood = -2157.628 Pseudo R2 = 0.0581
------------------------------------------------------------------------------
narr86 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.4770963 .1033295 -4.62 0.000 -.6796183 -.2745743
avgsen | -.0173385 .0261171 -0.66 0.507 -.0685272 .0338501
tottime | .0197394 .0192325 1.03 0.305 -.0179557 .0574344
ptime86 | -.1073997 .025074 -4.28 0.000 -.1565439 -.0582555
qemp86 | -.0504884 .0351857 -1.43 0.151 -.1194511 .0184743
inc86 | -.0077126 .0011465 -6.73 0.000 -.0099596 -.0054656
black | .6560406 .0923594 7.10 0.000 .4750195 .8370617
hispan | .5048465 .0895663 5.64 0.000 .3292998 .6803932
born60 | -.046412 .0776384 -0.60 0.550 -.1985804 .1057564
_cons | -.5637368 .0827121 -6.82 0.000 -.7258495 -.4016242
-------------+----------------------------------------------------------------
/lnalpha | -.0738912 .1177617 -.3046999 .1569175
-------------+----------------------------------------------------------------
alpha | .9287728 .1093739 .7373446 1.169899
------------------------------------------------------------------------------
LR test of alpha=0: chibar2(01) = 182.27 Prob >= chibar2 = 0.000
First, our coefficients are the change in expected log counts, so we need some transformations to provide interpretations.
Second, our test for over dispersion can be seen at the bottom of our NBRM output with the LR test of \(\alpha = 0\)
Factor Change
We can manually calculate \(e^{\beta}\) to interpret the results with a factor change, or we can use the listcoef command:
nbreg (N=2725): Percentage Change in Expected Count
Observed SD: .85907678
------------------------------------------------------------------
narr86 | b z P>|z| % %StdX SDofX
---------+--------------------------------------------------------
pcnv | -0.47710 -4.617 0.000 -37.9 -17.2 0.3952
avgsen | -0.01734 -0.664 0.507 -1.7 -5.9 3.5080
tottime | 0.01974 1.026 0.305 2.0 9.5 4.6070
ptime86 | -0.10740 -4.283 0.000 -10.2 -18.9 1.9501
qemp86 | -0.05049 -1.435 0.151 -4.9 -7.8 1.6104
inc86 | -0.00771 -6.727 0.000 -0.8 -40.2 66.6272
black | 0.65604 7.103 0.000 92.7 27.3 0.3677
hispan | 0.50485 5.637 0.000 65.7 23.2 0.4127
born60 | -0.04641 -0.598 0.550 -4.5 -2.2 0.4808
---------+--------------------------------------------------------
ln alpha | -0.07389 -0.627
------------------------------------------------------------------
b = raw coefficient
z = z-score for test of b=0
P>|z| = p-value for z-test
% = percent change in expected count for unit increase in X
%StdX = percent change in expected count for SD increase in X
SDofX = standard deviation of X
nbreg (N=2725): Factor Change in Expected Count
Observed SD: .85907678
------------------------------------------------------------------
narr86 | b z P>|z| e^b e^bStdX SDofX
---------+--------------------------------------------------------
pcnv | -0.47710 -4.617 0.000 0.6206 0.8282 0.3952
avgsen | -0.01734 -0.664 0.507 0.9828 0.9410 3.5080
tottime | 0.01974 1.026 0.305 1.0199 1.0952 4.6070
ptime86 | -0.10740 -4.283 0.000 0.8982 0.8110 1.9501
qemp86 | -0.05049 -1.435 0.151 0.9508 0.9219 1.6104
inc86 | -0.00771 -6.727 0.000 0.9923 0.5982 66.6272
black | 0.65604 7.103 0.000 1.9271 1.2728 0.3677
hispan | 0.50485 5.637 0.000 1.6567 1.2316 0.4127
born60 | -0.04641 -0.598 0.550 0.9546 0.9779 0.4808
---------+--------------------------------------------------------
ln alpha | -0.07389 -0.627
------------------------------------------------------------------
b = raw coefficient
z = z-score for test of b=0
P>|z| = p-value for z-test
e^b = exp(b) = factor change in expected count for unit increase in X
e^bStdX = exp(b*SD of X) = change in expected count for SD increase in X
SDofX = standard deviation of X
Let’s look at pcnv:
-.04658976
A 10 percentage point increase in proportion of prior arrests that lead to conviction decrease the expected number of arrests by 4.7% holding all others constant.
Let’s look at avgsen:
.01993612
a 1 month increase in total time served in prision from prior convictions before 1986 increases the expected number of arrests holding all others constant.
(1) (2)
PRM NBRM
--------------------------------------------
narr86
pcnv -0.402*** -0.477***
(-4.73) (-4.62)
avgsen -0.0238 -0.0173
(-1.19) (-0.66)
tottime 0.0245 0.0197
(1.66) (1.03)
ptime86 -0.0986*** -0.107***
(-4.76) (-4.28)
qemp86 -0.0380 -0.0505
(-1.31) (-1.43)
inc86 -0.00808*** -0.00771***
(-7.76) (-6.73)
black 0.661*** 0.656***
(8.95) (7.10)
hispan 0.500*** 0.505***
(6.76) (5.64)
born60 -0.0510 -0.0464
(-0.80) (-0.60)
_cons -0.600*** -0.564***
(-8.92) (-6.82)
--------------------------------------------
lnalpha
_cons -0.0739
(-0.63)
--------------------------------------------
N 2725 2725
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Marginal Effects
Interpretation is similar to Poisson The marginal change in a continuous variable depends on the expected value of y given x, so we have to specify x with atmeans or average marginal effects
Average marginal effects Number of obs = 2,725
Model VCE : OIM
Expression : Predicted number of events, predict()
dy/dx w.r.t. : pcnv avgsen tottime ptime86 qemp86 inc86 black hispan born60
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | -.1933482 .0429858 -4.50 0.000 -.2775988 -.1090976
avgsen | -.0070266 .010587 -0.66 0.507 -.0277767 .0137235
tottime | .0079996 .0078006 1.03 0.305 -.0072894 .0232886
ptime86 | -.0435248 .0103794 -4.19 0.000 -.0638682 -.0231815
qemp86 | -.0204609 .0143136 -1.43 0.153 -.0485151 .0075932
inc86 | -.0031256 .0004821 -6.48 0.000 -.0040705 -.0021807
black | .2658672 .0395274 6.73 0.000 .1883949 .3433395
hispan | .2045942 .0375363 5.45 0.000 .1310244 .2781641
born60 | -.0188089 .0314753 -0.60 0.550 -.0804994 .0428815
------------------------------------------------------------------------------
Incident Rate Ratios
We can also use the option irr to calcualte incident-rate ratios.
Fitting Poisson model:
Iteration 0: log likelihood = -2249.0104
Iteration 1: log likelihood = -2248.7614
Iteration 2: log likelihood = -2248.7611
Iteration 3: log likelihood = -2248.7611
Fitting constant-only model:
Iteration 0: log likelihood = -2297.3847
Iteration 1: log likelihood = -2291.4767
Iteration 2: log likelihood = -2290.6926
Iteration 3: log likelihood = -2290.6903
Iteration 4: log likelihood = -2290.6903
Fitting full model:
Iteration 0: log likelihood = -2181.2355
Iteration 1: log likelihood = -2158.2716
Iteration 2: log likelihood = -2157.6284
Iteration 3: log likelihood = -2157.628
Iteration 4: log likelihood = -2157.628
Negative binomial regression Number of obs = 2,725
LR chi2(9) = 266.12
Dispersion = mean Prob > chi2 = 0.0000
Log likelihood = -2157.628 Pseudo R2 = 0.0581
------------------------------------------------------------------------------
narr86 | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pcnv | .6205828 .0641245 -4.62 0.000 .5068104 .7598956
avgsen | .9828109 .0256682 -0.66 0.507 .9337681 1.03443
tottime | 1.019935 .0196159 1.03 0.305 .9822046 1.059116
ptime86 | .8981666 .0225207 -4.28 0.000 .8550939 .9434089
qemp86 | .950765 .0334533 -1.43 0.151 .8874074 1.018646
inc86 | .9923171 .0011377 -6.73 0.000 .9900898 .9945493
black | 1.927147 .1779901 7.10 0.000 1.608046 2.309571
hispan | 1.656731 .1483873 5.64 0.000 1.389994 1.974654
born60 | .9546486 .0741174 -0.60 0.550 .8198939 1.111551
_cons | .5690785 .0470697 -6.82 0.000 .4839133 .6692322
-------------+----------------------------------------------------------------
/lnalpha | -.0738912 .1177617 -.3046999 .1569175
-------------+----------------------------------------------------------------
alpha | .9287728 .1093739 .7373446 1.169899
------------------------------------------------------------------------------
LR test of alpha=0: chibar2(01) = 182.27 Prob >= chibar2 = 0.000