Stata Textbook Examples
Applied Logistic Regression by Hosmer and Lemeshow
Chapter 4: Model-Building Strategies and Methods for Logistic Regression

The first example in this chapter makes use of the lowbwt.dta file.
use lowbwt
Table 4.2 -- page 93 (output edited for space)
/* create dummy variables for race */
xi i.race
i.race                Irace_1-3    (naturally coded; Irace_1 omitted)
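
In more recent versions of Stata the indicator variables do not have to be created by hand. The following is a minimal sketch, assuming Stata 11 or later (where factor-variable notation is available) and assuming lowbwt.dta is in the working directory. Unlike the univariable models shown for Table 4.2, it enters both race indicators in one model, so its output will not match the separate Irace_2 and Irace_3 fits.

```stata
* Assumption: Stata 11 or later, lowbwt.dta in the working directory
use lowbwt, clear

* i.race expands to indicators for race = 2 and race = 3 (race = 1 is the base)
logit low i.race

* Joint Wald test of the two race indicators
testparm i.race
```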

logit low

Logit estimates                                   Number of obs   =        189
LR chi2(0)      =       0.00
Prob > chi2     =          .
Log likelihood =   -117.336                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
_cons |   -.789997    .156976     -5.033   0.000      -1.097664   -.4823297
------------------------------------------------------------------------------

logit low age

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       2.76
Prob > chi2     =     0.0966
Log likelihood = -115.95598                       Pseudo R2       =     0.0118

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
age |  -.0511529   .0315138     -1.623   0.105      -.1129188    .0106129
_cons |   .3845819   .7321251      0.525   0.599      -1.050357    1.819521
------------------------------------------------------------------------------

logit low age, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
age |   .9501333   .0299423     -1.623   0.105       .8932232    1.010669
------------------------------------------------------------------------------
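
The odds-ratio table is a transformation of the coefficient table, so the two can be checked against each other. A minimal sketch, using only the numbers printed above for age (the scalar names b and se are illustrative, and invnormal() assumes Stata 9 or later):

```stata
* Values copied from the "logit low age" output above
scalar b  = -.0511529
scalar se =  .0315138

* Odds ratio is exp(coefficient)
display "Odds ratio: " exp(b)

* Confidence limits are the exponentiated limits of the coefficient interval
display "95% CI:     " exp(b - invnormal(.975)*se) " to " exp(b + invnormal(.975)*se)

* Standard error of the odds ratio by the delta method
display "Std. Err.:  " exp(b)*se
```

The z statistic and p-value are the same in both tables because they are computed on the coefficient scale.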

logit low lwt

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       5.98
Prob > chi2     =     0.0145
Log likelihood = -114.34533                       Pseudo R2       =     0.0255

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
lwt |  -.0140583   .0061696     -2.279   0.023      -.0261504   -.0019661
_cons |   .9983143   .7852889      1.271   0.204      -.5408235    2.537452
------------------------------------------------------------------------------

logit low lwt, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
lwt |   .9860401   .0060834     -2.279   0.023       .9741886    .9980358
------------------------------------------------------------------------------

logit low Irace_2

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       1.65
Prob > chi2     =     0.1985
Log likelihood = -116.50935                       Pseudo R2       =     0.0070

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
Irace_2 |   .5635762   .4325561      1.303   0.193      -.2842181     1.41137
_cons |  -.8737311     .17184     -5.085   0.000      -1.210531   -.5369309
------------------------------------------------------------------------------

logit low Irace_2, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
Irace_2 |   1.756944    .759977      1.303   0.193       .7526025    4.101573
------------------------------------------------------------------------------

logit low Irace_3

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       1.77
Prob > chi2     =     0.1829
Log likelihood = -116.44906                       Pseudo R2       =     0.0076

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
Irace_3 |   .4321825   .3233953      1.336   0.181      -.2016606    1.066026
_cons |  -.9509763   .2019289     -4.709   0.000       -1.34675   -.5552028
------------------------------------------------------------------------------

logit low Irace_3, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
Irace_3 |   1.540616    .498228      1.336   0.181       .8173723    2.903816
------------------------------------------------------------------------------

logit low smoke

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       4.87
Prob > chi2     =     0.0274
Log likelihood =  -114.9023                       Pseudo R2       =     0.0207

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
smoke |   .7040592   .3196386      2.203   0.028       .0775791    1.330539
_cons |  -1.087051   .2147299     -5.062   0.000      -1.507914   -.6661886
------------------------------------------------------------------------------

logit low smoke, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
smoke |   2.021944   .6462912      2.203   0.028       1.080668    3.783083
------------------------------------------------------------------------------

logit low ptl

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       6.78
Prob > chi2     =     0.0092
Log likelihood = -113.94631                       Pseudo R2       =     0.0289

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ptl |   .8018058   .3171535      2.528   0.011       .1801964    1.423415
_cons |   -.964189   .1749607     -5.511   0.000      -1.307106   -.6212722
------------------------------------------------------------------------------

logit low ptl, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ptl |   2.229563   .7071138      2.528   0.011       1.197453    4.151274
------------------------------------------------------------------------------

logit low ht

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       4.02
Prob > chi2     =     0.0449
Log likelihood = -115.32493                       Pseudo R2       =     0.0171

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ht |   1.213542   .6083485      1.995   0.046       .0212011    2.405883
_cons |    -.87707   .1650175     -5.315   0.000      -1.200498   -.5536417
------------------------------------------------------------------------------

logit low ht, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ht |   3.365385   2.047327      1.995   0.046       1.021427    11.08822
------------------------------------------------------------------------------

logit low ui

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       5.08
Prob > chi2     =     0.0243
Log likelihood = -114.79795                       Pseudo R2       =     0.0216

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ui |   .9469277   .4167734      2.272   0.023       .1300669    1.763789
_cons |  -.9469277   .1756215     -5.392   0.000       -1.29114   -.6027159
------------------------------------------------------------------------------

logit low ui, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ui |   2.577778   1.074349      2.272   0.023       1.138905      5.8345
------------------------------------------------------------------------------

logit low ftv

Logit estimates                                   Number of obs   =        189
LR chi2(1)      =       0.77
Prob > chi2     =     0.3792
Log likelihood = -116.94943                       Pseudo R2       =     0.0033

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ftv |  -.1351199   .1566986     -0.862   0.389      -.4422435    .1720037
_cons |  -.6867585   .1948119     -3.525   0.000      -1.068583   -.3049343
------------------------------------------------------------------------------

logit low ftv, or

------------------------------------------------------------------------------
low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
ftv |   .8736112   .1368936     -0.862   0.389       .6425932    1.187682
------------------------------------------------------------------------------
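
The univariable models above all follow the same pattern, so they could also be produced with a loop. This is only a sketch: it assumes lowbwt.dta is in the working directory and covers the variables that need no recoding (the race indicators are left out).

```stata
* Assumption: lowbwt.dta is in the working directory
use lowbwt, clear

foreach v of varlist age lwt smoke ptl ht ui ftv {
    display _newline as text "Univariable model for `v'"
    * Fit the model without printing, then replay it twice
    quietly logit low `v'
    * Replay as coefficients
    logit
    * Replay as odds ratios
    logit, or
}
```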
Table 4.3 -- page 94
logit low  age lwt Irace_2 Irace_3 smoke ptl ht ui

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -101.38735
Iteration 2:   log likelihood = -100.72104
Iteration 3:   log likelihood = -100.71348
Iteration 4:   log likelihood = -100.71348

Logit estimates                                   Number of obs   =        189
LR chi2(8)      =      33.25
Prob > chi2     =     0.0001
Log likelihood = -100.71348                       Pseudo R2       =     0.1417

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
age |  -.0270698   .0364526     -0.743   0.458      -.0985156     .044376
lwt |  -.0151826   .0069279     -2.192   0.028       -.028761   -.0016041
Irace_2 |   1.263219   .5264677      2.399   0.016       .2313616    2.295077
Irace_3 |   .8616351   .4391975      1.962   0.050       .0008239    1.722446
smoke |   .9233492   .4008583      2.303   0.021       .1376813    1.709017
ptl |   .5417551   .3462666      1.565   0.118      -.1369149    1.220425
ht |   1.833696     .69177      2.651   0.008       .4778514     3.18954
ui |   .7585965   .4593918      1.651   0.099      -.1417949    1.658988
_cons |   .4644033   1.204702      0.385   0.700      -1.896769    2.825576
------------------------------------------------------------------------------
Table 4.4 -- page 95
logit low lwt Irace_2 Irace_3 smoke ptl ht ui

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -101.58398
Iteration 2:   log likelihood = -100.99797
Iteration 3:   log likelihood = -100.99279
Iteration 4:   log likelihood = -100.99279

Logit estimates                                   Number of obs   =        189
LR chi2(7)      =      32.69
Prob > chi2     =     0.0000
Log likelihood = -100.99279                       Pseudo R2       =     0.1393

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
lwt |  -.0159053   .0068553     -2.320   0.020      -.0293414   -.0024691
Irace_2 |   1.325719   .5222464      2.538   0.011       .3021351    2.349304
Irace_3 |   .8970779   .4338846      2.068   0.039       .0466797    1.747476
smoke |   .9387268   .3987195      2.354   0.019       .1572509    1.720203
ptl |   .5032149   .3412323      1.475   0.140      -.1655881    1.172018
ht |   1.855042   .6951214      2.669   0.008       .4926286    3.217455
ui |   .7856975   .4564423      1.721   0.085       -.108913    1.680308
_cons |  -.0865495    .951768     -0.091   0.928      -1.951981    1.778882
------------------------------------------------------------------------------
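
The models in Tables 4.3 and 4.4 differ only by age, so the two log likelihoods give a likelihood ratio test for that variable. A minimal sketch using the values printed above (the scalar names are illustrative):

```stata
* Log likelihoods copied from the Table 4.3 and Table 4.4 output above
* ll_full: model with age (8 covariates)
scalar ll_full    = -100.71348
* ll_reduced: model without age (7 covariates)
scalar ll_reduced = -100.99279

* Likelihood ratio statistic and its chi-squared p-value on 1 degree of freedom
scalar G = 2 * (ll_full - ll_reduced)
scalar p = chi2tail(1, G)
display "G = " %6.3f G "   p = " %6.4f p
```

G agrees with the difference between the two LR chi2 values reported above (33.25 and 32.69). With the data loaded, storing each fit with estimates store and comparing them with lrtest should give the same test.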
Table 4.8 -- page 98
/* create dichotomous variables for ptl and lwt */
generate lwd = (lwt<110)
generate ptd = (ptl~=0)

logit low age lwd Irace_2 Irace_3 smoke ptd ht ui

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -99.431174
Iteration 2:   log likelihood = -98.785718
Iteration 3:   log likelihood =    -98.778
Iteration 4:   log likelihood = -98.777998

Logit estimates                                   Number of obs   =        189
LR chi2(8)      =      37.12
Prob > chi2     =     0.0000
Log likelihood = -98.777998                       Pseudo R2       =     0.1582

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
age |  -.0464796   .0373888     -1.243   0.214      -.1197603    .0268011
lwd |   .8420615   .4055338      2.076   0.038       .0472299    1.636893
Irace_2 |   1.073456   .5150752      2.084   0.037       .0639273    2.082985
Irace_3 |    .815367   .4452979      1.831   0.067      -.0574008    1.688135
smoke |   .8071996    .404446      1.996   0.046       .0145001    1.599899
ptd |   1.281678   .4621157      2.774   0.006       .3759478    2.187408
ht |   1.435227   .6482699      2.214   0.027       .1646415    2.705813
ui |   .6576256   .4666192      1.409   0.159      -.2569313    1.572182
_cons |  -1.216781   .9556797     -1.273   0.203      -3.089878     .656317
------------------------------------------------------------------------------
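
The generate statements for lwd and ptd rely on Stata evaluating a logical expression to 1 or 0. Stata treats a missing value as larger than any number, so a missing ptl would be coded ptd = 1 and a missing lwt would be coded lwd = 0. Every model above reports 189 observations, so this does not appear to matter for lowbwt.dta, but a guarded version can be sketched on made-up data (the four rows below are hypothetical, not taken from the file):

```stata
* Hypothetical toy data; clear drops whatever is in memory
clear
input lwt ptl
 95 0
110 1
130 2
  . 0
end

* Same recoding as above, but missing inputs stay missing
generate byte lwd = (lwt < 110) if !missing(lwt)
generate byte ptd = (ptl != 0)  if !missing(ptl)
list
```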
Table 4.10 -- page 101
/* create interaction variables */
generate agelwd=age*lwd
generate smokelwd=smoke*lwd

logit low age Irace_2 Irace_3 smoke ht ui lwd ptd agelwd smokelwd

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -97.135228
Iteration 2:   log likelihood =  -96.03855
Iteration 3:   log likelihood = -96.006202
Iteration 4:   log likelihood =  -96.00616

Logit estimates                                   Number of obs   =        189
LR chi2(10)     =      42.66
Prob > chi2     =     0.0000
Log likelihood =  -96.00616                       Pseudo R2       =     0.1818

------------------------------------------------------------------------------
low |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
age |  -.0839782   .0455663     -1.843   0.065      -.1732864    .0053301
Irace_2 |   1.083103   .5189153      2.087   0.037       .0660474    2.100158
Irace_3 |   .7596787   .4640335      1.637   0.102      -.1498103    1.669168
smoke |   1.153131   .4584383      2.515   0.012       .2546084    2.051653
ht |   1.359216    .661471      2.055   0.040        .062757    2.655676
ui |   .7281685   .4794797      1.519   0.129      -.2115945    1.667932
lwd |  -1.729949   1.868306     -0.926   0.354      -5.391762    1.931863
ptd |   1.231578   .4713903      2.613   0.009       .3076701    2.155486
agelwd |   .1474112   .0828594      1.779   0.075      -.0149902    .3098127
smokelwd |  -1.407375   .8186761     -1.719   0.086      -3.011951    .1972003
_cons |  -.5117544   1.087536     -0.471   0.638      -2.643286    1.619777
------------------------------------------------------------------------------
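
With the interaction terms in the model, the effect of smoke or age depends on lwd, so a single coefficient no longer gives the odds ratio. A minimal sketch of the arithmetic, using only coefficients printed in the Table 4.10 output above (the scalar names are illustrative):

```stata
* Coefficients copied from the Table 4.10 output above
scalar b_smoke    =  1.153131
scalar b_smokelwd = -1.407375
scalar b_age      = -.0839782
scalar b_agelwd   =  .1474112

* Odds ratios for smoke within each level of lwd
display "OR for smoke when lwd = 0: " exp(b_smoke)
display "OR for smoke when lwd = 1: " exp(b_smoke + b_smokelwd)

* Odds ratios per one-year increase in age within each level of lwd
display "OR per year of age when lwd = 0: " exp(b_age)
display "OR per year of age when lwd = 1: " exp(b_age + b_agelwd)
```

Confidence intervals for the combined effects need the covariance between the coefficients, which is not printed above. With the model still in memory, lincom smoke + smokelwd, or should report the odds ratio together with its interval.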
Table 4.15 -- page 113
* Stata 8 code.
sw logit low  ptl lwt ht (Irace_2 Irace_3) smoke ui age ftv, pe(.15) pr(.2) forward

* Stata 9 code and output.
stepwise, pe(.15) pr(.2) forward: logit low  ptl lwt ht (Irace_2 Irace_3) smoke ui age ftv

begin with empty model
p = 0.0115 <  0.1500  adding   ptl
p = 0.0388 <  0.1500  adding   ht
p = 0.0105 <  0.1500  adding   lwt
p = 0.0784 <  0.1500  adding   _Irace_2 _Irace_3
p = 0.0166 <  0.1500  adding   smoke
p = 0.0852 <  0.1500  adding   ui

Logistic regression                               Number of obs   =        189
LR chi2(7)      =      32.69
Prob > chi2     =     0.0000
Log likelihood = -100.99279                       Pseudo R2       =     0.1393

------------------------------------------------------------------------------
low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ptl |   .5032149   .3412323     1.47   0.140    -.1655881    1.172018
ht |   1.855042   .6951214     2.67   0.008     .4926286    3.217455
lwt |  -.0159053   .0068553    -2.32   0.020    -.0293414   -.0024691
Irace_2 |   1.325719   .5222464     2.54   0.011     .3021351    2.349304
Irace_3 |   .8970779   .4338846     2.07   0.039     .0466797    1.747476
smoke |   .9387268   .3987195     2.35   0.019     .1572509    1.720203
ui |   .7856975   .4564423     1.72   0.085     -.108913    1.680308
_cons |  -.0865495    .951768    -0.09   0.928    -1.951981    1.778882
------------------------------------------------------------------------------
The final example in this chapter makes use of the ex4-24.dta file.
use ex4-24

Table 4.24 -- page 132
list

id         y         x1         x2         x3
1.        1         0       .225       .231      1.026
2.        2         1       .487       .489      1.022
3.        3         0      -1.08      -1.07      1.074
4.        4         0       -.87       -.87      1.091
5.        5         0       -.58       -.57      1.095
6.        6         0       -.64       -.64       1.01
7.        7         0      1.614      1.619      1.087
8.        8         1       .352       .355      1.095
9.        9         0     -1.025     -1.018      1.008
10.       10         1       .929       .937      1.057
Table 4.24 -- page 132
The coefficient values in this table differ from those in the book, owing to the precision with which the data are stored and to differences between maximum likelihood algorithms.
logit y x1

Iteration 0:   log likelihood =  -6.108643
Iteration 1:   log likelihood = -4.9146936
Iteration 2:   log likelihood = -4.8717399
Iteration 3:   log likelihood = -4.8713285
Iteration 4:   log likelihood = -4.8713284

Logit estimates                                   Number of obs   =         10
LR chi2(1)      =       2.47
Prob > chi2     =     0.1157
Log likelihood = -4.8713284                       Pseudo R2       =     0.2026

------------------------------------------------------------------------------
y |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1 |   1.380311   1.005368      1.373   0.170      -.5901735    3.350795
_cons |  -1.001712   .8294342     -1.208   0.227      -2.627373    .6239494
------------------------------------------------------------------------------

logit y x1 x2

Iteration 0:   log likelihood =  -6.108643
Iteration 1:   log likelihood = -4.8196088
Iteration 2:   log likelihood = -4.7259858
Iteration 3:   log likelihood = -4.7212724
Iteration 4:   log likelihood =  -4.721251

Logit estimates                                   Number of obs   =         10
LR chi2(2)      =       2.77
Prob > chi2     =     0.2497
Log likelihood =  -4.721251                       Pseudo R2       =     0.2271

------------------------------------------------------------------------------
y |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1 |    146.395   276.9998      0.529   0.597      -396.5147    689.3047
x2 |  -144.9098   276.6344     -0.524   0.600      -687.1033    397.2836
_cons |  -.3703049   1.362457     -0.272   0.786      -3.040672    2.300062
------------------------------------------------------------------------------

logit y x3

Iteration 0:   log likelihood =  -6.108643
Iteration 1:   log likelihood =  -6.104626
Iteration 2:   log likelihood = -6.1046254

Logit estimates                                   Number of obs   =         10
LR chi2(1)      =       0.01
Prob > chi2     =     0.9286
Log likelihood = -6.1046254                       Pseudo R2       =     0.0007

------------------------------------------------------------------------------
y |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
x3 |   1.787782   19.97092      0.090   0.929      -37.35449    40.93006
_cons |  -2.736861   21.12781     -0.130   0.897       -44.1466    38.67288
------------------------------------------------------------------------------

logit y x1 x2 x3

Iteration 0:   log likelihood =  -6.108643
Iteration 1:   log likelihood = -4.8146158
Iteration 2:   log likelihood = -4.7162391
Iteration 3:   log likelihood =   -4.71065
Iteration 4:   log likelihood = -4.7106177

Logit estimates                                   Number of obs   =         10
LR chi2(3)      =       2.80
Prob > chi2     =     0.4242
Log likelihood = -4.7106177                       Pseudo R2       =     0.2289

------------------------------------------------------------------------------
y |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1 |   142.9769   282.1838      0.507   0.612      -410.0932     696.047
x2 |  -141.4593    281.832     -0.502   0.616      -693.8398    410.9212
x3 |  -3.621109   24.95384     -0.145   0.885      -52.52974    45.28752
_cons |    3.42313    26.1558      0.131   0.896       -47.8413    54.68756
------------------------------------------------------------------------------
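
The offsetting coefficients of roughly 146 and -145 on x1 and x2, each with a standard error near 277, are what one would expect if the two variables carry almost the same information. A minimal sketch for checking this directly, re-entering the ten observations from the listing above so that it runs without ex4-24.dta:

```stata
* Data copied from the listing above; clear drops whatever is in memory
clear
input id y x1 x2 x3
 1 0   .225   .231 1.026
 2 1   .487   .489 1.022
 3 0  -1.08  -1.07 1.074
 4 0   -.87   -.87 1.091
 5 0   -.58   -.57 1.095
 6 0   -.64   -.64 1.01
 7 0  1.614  1.619 1.087
 8 1   .352   .355 1.095
 9 0 -1.025 -1.018 1.008
10 1   .929   .937 1.057
end

* x3 varies little around 1, so it is close to a copy of the constant term
summarize x3

* Pairwise correlations among the three covariates
correlate x1 x2 x3
```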
