Stata Library
Logistic Regression Troubleshooting and Ologit Interpretation


This Library page comes from an answer posted to the Statalist newsgroup, and is courtesy of William Gould of Stata Corporation.  We are grateful for permission to reproduce this answer at our site.


Mario Nosvelli <nosvelli@i...> showed results from running -ologit-
and asked two questions, 

> 1) How to consider such a LR chi2(9)= 12603.75 with Prob > chi2 = 0.0000:
>    it is too good to be true....?
>
> 2) How to correctly interpret the coefficients in explaining my dependent
>    variable?

I suspect Mario will not get too many answers to his first question because
not many will be willing to go out on a limb.  One can, without fear of
contradiction, look at results that exhibit an obvious problem and say "no
good", but when the results look fine, one hesitates to say "All is fine".
Who knows what is lurking behind the numbers?

Nevertheless, I will go out on the limb and reassure Mario.  I think all is
fine, assuming Mario has checked out his data in all the obvious ways.  In
this case, what would most concern me (and still, I'm not much concerned) is
outliers.  I would like to be reassured that it is not the case that, say,
edtime mostly takes on the values 0 to 8 but for one observation in the data
takes on the value 10,000.  Basically, I have only the concerns I would have
whenever I look at an estimated model:  I want to convince myself that these
results are not being determined by just a handful of observations for which
another explanation (for instance, data error) is more likely.
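
For instance, a quick way to screen for that kind of outlier is to look at
the distribution of each regressor (a minimal sketch, using the variable
names from Mario's model; any of the usual summaries would do):

* report percentiles and extreme values for each regressor
summarize edtime edtime2 lingua10 info10 forma10r ptr10r espe10 tenure10 compo10, detail
* list every distinct value of edtime; a stray 10,000 would be hard to miss
tabulate edtime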

Mario's concerns were raised by the reported LR chi2 for his model.  The
results Mario showed were

==============================================================================
ologit  profribalt  edtime edtime2 lingua10 info10  forma10r ptr10r 
>         espe10 tenure10 compo10

Iteration 0:   log likelihood = -23409.295
[...]
Iteration 5:   log likelihood = -17107.421

Ordered logit estimates                           Number of obs   =      13341
                                                  LR chi2(9)      =   12603.75
                                                  Prob > chi2     =     0.0000
Log likelihood = -17107.421                       Pseudo R2       =     0.2692

------------------------------------------------------------------------------
  profribalt |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      edtime |    .645283   .0223021    28.93   0.000     .6015717    .6889944
     edtime2 |  -.0057201   .0019815    -2.89   0.004    -.0096037   -.0018365
    lingua10 |   .1485893   .0535689     2.77   0.006     .0435962    .2535823
      info10 |   .1644362   .0621983     2.64   0.008     .0425298    .2863425
    forma10r |  -.2904943   .0387972    -7.49   0.000    -.3665354   -.2144532
      ptr10r |  -.2938893   .0395879    -7.42   0.000    -.3714801   -.2162986
      espe10 |   .4393347   .0340193    12.91   0.000     .3726581    .5060114
    tenure10 |   .0870511   .0343935     2.53   0.011     .0196411    .1544612
     compo10 |   .7923367   .0910826     8.70   0.000     .6138181    .9708553
-------------+----------------------------------------------------------------
       _cut1 |  -1.931222   .0479324          (Ancillary parameters)
       _cut2 |    .729953   .0381873 
       _cut3 |    2.42457   .0465623 
       _cut4 |   2.525509   .0472749 
       _cut5 |   3.572923   .0541766 
       _cut6 |   6.480454    .071594 
       _cut7 |   8.203291   .0884657 
------------------------------------------------------------------------------
==============================================================================


The LR chi2 is a test that all the coefficients (with the exception of the
cutpoints) are zero.  The value LR chi2(9) = 12,604 is admittedly whopping and
Mario is right to raise a red flag.  It is worth some thought.  Nevertheless,
such unbelievable values do arise in large datasets.
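
In fact, the reported value is easy to verify from the iteration log:  the LR
chi2 is just twice the difference between the final log likelihood and the
constant-only log likelihood from iteration 0.

* LR chi2(9) = 2*(ll_model - ll_constant_only), using values from the log
display 2*(-17107.421 - (-23409.295))

That reproduces the reported 12603.75 up to rounding.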

What reassured me that there was no problem was the reported log likelihood
value of -17,107 for 13,341 observations.  I said to myself, "Mario has 8
outcome categories, so if I had no idea to which category an observation
belonged, I would use a probability of 1/8 = .125 for each.  On average,
Mario's model says that, for an observation, the probability of observing
what was observed conditional on the estimates is exp(-17107/13341) = .277.
That number is substantially larger than .125, and that's good news because
it means that Mario's model has explanatory power.  That number is also far
enough from 1 that Mario does not have to explain to me why his model does
such a good job."
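
After any maximum-likelihood estimator, Stata leaves the ingredients of that
calculation in e(), so you can let Stata do the arithmetic (run immediately
after -ologit-):

* geometric mean probability of observing what was observed
display exp(e(ll)/e(N))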

    --------------------------------------------------------------------------
    ASIDE:  The exp(-17107/13341) arises like this:

    Ordered logit is a discrete model, meaning likelihoods are probabilities.
    The overall likelihood of the data is

         L(Data|estimates) = L(data_1|estimates) * L(data_2|estimates)  * ...
                           = p(o_1|X_1,estimates)* p(o_2|X_2,estimates) * ...

    where o_j and X_j are the outcome and explanatory variables for
    observation j, and p(o_j|X_j,estimates) means the probability that outcome
    o_j is observed conditional on the values of X_j observed along with the
    estimated coefficients.

    The log-likelihood reported by Stata is just the log of the above.  Thus,
    the geometric average of p(o_j|X_j,estimates) is just
    exp(overall_log_likelihood/number_of_observations).

    The above works for ordered logit, logit, and other discrete models,
    but you cannot interpret the likelihoods of continuous models as
    probabilities; they are densities.  Make the calculation above for some
    continuous models and data combinations, and you will get a result
    greater than 1.  There is nothing wrong with making an average density
    calculation; you just have to know how to interpret what you get.
    --------------------------------------------------------------------------
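
To make the aside concrete, you can rebuild the log likelihood from the
predicted probabilities yourself.  The sketch below assumes the outcome
profribalt is coded 1 through 8; I do not know the actual coding, so treat
the category values as illustrative:

* one predicted probability per outcome category
predict double p1 p2 p3 p4 p5 p6 p7 p8, pr
* pick out the probability of the category actually observed
gen double pobs = .
forvalues k = 1/8 {
    quietly replace pobs = p`k' if profribalt == `k'
}
gen double lnp = ln(pobs)
summarize lnp, meanonly
* the sum of the log probabilities reproduces e(ll); the exponentiated
* mean is the geometric average discussed above
display "log likelihood = " r(sum)
display "geometric mean = " exp(r(mean))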

Okay, so that was the first thing I did to reassure myself that Mario's model
was okay.  Actually, that was the second thing.  The first thing I did was
scan the output, checking the reported standard errors and z statistics to
be sure that no standard error was reported to be absurdly small (and no z
absurdly large, because z = coef/se).  At that point, I was just looking for
signs of calculations gone awry, such as a standard error equal to . or
1e-300, or significance levels of . or 1e+300.

Like Mario, the only bothersome result I saw was the LR chi2, and the
calculation I did above reassured me that this is just one of those cases
where the LR chi2 produces unbelievable values.

Concerning that, let me remind Mario:  as a first approximation, STATISTICAL
ESTIMATES PROVIDE A THEORETICAL LOWER BOUND ON THE LEVEL OF UNCERTAINTY.  The
statistical results are exactly right if all the assumptions are met, but you
know you do not believe that.  Is your specification correct?  Is it really
education squared and not, say, education to the 2.1 power?  Is the
distribution of the outcome really the logistic and not something else?
Uncertainty is lurking everywhere you look, and our model-summary statistics
measure only the role of chance conditional on our assumptions.  For small
sample sizes, one can quite reasonably argue that the uncertainty we do
measure is the most important.  As sample sizes get larger, the relative role
of these other uncertainties becomes more important.


Concerning Mario's second question, 

> 2) How to correctly interpret the coefficients in explaining my dependent
>    variable?

the first thing to say is obvious:  positive coefficients increase the chances
that the subject will be observed in a higher category, and negative
coefficients increase the chances that the subject will be observed in a lower
category.

Actually, ordered logit can be interpreted much like logit and, on that score,
it is unfortunate that Stata does not output the exponentiated coefficients.
They are easy enough to calculate yourself, but here's a trick for getting
Stata to calculate them for you:

ologit ...
mat b = e(b)            // copy the coefficient vector
mat v = e(V)            // copy the variance-covariance matrix
est post b v            // repost them as the current estimation results
est di, eform(OR)       // redisplay with exponentiated coefficients

If you do that, ignore the output for the cutpoints.

Anyway, let's begin with logit.  The exponentiated coefficients in logit can
be interpreted as odds ratios for a 1-unit change in the corresponding
variable.  The emphasis here is on ratio:  exp(b) is the odds conditional on
x+1 divided by the odds conditional on x.  exp(b) = 1.5 means the odds
increase 50 percent if x increases by 1.
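
With Mario's results in memory, for example, the odds ratio for a 1-unit
increase in edtime is just the exponentiated coefficient:

* odds ratio for a 1-unit change in edtime
display exp(_b[edtime])

which works out to exp(.645283), or about 1.91.  (Remember that edtime2 is
also in the model, so a real 1-unit change in education moves edtime2 too.)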

The ordered-logit model is also known as the proportional-odds model.  Let's
call the outcome variable Y.  In this model, if one considers odds(k) =
P(Y<=k)/P(Y>k), then odds(k_1)/odds(k_2) is the same constant for every
subject, no matter the values of the explanatory variables.  An implication
of this is that exponentiated coefficients can be thought of as the odds
ratio of being in a higher category for a one-unit change in the variable.

You may find this logic transparent, but I admit I find it confusing.  I have 
to think really hard, think I get it, and then get confused again.  So let 
me tell you another way to think about it.

Let's put aside the ordered logit model for a minute.  We have, let us assume,
eight outcomes.  We could analyze this data by looking at the probability of
being in outcome Y==1 versus outcomes Y==2, Y==3, and so on.  We could just
use ordinary logistic regression to do that:

gen outcome = Y>1             // 1 if Y is 2, 3, ..., 8; 0 if Y==1
logistic outcome ...

That would be an inefficient way of analyzing our data, but we could do that.

Similarly, we could analyze our data by looking at the probability of being
in outcomes Y==1 or Y==2 versus Y==3, Y==4, and so on:

gen outcome2 = Y>2            // 1 if Y is 3, 4, ..., 8; 0 if Y==1 or Y==2
logistic outcome2 ...

And similarly we could look at the five other binary-outcome analyses:
Y<=3 versus Y>3, Y<=4 versus Y>4, Y<=5 versus Y>5, Y<=6 versus Y>6, and Y<=7
versus Y>7.

Ordered logit amounts to doing just that, but adds the constraint that the
coefficients from each individual analysis are equal while leaving the 
intercepts free to vary.   Thus, ordered logit is logistic regression, and 
I can interpret ordered logit in exactly the same way as I interpret 
ordinary binary-outcome logistic regression.
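
You can watch the constraint at work by running the seven binary analyses
yourself (a sketch, assuming Y is coded 1 through 8 and, purely for
illustration, a single explanatory variable x1):

* fit each binary collapse of Y; under the proportional-odds constraint,
* the slope estimates should be roughly equal across the seven fits
forvalues k = 1/7 {
    quietly gen byte out`k' = Y > `k'
    quietly logit out`k' x1
    display "Y>`k':  coefficient on x1 = " %9.4f _b[x1]
}

Ordered logit replaces those seven separate slopes with a single common
slope; the seven freely varying intercepts become the cutpoints.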

And thus, exponentiated coefficients are odds ratios of the odds of being Y==2,
3, ..., 8 vs. Y==1, and they are odds ratios of being in Y==3, 4, ..., 8 vs.
Y==1 or 2, and so on.

-- Bill
wgould@s...
*
*   Help is available at
*   http://www.stata.com/support/statalist/faq
