
  • Question on how Stata calculates p values in logistic regression

    Hello.
    I have a dataset of about 100 patients with 25 events. I am doing analysis of patient characteristics which are correlated with the occurrence of the event. I divided the patients into those who had the event and those who didnt (in the usual table 1 fashion), and compared the continuous variables with a rank-sum test given that when i did tests for normality (sktest, swilk) the p values were suggestive of a significantly non-normal distribution (around 0.005 or so). For some of the variables I obtained a significant p value using the rank-sum test (~0.04).but when i do a logistic regression to calculate odds ratios, the p value in logistic regression is not significant (~0.06). I realize that this is probably due to the fact that both results are of borderline significance, and 1 test is falling a little below 0.05, while the other falls just a bit above 0.05, but it seems strange to report a significant difference in table 1 using the rank sum test, but then have to say that there was a non-significant difference when using univariate odds ratios. Of note, if I use the t-test to compare the variables, the p values are just above 0.05 and seem to be close to those in the logisitc regression.

    Any thoughts on how best to approach this? Should I just bite the bullet and say the rank-sum test was significant but the logistic regression was not? Should I switch to a t-test even though the data probably aren't normally distributed? Is there a different method of using logistic regression on a continuous variable that I should be using?

    Thanks for your help.

    j

  • #2
    Responding directly to the question implied by your subject heading: Stata calculates
    p-values for logistic regression by completely conventional means. The difficulty here
    is with your understanding of the underlying procedures--and I don't mean that unkindly.

    You said:

    >Hello. I have a dataset of about 100 patients with 25 events. I am doing analysis of patient characteristics
    >which are correlated with the occurrence of the event. I divided the patients into those who had the event and
    >those who didnt (in the usual table 1 fashion), and compared the continuous variables with a rank-sum test
    >given that when i did tests for normality (sktest, swilk) the p values were suggestive of a significantly non-
    >normal distribution (around 0.005 or so).

    As a caveat, I have to say that "the usual Table 1 fashion" is not a Stata locution with which I am familiar.
    That said: what you describe doing here is probably not a good idea, since I presume
    the event is an *outcome.* The extent to which an outcome can predict some X variable, which
    is what you are examining statistically here, is quite a different thing from whether some X variable predicts
    an outcome. (Classic example: imagine predicting whether a person dies, given that s/he has been hanged,
    as opposed to predicting that s/he has been hanged, given that s/he has died.) Also, many of us (depending
    on research area and discipline) would view the rank-sum test as a rather 20th-century way to deal with
    non-normality.

    >when i do a logistic regression to calculate odds ratios, the p value in logistic
    >regression is not significant (~0.06). I realize that this is probably due to the fact that both results are of
    >borderline significance,

    See above. Also, even if you had not switched the role of response and explanatory variables here, as compared
    to your first analysis, please note that different procedures will, in general, give different p-values.


    >Any thoughts on how to best approach this? Should I just bite the bullet and say the rank-sum test was
    >significant but logistic regression was not?

    Again, in many disciplines, the idea that a p-value falling marginally above or below some ritual cut-off
    marks the difference between a real effect and no effect is no longer well regarded. My impression
    is that this is especially true in epidemiology. I'd use confidence intervals, which avoid that arbitrariness.
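
    As a minimal sketch of what I mean (the variable names `event` and `age` here are hypothetical, standing in for your outcome and a continuous characteristic):

    ```stata
    * Report the odds ratio with its 95% confidence interval,
    * rather than a bare significant/not-significant verdict.
    logistic event age
    * -logistic- displays odds ratios with 95% CIs by default;
    * report the interval itself instead of comparing p to 0.05.
    ```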

    >Should i switch to a t-test even though the data probably arent
    >normally distributed?

    No, see my comments above presuming that the event is regarded as a response (outcome variable). If
    there is some reason to persist in this choice of the event as a predictor, I'd suggest you consider
    -bootstrap- or -permute- with -ttest- or -regress- as a way to avoid the normality assumption.
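
    A sketch of that resampling idea, assuming a 0/1 group variable `event` and a continuous characteristic `x` (both names hypothetical):

    ```stata
    * Bootstrap the difference in group means; -ttest- with by()
    * stores the group means in r(mu_1) and r(mu_2).
    bootstrap diff=(r(mu_2)-r(mu_1)), reps(1000) seed(12345): ///
        ttest x, by(event)

    * Or a permutation test on the t statistic, shuffling group labels:
    permute event t=r(t), reps(1000) seed(12345): ttest x, by(event)
    ```

    Either way the inference no longer leans on the normality assumption.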

    >Is there a different method of using logistic regression on a continuous variable that I
    >should be using?

    The only thing I might suggest is that given a relatively small (not tiny) number of events and
    sample size, you might want to use the exact logistic procedure (-exlogistic-) rather than
    -logit- or -logistic-, or use -bootstrap- or -permute- .
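
    For instance (again with hypothetical names `event` and `x`):

    ```stata
    * Exact logistic regression, suited to small numbers of events:
    exlogistic event x

    * For comparison, the usual asymptotic fit:
    logistic event x
    ```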

    Regards,

    Mike Lacy



    • #3
      I want to touch on the statement "in the usual table 1 fashion." If by this you are referring to, e.g., medical journals, where Table 1 is often a comparison of two groups, note that in those articles the two groups are defined by differing treatments, not by whether they had the event. If you are referring to something common in your discipline, please tell us what that is and maybe give some example citations.



      • #4
        Hi Mike and Rich.
        Thank you for your replies. I apologize that my explanation of the issue wasn't clear.

        I'm looking at patients who undergo a relatively uncommon medical procedure and then have an "event" post-procedure: the presence or absence of a specific lab abnormality, which is assessed in each patient. Table 1 is a table of baseline characteristics stratified by the lab abnormality:

               | + lab | - lab |
        age    |       |       |
        %male  |       |       |
        weight |       |       |

        etc...

        I want to look at patient characteristics that are correlated with the development of the lab abnormality after the procedure. Is it not appropriate to assess the association of patient characteristics by testing differences in each of these factors in univariate analyses, and then put selected characteristics in a multivariable logistic model to see what remains independently associated?

        Since I only have 25 patients with + lab values and about 75 with - lab values, I'm not sure which is the best test to use for a sample of this size. I had actually been using exact logistic regression to calculate univariate ORs, but Stata runs out of memory when more than a few variables are included in an exlogistic model, even when I increase the memory to the maximum of 2 GB.

        I'm trying to rationalize how I can have a significant p-value when I compare the groups in the above table with something like the rank-sum test, but a non-significant p-value from the exlogistic or logistic model. I understand your point that a p-value of 0.05 is somewhat arbitrary, that the difference between a p of 0.04 and 0.07 is really not important, and that the p-values of different tests are calculated in different ways, but I'm trying to work out how to present my results given that a p of 0.05, for better or worse, is considered "significant." Should I say that there was a significant difference using the rank-sum test which was just outside statistical significance when ORs were calculated? Or should I be looking for a test that is "better" for the relatively small population in the dataset? In other words, in cases where the p of the rank-sum test and the p of the logistic or exlogistic regression don't agree, does this mean that one of the tests isn't appropriate for the data?

        In this specific example, the rank-sum p and the logistic regression p agree for all but one variable (age), which is continuous and has a rank-sum p of 0.03 but a p from logistic or exlogistic regression of ~0.07. If I instead use a t-test, which perhaps I SHOULDN'T given that it's not clear the data meet the test's assumptions (relatively small numbers and no clearly normal distribution), I get a p of 0.06, so the t-test and logistic regression p-values are both on the same side of "significance."

        I'm looking for guidance in dealing with this. From your post it seems that neither the t-test nor the rank-sum test is a "good" option, but I'm not sure what else to use, or what the most statistically appropriate test is for this relatively small dataset.

        Thank you for your help. I have no formal training in statistics, and have been trying to learn this on my own (which is not easy).

        j



        • #5
          The example table came out strange above.


                 | + lab | - lab |
          age    |       |       |
          %male  |       |       |
          weight |       |       |



          • #6
            The tests make different assumptions, so it is not surprising that they reach slightly different conclusions in a borderline case. Life would be simpler if you either (a) got another 20 or 30 cases, (b) used the .10 level of significance, or (c) used one-tailed tests (assuming such are justified and the direction of the differences is right). If it were me, I'd probably just say the tests are borderline with regard to significance/insignificance.

            To keep your table from looking weird, toggle the advanced editor (the underlined A on the right-hand side as you write messages) and use the code option.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://www3.nd.edu/~rwilliam



            • #7
              Jon,

              You asked, "Is it not appropriate to look at the association of patient characteristics by assessing differences in each of these factors in univariate, and then put select characteristics in a multivariable logistic model to see what remains independently associated?" In a word, No. You should skip your "Table 1" entirely. As Mike Lacy explained, you are defining the two groups by the outcome (whether the lab abnormality developed). If you want to assess the extent to which a continuous patient characteristic predicts the outcome, you can start with a logistic regression (or an exact logistic regression) in which that patient characteristic is the only predictor (besides the constant). Some of the sort of model-building that one does in ordinary regression may be appropriate (e.g., transforming the predictor to promote linearity in the log-odds scale), but whether the distribution of the patient characteristic resembles a normal distribution is probably not important (I would usually not check).

              One approach, discussed by Hosmer and Lemeshow (2000, Section 4.2), fits a simple logistic regression (or the equivalent) for each candidate predictor. Then "any variable whose univariable test has a p-value < 0.25 is a candidate for the multivariable model along with all variables of known clinical importance."
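
              That screening step might be sketched like this (the predictor names here are hypothetical placeholders for your candidate variables, with `event` as the outcome):

              ```stata
              * Fit a univariable logistic regression for each candidate
              * predictor and display its Wald p-value.
              foreach v of varlist age weight male {
                  quietly logit event `v'
                  local p = 2*normal(-abs(_b[`v']/_se[`v']))
                  display "`v': p = " %6.4f `p'
              }
              * Predictors with p < 0.25, plus variables of known clinical
              * importance, are candidates for the multivariable model.
              ```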

              I hope this discussion is helpful.

              David Hoaglin

              Hosmer DW, Lemeshow S (2000). Applied Logistic Regression, second edition. John Wiley & Sons. (I do not have the more recent third edition.)



              • #8
                Thank you for your thoughtful responses.
                So if I understand you correctly, since the goal of the analysis is prediction, then I should just use the logistic or exact logistic model p value and skip worrying about p values calculated by other tests?



                • #9
                  Yes, I'd use the logistic procedure here, or some other statistical approach that lets you examine how event frequency varies, conditional on your continuous variable. The issue is not quite that "the goal of the analysis is prediction." (I used the language of "prediction" for convenience.) Rather, it is that many (perhaps most) statistical procedures are designed to model a situation in which an outcome variable causally depends on one or more other variables, so you will want to pick a procedure that fits the causal order that makes the most sense for your data. An example like my hung/dead one above would be instructive: Suppose you were interested in whether ingesting quantities of methanol resulted in blindness. You would not compare blind and sighted people to see whether their mean consumption of methanol differed over the last several days. Rather, you'd want to see how the relative frequency of blindness varied across different levels of methanol consumption. The general principle here is: If you want to see whether X causes Y, see whether Y varies as a function of X, not the reverse. The issue here is as much a matter of logic as one of statistical procedure.

                  Keeping in mind that the issue is "logic": as a beginner in statistics, why not try something less fancy but logically appropriate? You might categorize your continuous variable into (say) 5 groups, and compare the odds on "event" by a simple crosstabulation with this categorized variable. The logistic model is just a fancier way of seeing the same thing that such an analysis would reveal.
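
                  For instance, a sketch with a hypothetical continuous variable `x` and 0/1 outcome `event`:

                  ```stata
                  * Cut x into quintiles and look at event frequency by group.
                  xtile x5 = x, nq(5)
                  tabulate x5 event, row

                  * Or odds ratios for each quintile relative to the lowest:
                  logistic event i.x5
                  ```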

                  Regards,
                  Mike



                  • #10
                    The problem is likely to lie in the non-normal distributions. For normally distributed data, the t-test is generally more powerful than the corresponding rank-sum test (after all, you are throwing data away when you use ranks instead of actual values), and approximately as powerful as the logistic regression.

                    To get the power back, you need to transform for normality. Commands such as -ladder-, -gladder-, -qnorm- and -boxcox- will help you find out what transformation to use.
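
                    For example, with a hypothetical continuous variable `x`:

                    ```stata
                    * Diagnostics for choosing a normalizing transformation:
                    ladder x    // sktest normality results for each ladder-of-powers transform
                    gladder x   // histograms of x under each transformation
                    qnorm x     // normal quantile plot of x as-is
                    ```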
