Possible justifications for treating an ordinal scale variable as continuous?

Cassie Wright

Join Date: Dec 2021

Posts: 44
#1

22 Dec 2021, 12:28

Hello!

I have quite a large data set, collected by the British Electorate Survey in 2017. The dependent and independent variables I am looking at have N ≈ 2100.

I want to look at the relationship between a few IVs and my DV, and whether (if I'm lucky) there is any causal relationship. I'm doing some bivariate analyses and multiple regression with my DV and IVs to see this.

Example variables:

DV: Attitudes towards economy vs environment, 10 point ordinal scale.
IV 1: Age (18-100) ratio variable.
IV 2: Education level, 5 point ordinal scale.
IV 3: Twitter user, dummy variable.

I've read Richard Williams' interesting article on 'Ordinal Independent Variables' and I was wondering, can I apply this logic to my ordinal dependent variable? Furthermore, how could I go about justifying treating ordinal variables as continuous? I thought about perhaps doing two separate sections where I say: here I shall treat my DV as a continuous variable, and vice versa, but I'm not completely sure how to explain my reasoning.

I've been looking on this website for anyone in a similar boat to me, and have heard about ordered logistic regression and PCA, but I'm not completely sure how I would use them.

Lastly, bit of a silly question, but if I do regression between my DV and IV, while treating my DV as categorical, would my code be: regress i.DV IV

Thank you in advance and appreciate any help!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

22 Dec 2021, 13:12

Lastly, bit of a silly question, but if I do regression between my DV and IV, while treating my DV as categorical, would my code be: regress i.DV IV

No, Stata does not allow factor-variable notation in the DV of -regress-. To treat the DV as categorical you cannot use -regress-. You must use either -mlogit- or -ologit- (or some other command that fits regression models to categorical or ordinal variables.)
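As a rough sketch of the commands named above, the three ways of treating the DV might look like this. The variable names econ_env, age, educ, and twitter are hypothetical stand-ins for the variables described in #1, not names from the actual data set.

```stata
* Hypothetical variable names; substitute those in your own data set.

* DV treated as interval-level: linear regression
regress econ_env c.age i.educ i.twitter

* DV treated as ordinal: ordered logistic regression
ologit econ_env c.age i.educ i.twitter

* DV treated as unordered categorical: multinomial logistic regression
mlogit econ_env c.age i.educ i.twitter
```

Factor-variable notation (i. and c.) is allowed on the right-hand side only. Entering education as i.educ treats it as categorical, which is itself a modeling choice.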

Furthermore, how could I go about justifying treating ordinal variables as continuous?

This is actually a subtle question. When you use a command that treats your ordinal variable as an interval-level variable you are implicitly making the claim that the 10 points in the response set are equally spaced from a psychological perspective. As an ordinal variable you claim only that response 5 represents something bigger than response 4, which, in turn, is bigger than response 3. But if you treat it as an interval variable you are saying that the difference between response 5 and response 4 is the same as the difference between response 4 and response 3, and that the difference between response 5 and response 3 is twice as much as either of the first two differences (and so on for all response levels). When might this be justified? It depends on how the questions were asked and the response set that was presented.

If, for example, the response set was presented as a line 10cm long, and there were 10 tick marks, equally spaced, along the line, and the ticks were marked with numbers 1 through 10, and the 1 and 10 marks were also given descriptive terms like "Very strongly disagree" and "Very strongly agree", respectively, then you could argue that, at least on its face, this is an interval measurement. A skeptic might argue that you need to validate that claim by doing a study where that measurement is paired with some other measure that is widely accepted as interval-level and show that a straight-line regression is a good fit to the relationship between those. There are some response sets that are widely, though not universally, accepted as "equally spaced." Here, I'm thinking of Likert and Likert-like response sets such as the 5 point "Strongly Disagree, Disagree, Neither Agree nor Disagree, Agree, Strongly Agree" and the like. Few people will object to treating those as interval responses (although you could get unlucky and have your work reviewed by somebody from the minority that rejects this.) An example of a scale where people would generally not buy treatment as interval-level measurement is one with vague response levels such as "Never, Seldom, Occasionally, Much of the Time, Frequently, Always." Who knows whether those vague pseudo-quantitative phrases are "equally spaced" psychologically or not?

So you have to take a hard look at how these questions were posed and the response set structured. If you have external data validating treatment as interval, then that's great--but that's also very uncommon. Absent external data, it boils down to a judgment of the face validity of the "equally spaced" claim.
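One way to sketch the skeptic's external check described above, assuming a hypothetical data set that holds both the ordinal response (score) and a widely accepted interval-level measure of the same construct (ref). Neither variable exists in the data discussed in this thread, and this is only one of several ways such a check could be run.

```stata
* Hypothetical variables: score (ordinal response), ref (interval-level measure)

* Model 1: straight-line relationship, score entered as continuous
regress ref c.score
estimates store straight

* Model 2: a separate mean of ref for every response level
regress ref i.score
estimates store free

* Likelihood-ratio test: does the unrestricted model fit better than the line?
lrtest straight free
```

A small p-value would suggest that the response levels are not equally spaced relative to ref. A large one is consistent with equal spacing but does not prove it.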
Cassie Wright

Join Date: Dec 2021

Posts: 44
#3

22 Dec 2021, 13:44

Thank you for your response again Dr Schechter, it is much appreciated. It's a very interesting point you have raised, and one I have not considered. I think I will try and first acknowledge this issue, but argue that it is equally spaced, as although the DV is on the scale of 0 'prioritisation for the economy' and 10 'prioritisation of the environment', all the values in between are numerical, rather having labels like a Likert response set. So perhaps the respondents assume that it is on an interval scale? I might just have completely missed the point here so apologies if I have.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

22 Dec 2021, 13:51

I don't think you have missed the point at all. But I am confused by your explanation of the response set. I can't discern whether all the numerical responses from 1 through 9 have a descriptive label or were just left plain. The "rather have" clause is unclear to me. If left plain, and if equally spaced physically on the page, then the presentation strongly suggests equal distancing and I think many would find the case for treating it as an interval-level measurement persuasive. If, however, the responses from 1 through 9 also have descriptive labels, then you would have to consider whether the wording of those responses is clear and also suggests equal distancing. If those 1 through 9 responses are vague or are clear but do not "sound" approximately equally spaced, then I think you would be skating on thin ice if you call it interval-level measurement.
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#5

22 Dec 2021, 13:52

As a general rule, as the number of categories increases, it becomes easier to justify treating the categories as equidistant on the latent scale. As a referee, if the outcome has 10 categories, I would not object to modeling it with a linear model.
Cassie Wright

Join Date: Dec 2021

Posts: 44
#6

22 Dec 2021, 13:55

Sorry about the confusion, the responses from 1 through 9 do not have a descriptive label and are just left as numbers.
Cassie Wright

Join Date: Dec 2021

Posts: 44
#7

22 Dec 2021, 14:02

Thank you for the advice! I'll make note of that.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#8

23 Dec 2021, 05:50

I think this is one of those discussions in which very different stances are all defensible. The response or outcome variable here is from a strict or purist view ordinal but many researchers would happily if not tentatively model it as if it were interval scale and the check on whether that is wise is mostly whether it produces intelligible results.

There are many situations in which this is done.

For example, at my workplace we grade most student assignments on a percent scale with menu points like 0 5 10 15 20 ... 42 45 48 52 55 58 ... 94 100 with scope to go beyond that in some circumstances. The calibration here is through colleagues second-marking each other so that a piece of work gets 84 if people agree that it deserves 84. We would be hard pushed to prove -- in fact we could not prove -- that the difference between 42 and 45 is exactly the same as that between 62 and 65, but we just ignore that difficulty. A measurement purist would say: This does not qualify as interval scale. (Compared with Fahrenheit or Celsius temperature, it fails.)

Even set-ups in which any percent mark may be given depending on academic judgment or a detailed marking scheme (5 points for Question 1 being right, 20 points for Question 2, whatever) do not qualify as interval scale for those bitten by Stanley Smith Stevens' scheme of nominal, ordinal, interval and ratio. (I get black looks for calling this a NOIR scheme.)

Downstream of this grading we take means left, right and centre and people have been doing that since whenever. No doubt there are courses in our University somewhere explaining that this is all quite wrong.

In @Clyde Schechter's territory there are, if I understand correctly, many clinical measures that are similar or even more dubious, not least patients reporting their own pain.

Even with our own auto data, consider repair record.

[Attachment: display of repair record for domestic and foreign cars]

Hands up if you think this is an ordinal scale, so the only defensible summaries of level are median and mode. The medians for domestic and foreign are 3 and 4 as can be checked from the display.

Now hands up if you think that is a little wasteful of the information in the data and would shut the office door and look at means too.
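For readers who want to reproduce this comparison, a minimal sketch using the auto data shipped with Stata, where rep78 is the repair record and foreign distinguishes domestic from foreign cars. It is not necessarily the display referred to above.

```stata
sysuse auto, clear

* Frequencies of repair record by car origin
tabulate rep78 foreign

* Median (the purist's summary) next to the mean, by car origin
tabstat rep78, by(foreign) statistics(n median mean)
```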

With Cassie's data I would do both ordinal logit and plain regression and take it from there.
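A minimal sketch of fitting both models and setting the results side by side. The variable names econ_env, age, educ, and twitter are hypothetical stand-ins for the variables described in #1, not names from the actual data set.

```stata
* Hypothetical variable names; substitute your own.
regress econ_env c.age i.educ i.twitter
estimates store linear

ologit econ_env c.age i.educ i.twitter
estimates store ordinal

* Coefficients side by side with significance stars;
* equations(1) lines up the main equation of each model
estimates table linear ordinal, equations(1) b(%9.3f) star stats(N)
```

The coefficients are on different scales (units of the response versus log odds), so the useful comparison is of signs and of which predictors appear to matter, not of magnitudes.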
Cassie Wright

Join Date: Dec 2021

Posts: 44
#9

23 Dec 2021, 07:36

Thank you for the reply Dr Cox! I really appreciate the time you took to reply. I feel very blessed to have brilliant minds help me out on this dilemma!

If I were to do both of those tests, how should I word my justification? Would I comment that while it is an ordinal variable, there is an argument that I could consider it an interval variable as the values between 1-9 are numerical? Furthermore, what would I gain from doing both tests, rather than just one? That is, if I found any differences, how would I explain them?
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#10

23 Dec 2021, 09:08

In @Clyde Schechter's territory there are, if I understand correctly, many clinical measures that are similar or even more dubious, not least patients reporting their own pain.

Indeed there are. In fact, I have published papers using self-reported pain scores as the outcome variable, and I do treat it as an interval variable in those papers. Now, in these particular studies, the self-reports were obtained using the kind of response set in #2. There was a 10 cm long line, with evenly spaced tick marks at 0 through 10. Zero was labeled as "No pain at all" and 10 as "Worst pain imaginable." There was also a frowny-face at 0 and a smiley-face at 10.

My general attitude towards interval treatment of strictly-speaking ordinal variables is more permissive than it is restrictive, and if that did not come across in my earlier posts, then I apologize for poor communication. But there are several things to consider:

1. If you are writing for publication in a journal, you may face editors or reviewers who take a dim view of this, and you have to be prepared to offer some justification.

2. There are some ordinal variables that really should not be treated as interval. I have seen surveys offering response sets that look like this:
   1. Poor.
   2. Good.
   3. Excellent.
   4. Outstanding.

These terms are somewhat vague to begin with. Moreover, the negative ratings occupy just one position in the response set, whereas three different gradations of the positive are shown. And the difference between 3 and 4 is sufficiently fine-grained that most people would be hard pressed to explain the distinction at all. I think that treating this scale as interval measurement would be misleading. Now, I think most people would also agree that regardless of how it is analyzed, this is just a bad way to pose an evaluation question. And it is probably the case that when response sets are thoughtfully designed, they often end up resembling an interval level measurement.

But there are circumstances where that isn't possible, for example, cancer staging and grading systems, or where for good or not-so-good reasons one chooses to do otherwise. And I cannot recall ever seeing a cancer epidemiology paper in which grade or stage of a tumor was analyzed as an interval variable.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#11

23 Dec 2021, 10:21

Thanks for the thanks in #9. It is harder for me to look ahead about what you might or should be writing. It's not at all clear whether each approach would work well or appear consistent with the other. In fields in which I have published such as geomorphology, hydrology, climatology, Quaternary science and forestry, familiarity with plain regression is typically much greater than with ordinal logit, and the same is true of my own experience. In fields looking at election and voter data, I would expect a more even balance and perhaps even the reverse.

"happily if not tentatively" should perhaps have been "happily but tentatively".

Last edited by Nick Cox; 23 Dec 2021, 10:24.