  • Right skewed 5-point Likert item: Transform or median split?

    Hello,

    I am interested in your opinion regarding the following:
    I have a 5-point Likert item which is left skewed. Using ladder and gladder indicates a cubic transformation as the prime choice.
    Alternatively, it is possible to perform a median split in order to dichotomize the variable into 0/1.
    The final variable will be used as a dependent variable within a regression model.

    The question that I have here is which procedure would be "better"?
    Transforming the variable makes it a bit uncomfortable (for me) to interpret in a regression model.
    However, performing a median split to run a logistic regression model appears to me a bit troublesome:
    Here, I am wondering what you would do with those values that lie exactly at the median? Would they be included
    in the 0 or the 1 category? Or are there other ways to deal with it?

    I am thankful for your opinions.

    Andreas

  • #2
    Have you considered ordered logistic regression as an alternative to both of the above?

    Also, how is the kurtosis? If that's not horrendous, how about just using the values as-is in your linear regression? My understanding is that skew is not your worst enemy.

    Coarsening of data into two categories doesn't seem as if it would usually benefit interpretation, although I've seen a visual analog scale where all the markings ended up at either one end or the other.
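    A minimal Stata sketch of both suggestions, assuming the Likert item is called y and the predictors are x1 and x2 (placeholder names):

    ```stata
    * Ordered logistic regression: treats y as ordinal, so no
    * symmetry assumption about the response categories
    ologit y x1 x2

    * Or use the 1-5 values as-is in a linear regression;
    * inspect skewness and kurtosis first
    summarize y, detail
    regress y x1 x2
    ```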



    • #3
      I commend Joseph's advice and would go a little further. First, splitting the data at the median is just throwing away information, and doing so arbitrarily.

      A transform of such an ordered scale is in my view essentially pointless if not meaningless. The data appear skewed when the grades are taken literally, no doubt, but no method I know of for ordered (ordinal, graded) responses assumes even symmetric distributions.

      Conversely, correspondence analysis is one technique for letting the data indicate what metric scores are appropriate with a definite rationale. Otherwise we have had techniques tailor-made for such data for about 50 years, so they commend themselves.



      • #4
        I agree with Joseph that ordered logistic regression may work. I also share Nick's reluctance to "just throw away information." Such data are often expensive to collect, and it's wasteful to discard information we could model.

        I would test the proportional odds assumption, which is often violated in the literature when using ologit. The SSC package omodel provides such a test. I have no basis for this beyond personal experience and suspicion when a Likert item is highly skewed, but I'd venture that the assumption is violated here.

        Two options in the Stata environment are the gologit2 package by Richard Williams and slogit (stereotype logistic regression), which is mentioned briefly in Long and Freese's book on categorical data analysis (2nd edition). I recommend it highly for understanding some of these assumptions that are often violated in practice (link to Stata Press for the most recent edition of Long & Freese). Note that slogit assumes the Likert item bins a continuous underlying latent distribution (like a histogram), so keep that in mind.

        Again, it's a hunch, but I'm willing to bet ologit assumptions won't be met, so you may try the other options.
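        Sketching those options in Stata (variable names are placeholders; omodel and gologit2 are user-written packages and must be installed first):

        ```stata
        * Install the user-written packages from SSC
        ssc install omodel
        ssc install gologit2

        * Fit ologit and test the proportional odds assumption
        omodel logit y x1 x2

        * If the assumption fails, relax it with generalized ordered
        * logit; autofit relaxes it only where a test rejects parallel lines
        gologit2 y x1 x2, autofit

        * Stereotype logistic regression (official Stata command)
        slogit y x1 x2
        ```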

        A final thought (try to make a scale): Likert-items are often used in scales. Why not construct a scale that is a better measure of the underlying construct you are trying to measure? If that's possible, I'd go with that. In this latter case, you're adding information to contribute to the outcome, rather than throwing it away. Far preferable in my book!

        - Nate
        Nathan E. Fosse, PhD
        [email protected]



        • #5
          I would go a step further and say that skewness is not defined for your data. While it is not directly mentioned in http://www.itl.nist.gov/div898/handb...on3/eda35b.htm, there is an implicit assumption that the data are measured on at least an interval scale. Because you cannot assume that the distance between the integers is equal, you can't truly quantify the difference in values (e.g., 3 - 2 may not be the same as 2 - 1).

          If you only have a single item, it may be worth considering a phantom latent variable (Rindskopf, 1984) for the dependent variable and regressing that on your right-hand-side variables. Interpretation can become a bit easier, since you are essentially forcing a representation of that variable into an N(0, 1) context.

          If that isn't palatable and you have more items, use the IRT capabilities in the new Stata to construct a measure from those items and then use that derived scaled score. And if that doesn't work, ologit and/or gologit2 (available from SSC) would be my next choices. With the ordinal logistic models you also need to consider whether the proportional odds assumption is reasonable in your context, so you can use the appropriate ordinal logistic model.

          Rindskopf, D. (1984). Using phantom and imaginary latent variables to parameterize constraints in linear structural models. Psychometrika, 49(1), 37-47.
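          For the multi-item route, a sketch using Stata's official irt suite (Stata 14 or later), assuming several Likert items item1-item5 (hypothetical names):

          ```stata
          * Fit a graded response model to the ordinal items
          irt grm item1-item5

          * Predict the latent trait score for each respondent
          predict theta, latent

          * Use the derived score as the dependent variable
          regress theta x1 x2
          ```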



          • #6
            Thanks everyone for your helpful contributions. What became clear is that median splits are essentially a bad idea. I will probably end up using ologit together with omodel.
