  • Interval Scaled Independent Variables

    Hello everyone,

    I know, this is not a question referring to Stata directly, but maybe someone can help me out nevertheless. Thank you in advance!
    I want to conduct a logistic regression, predicting the buying intent with the variables income, age and household size.
    In my data however, those three independent variables are (ordinally?) scaled, as follows:

    income: 0-10000;10000-20000;20000-30000;30000-40000;40000-50000;50000-60000;60000-70000;70000-80000;80000-90000;90000-100000
    age: <20;20-30;30-40;40-50;50-60;60-70;>70
    Household size: 1;2;3;4;5;>5

    How do I correctly transform these variables so that I can use them in my regression?
    My approach was to create a variable for age that ranges from 1 to 7, a variable for income that ranges from 1 to 10, etc., representing each respective interval.

    Is this even a correct way to do this? Maybe someone here can help me.

    Best regards

    Chris

  • #2
    When data are presented in disjoint intervals, that is not interval scale in the sense of S.S. Stevens and his classification into nominal, ordinal, interval and ratio scales of measurement. That classification is contentious and has proved confusing in many contexts, but here it's a useful prop for spelling out that all three variables you mention are indeed ordinal as presented.

    The integer codes you've used seem fine. Make sure that you define value labels too and use factor variable notation in feeding those "independent variables" (I strongly recommend almost any other alternative term, such as predictors) to your chosen model(s).
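
    A minimal sketch of what that advice might look like in practice. The variable names follow the command posted later in the thread; the label names (age_lbl, hh_lbl) and the assumption that the codes run from 1 upward in category order are illustrative only. Income would be handled the same way.

    ```stata
    * Sketch only: age_lbl and hh_lbl are hypothetical label names, and the
    * codes 1-7 and 1-6 are assumed to follow the category order in post #1.
    label define age_lbl 1 "<20" 2 "20-30" 3 "30-40" 4 "40-50" 5 "50-60" 6 "60-70" 7 ">70"
    label values Age age_lbl

    label define hh_lbl 1 "1" 2 "2" 3 "3" 4 "4" 5 "5" 6 ">5"
    label values Householdsize hh_lbl

    * The i. prefix enters each category as its own indicator and
    * omits one base category automatically.
    logit Buyingintent i.Age i.Householdsize i.Income
    ```

    With value labels attached, the output shows the category text (for example "20-30") rather than the bare integer codes.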



    • #3
      Thank you very much for your response, Mr. Cox! So by using factor variable notation you mean the following?
      Code:
      logit Buyingintent gender i.Age i.Householdsize Livingsituation Backpain i.Income
      The other predictors are binary, so this should work without factor variable notation, right?

      Best regards

      Chris



      • #4
        I'd still go
        Code:
        i.gender
        Please don't call me Mr. as Nick Cox is fine for all purposes here.



        • #5
          Okay. Thank you Nick!
          Just out of curiosity, wouldn't it also work in my case not to run the variables as dummies, but to treat them like continuous variables? In my understanding, a statement about the effect of increasing age, for example, could still be made in this case. When using factor variables, my sample is significantly reduced by the automatic treatment of the "dummy variable trap".

          Best regards

          Chris



          • #6
            That’s a quite different model.

            You don’t reduce the sample size by specifying any predictor through indicator variables.
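
            One way to see both points is to fit the two specifications side by side and compare the estimation sample sizes. This is a sketch using the variable names from post #3, not a recommended final model; the stored-estimate names (categorical, linear) are arbitrary.

            ```stata
            * Ordinal codes entered as categories: one indicator per level, base omitted
            logit Buyingintent i.gender i.Age i.Householdsize i.Income
            display e(N)
            estimates store categorical

            * Same codes treated as continuous: a single slope per predictor, which
            * assumes each step up a category shifts the log odds by the same amount
            logit Buyingintent i.gender c.Age c.Householdsize c.Income
            display e(N)
            estimates store linear

            * The linear model is nested in the categorical one, so a likelihood-ratio
            * test is one way to check whether the equal-step assumption is tenable
            lrtest categorical linear
            ```

            The two values of e(N) should normally match. If they differ, look for a note in the logit output saying that a category predicts the outcome perfectly; in that situation logit drops the affected observations, and lrtest will refuse to compare the two fits.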
