Interval Scaled Independent Variables

Christopher Weber

Join Date: Nov 2021

Posts: 42
#1

02 May 2022, 05:17

Hello everyone,

I know this is not a question about Stata directly, but maybe someone can help me out nevertheless. Thank you in advance!
I want to run a logistic regression predicting buying intent from the variables income, age and household size.
In my data, however, those three independent variables are (ordinally?) scaled, as follows:

income: 0-10000;10000-20000;20000-30000;30000-40000;40000-50000;50000-60000;60000-70000;70000-80000;80000-90000;90000-100000
age: <20;20-30;30-40;40-50;50-60;60-70;>70
Household size: 1;2;3;4;5;>5

How do I correctly transform these variables so that I can use them in my regression?
My approach was to create a variable for age that ranges from 1 to 7, a variable for income that ranges from 1 to 10, etc., with each value representing the respective interval.

Is this even a correct way to do this? Maybe someone here can help me.

Best regards

Chris
Nick Cox

Join Date: Mar 2014

Posts: 35239
#2

02 May 2022, 05:48

When data are presented in disjoint intervals, that is not interval scale in the sense of S.S. Stevens and his classification into nominal, ordinal, interval and ratio scales of measurement. That classification is contentious and has proved confusing in many contexts, but here it's a useful prop for spelling out that all three variables you mention are indeed ordinal as presented.

The integer codes you've used seem fine. Make sure that you define value labels too and use factor variable notation in feeding those "independent variables" (I strongly recommend almost any other alternative term, such as predictors) to your chosen model(s).
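
A minimal sketch of that advice, assuming the integer-coded variables are named Age, Householdsize and Income and the outcome Buyingintent (the names used in the command posted in #3). The label names agelbl and hhlbl are invented for illustration, and the income codes would be labelled in the same way.

```stata
* Sketch only: assumes the dataset is in memory and that Age, Householdsize
* and Income already hold the integer codes described in #1.
* agelbl and hhlbl are hypothetical label names.
label define agelbl 1 "<20" 2 "20-30" 3 "30-40" 4 "40-50" 5 "50-60" 6 "60-70" 7 ">70"
label values Age agelbl

label define hhlbl 1 "1" 2 "2" 3 "3" 4 "4" 5 "5" 6 ">5"
label values Householdsize hhlbl

* Check that codes and labels line up before modelling
tabulate Age, missing

* The i. prefix asks Stata to treat each predictor as categorical
logit Buyingintent i.Age i.Householdsize i.Income
```

With value labels attached, the regression table shows the interval text rather than the bare codes, which makes the output easier to read.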
Christopher Weber

Join Date: Nov 2021

Posts: 42
#3

02 May 2022, 06:45

Thank you very much for your response, Mr. Cox! So by using factor variable notation you mean the following?

Code:

logit Buyingintent gender i.Age i.Householdsize Livingsituation Backpain i.Income

The other predictors are binary, so this should work without factor variable notation, right?

Best regards

Chris
Nick Cox

Join Date: Mar 2014

Posts: 35239
#4

02 May 2022, 08:39

I'd still go

Code:

i.gender

Please don't call me Mr. as Nick Cox is fine for all purposes here.
Christopher Weber

Join Date: Nov 2021

Posts: 42
#5

02 May 2022, 11:59

Okay. Thank you, Nick!
Just out of curiosity, wouldn't it also work in my case not to run the variables as dummies, but to treat them like continuous variables? In my understanding, a statement about the effect of increasing age, for example, could still be made in this case. When using factor variables, my sample is significantly reduced by the automatic treatment of the "dummy variable trap".

Best regards

Chris
Nick Cox

Join Date: Mar 2014

Posts: 35239
#6

02 May 2022, 14:17

That’s a quite different model.

You don’t reduce the sample size by specifying any predictor through indicator variables.
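
Both points can be checked with a small sketch on simulated data. Everything below is illustrative and is not the poster's dataset: the seed, the sample size and the coefficients in the data-generating line are arbitrary assumptions.

```stata
* Simulated data, purely illustrative
clear
set seed 2022
set obs 500
generate Age = runiformint(1, 7)
generate Buyingintent = runiform() < invlogit(-1 + 0.2*Age)

* Age as categorical: one coefficient per level, with one level as the base
logit Buyingintent i.Age
display "N with i.Age: " e(N)

* Age as continuous: a single slope, which assumes each step up in Age
* shifts the log odds by the same amount
logit Buyingintent c.Age
display "N with c.Age: " e(N)
```

Both fits should report the same number of observations, while the coefficient tables differ: several contrasts against a base level in the first model, one slope in the second. If N does shrink on real data, the notes Stata prints above the logit table (for example about missing values or perfectly predicted outcomes) are worth reading.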