Normalising a variable to be between 0 and 1

Chris Rooney

Join Date: Apr 2014

Posts: 167
#1


27 Oct 2015, 06:19

Hi all,

I have four variables, each with values ranging between -2.5 and 2.5.

I would like to alter the values of the variables so they are between 0 and 1.

How do I go about doing that?!
Nick Cox

Join Date: Mar 2014

Posts: 35696
#2

27 Oct 2015, 06:47

If that is exactly correct, this is simple algebra,

Code:

replace x = (x + 2.5)/5

generalised with a foreach loop.

My instinct is always to leave original data exactly as they come and to create a new variable.

Code:

foreach v in frog toad newt dragon {
    gen `v'2 = (`v' + 2.5)/5
    label var `v'2 "`v' scaled to [0,1]"
}
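
As a hedged variation on the loop above, the limits can be kept in local macros so the same code serves other ranges, with a check that nothing falls outside [0, 1]. The `_01` suffix and the macro names `lo` and `hi` are illustrative choices, not part of the original post, and the sketch assumes -2.5 and 2.5 really are the limits in principle.

```stata
* Sketch: linear rescaling from assumed limits [lo, hi] to [0, 1]
local lo = -2.5
local hi = 2.5
foreach v in frog toad newt dragon {
    * new variable; the original is left untouched
    generate double `v'_01 = (`v' - (`lo')) / (`hi' - (`lo'))
    label variable `v'_01 "`v' scaled to [0,1]"
    * stops with an error if any nonmissing value lies outside [0, 1]
    assert inrange(`v'_01, 0, 1) if !missing(`v'_01)
}
```

If the `assert` fails, the stated limits do not hold for the data, which is worth knowing before any rescaled variable is used.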
Brendan Cox

Join Date: May 2015

Posts: 36
#3

27 Oct 2015, 07:29

Or more generally,

Code:

foreach v of varlist ... {
    qui summ `v'
    gen `v'2 = (`v' - r(min)) / (r(max) - r(min))
}

Note that per Nick's comment, the above code assumes the desired scaling is based on the actual minimum and maximum values in the data -- i.e., that you want the data to be rescaled to [0,1] based on observed values, not exogenous definitions of the range. Nevertheless, I hope the more general formula & link are useful.

See https://en.wikipedia.org/wiki/Feature_scaling

Last edited by Brendan Cox; 27 Oct 2015, 08:06.
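
A slightly more defensive sketch of the same min-max idea follows. It assumes the four variable names used earlier in the thread, uses a hypothetical `_mm` suffix, and skips any variable whose observed range is zero, where the formula would divide by zero and produce only missing values.

```stata
* Sketch: min-max scaling on observed extremes, skipping constant variables
foreach v of varlist frog toad newt dragon {
    * meanonly is enough to obtain r(min) and r(max)
    quietly summarize `v', meanonly
    if r(max) > r(min) {
        generate double `v'_mm = (`v' - r(min)) / (r(max) - r(min))
        label variable `v'_mm "`v' min-max scaled on observed range"
    }
    else {
        display as text "`v' skipped: no variation, or no nonmissing values"
    }
}
```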
Nick Cox

Join Date: Mar 2014

Posts: 35696
#4

27 Oct 2015, 07:42

Brendan's method (*) may not be quite the same. For example, the definition of varying from -2.5 to 2.5 may mean that in principle these are the limits. If the extremes in the data are not the same, then the scaling will be different.

(*) We're not related. Or at least, I don't know that we are.
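
To make the difference concrete, here is a small arithmetic sketch for a hypothetical case in which the limits are -2.5 and 2.5 in principle but the observed values only run from -2 to 2. The observed range is an assumption for illustration, not something stated in the thread.

```stata
* Fixed limits [-2.5, 2.5]: the observed minimum -2 maps to .1
display (-2 - (-2.5)) / (2.5 - (-2.5))
* Observed extremes [-2, 2]: the observed minimum -2 maps to 0
display (-2 - (-2)) / (2 - (-2))
* Fixed limits [-2.5, 2.5]: the observed maximum 2 maps to .9
display (2 - (-2.5)) / (2.5 - (-2.5))
* Observed extremes [-2, 2]: the observed maximum 2 maps to 1
display (2 - (-2)) / (2 - (-2))
```

Under the fixed limits the rescaled data would occupy only [0.1, 0.9], whereas scaling on the observed extremes always fills [0, 1].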
Chris Rooney

Join Date: Apr 2014

Posts: 167
#5

27 Oct 2015, 07:54

Originally posted by Nick Cox

If that is exactly correct, this is simple algebra,

Code:

replace x = (x + 2.5)/5

generalised with a foreach loop.

My instinct is always to leave original data exactly as they come and to create a new variable.

Code:

foreach v in frog toad newt dragon {
    gen `v'2 = (`v' + 2.5)/5
    label var `v'2 "`v' scaled to [0,1]"
}

Thanks. How does one then interpret such values?
Brendan Cox

Join Date: May 2015

Posts: 36
#6

27 Oct 2015, 08:03

Originally posted by Nick Cox

Brendan's method (*) may not be quite the same. For example, the definition of varying from -2.5 to 2.5 may mean that in principle these are the limits. If the extremes in the data are not the same, then the scaling will be different.

(*) We're not related. Or at least, I don't know that we are.

Ah, yes, quite so. I will edit the post accordingly, but hopefully the more generalised formula & link prove useful references.

* Yes, an unfortunate coincidence for you. For me, perhaps it lends an extra (and undeserved) 'je ne sais quoi' to my posts.
Nick Cox

Join Date: Mar 2014

Posts: 35696
#7

27 Oct 2015, 08:10

Coxes everywhere (we're good at survival):

We all have little stories. Long ago, I heard a lecture with an explanation of a statistical problem. (Yes, I could see that was a key problem.) And here is the standard device we use to solve it. (That's smart! I bet the other guys were kicking themselves when they saw that.) And this, of course, is the standard Cox model. (No, not me. But it still sounds great: the standard Cox model.)

That is, naturally, Sir David Cox (1924- ), much and rightly honoured, and again we are not related.
Nick Cox

Join Date: Mar 2014

Posts: 35696
#8

27 Oct 2015, 08:13

Chris:

How does one then interpret such values?

Not sure what you want here. The variables are now scaled to [0, 1], which is what you asked for.
Chris Rooney

Join Date: Apr 2014

Posts: 167
#9

27 Oct 2015, 09:32

Originally posted by Nick Cox

Chris:

Not sure what you want here. The variables are now scaled to [0, 1], which is what you asked for.

True, but is there any reference on how to interpret them when I use these variables in a regression, for example? What does an increase of 0.1 in a variable that is now scaled to [0, 1] mean? Does it mean that a 10% increase in [normalised variable] will result in an x% increase in the dependent variable (if the dependent variable is in percent form)?
Nick Cox

Join Date: Mar 2014

Posts: 35696
#10

27 Oct 2015, 09:44

Forgetting about other predictors, and even the intercept, focus on y = ... + bx + ... where x varies between 0 and 1. The difference between the predicted value for x = 0 and that for x = 1 is precisely b. That's a change over the entire possible range. For any fraction of the range, multiply down.

That doesn't sound like your interpretation, although I don't understand it (what is x in your notation? the coefficient? the predictor?).
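
The "multiply down" point can be written out as a short worked sketch. The symbols a (intercept) and x* (the rescaled predictor) are notation introduced here for illustration, and the rescaling is assumed to be the fixed-limit one, x* = (x + 2.5)/5.

```latex
\begin{align*}
y &= a + b\,x^{*} + \dots, \qquad x^{*} = \frac{x + 2.5}{5} \in [0, 1] \\
\hat{y}(x^{*} = 1) - \hat{y}(x^{*} = 0) &= b \\
\hat{y}(x^{*} + 0.1) - \hat{y}(x^{*}) &= 0.1\,b \\
y &= \left(a + 0.5\,b\right) + \frac{b}{5}\,x + \dots
\end{align*}
```

So a change of 0.1 in the rescaled variable is a move across one tenth of the possible range (0.5 in the original units), not a 10% increase, and it shifts the prediction by 0.1b. The last line shows that the coefficient on the rescaled variable is five times the coefficient on the original variable.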