Rescaling variables before or after reshaping a dataset--

Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#1

18 Jun 2018, 19:07

Hi, Statalist users.
I use Stata 14 SE and am working with a number of continuous variables measured in two waves on the same sample of individuals. I use mixed models, so I often reshape my dataset to long form.
The variables I use in my model (DV) are on different scales, and I wanted to rescale them all to range from 0 to 1.
I noticed that the model results differ depending on whether I rescale before or after reshaping, which I found quite puzzling.
When I rescale before reshaping, the correlation of the original variables with their rescaled versions is no longer 1 in the long form.
Could anyone explain why this happens?
Given the difference in model outputs, which output should I trust?
I am leaning towards rescaling before reshaping.

This is the code I use to rescale:

Code:

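foreach var of varlist x_w1 x_w2 y_w1 y_w2 z_w1 z_w3 {
    * minimum and maximum of each variable, computed separately per wave
    egen min`var' = min(`var')
    egen max`var' = max(`var')
    * min-max rescaling to the range 0 to 1
    gen `var'_01 = (`var' - min`var') / (max`var' - min`var')
}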
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#2

19 Jun 2018, 00:31

You should rescale your data when in long form.
When your data are wide, each variable is rescaled separately, so x_w1 and x_w2 each get their own minimum and maximum, even though both are subsets of the single long x variable. In wide form, the highest value within each wave is set to 1; in long form, only the highest value in the combined set of values is set to 1.
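
To illustrate the point, here is a minimal sketch on made-up data. The id variable, the toy values, and the names rx_w1, rx_w2 and x_w_pooled are assumptions for illustration only and are not from the original post. It rescales within each wave in wide form, reshapes, and then rescales the pooled long variable so the two versions can be compared.

```stata
* Toy data in wide form; id and all values are hypothetical
clear
input id x_w1 x_w2
1 2 10
2 4 20
3 6 30
end

* Rescale within each wave (wide form): each wave gets its own min and max
foreach var of varlist x_w1 x_w2 {
    summarize `var', meanonly
    generate r`var' = (`var' - r(min)) / (r(max) - r(min))
}

* Stack the waves: x_w holds the original values, rx_w the per-wave rescaled ones
reshape long x_w rx_w, i(id) j(wave)

* Rescale in long form: one min and one max across both waves
summarize x_w, meanonly
generate x_w_pooled = (x_w - r(min)) / (r(max) - r(min))

* Compare both rescaled versions with the original long variable
correlate x_w rx_w x_w_pooled
```

The per-wave version applies a different linear transformation to each wave, so once the waves are stacked it is no longer a single linear function of x_w and its correlation with x_w falls below 1. The pooled version is one linear transformation of x_w, so it stays perfectly correlated with it and keeps differences in level between the waves.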
Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#3

19 Jun 2018, 06:30

Thanks much for your answer, Jorrit. Very useful!