I have appended five waves of data into one dataset. I had already created dummy variables in each wave separately before appending. One problem is that the answer options in the first three waves differ from those in the latter two. This is best explained by an example:
Respondents were asked to give an assessment of their health. In the first three waves the options were (with codes in parentheses): excellent (1), good (2), fair (3) and poor (4), whereas in the latter two waves the options were excellent (1), very good (2), good (3), fair (4) and poor (5). In the latter two waves I changed those who answered very good (2) to good (3). I realise that this may not be ideal and I welcome any better suggestions as to what to do. However, my problem now is that in the health variable column of my final dataset some 3s represent fair (in the first three waves) and some represent good (in waves 4 and 5). The same problem applies to the 4s, with some meaning poor and some fair, and I have an extra 5. As the dummy variables were created before appending the data, the dummies are correct for each individual. Is it necessary to recode the health variable so that everything matches? (I imagine it is.) I assume that I use something like: if wave=4, recode health 3=2 (etc.). Will doing this affect the dummy variables in any way? (I imagine it won't.)
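To make the recode I have in mind concrete, here is a minimal sketch, written in plain Python only for illustration; the same logic should carry over to whichever statistics package is used. The field names `wave`, `health` and the dummy `good` are assumptions, as is numbering the waves 1 to 5 with waves 4 and 5 on the five-point scale.

```python
# Hypothetical rows: wave number, stored health code, and a dummy that was
# created earlier within each wave (1 if the respondent's health is "good").
rows = [
    {"wave": 1, "health": 2, "good": 1},  # good on the 4-point scale
    {"wave": 2, "health": 3, "good": 0},  # fair on the 4-point scale
    {"wave": 4, "health": 3, "good": 1},  # good (includes former "very good")
    {"wave": 5, "health": 4, "good": 0},  # fair on the 5-point scale
    {"wave": 5, "health": 5, "good": 0},  # poor on the 5-point scale
]

LATE_WAVES = {4, 5}  # assumption: the waves that used the 5-point scale

# Map 5-point codes onto the 4-point codes used in the earlier waves.
# Code 2 (very good) is mapped to good as well, in case any were left over.
LATE_TO_COMMON = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4}


def harmonise_health(row):
    """Return a copy of the row with health on the common 4-point scale."""
    new_row = dict(row)
    if row["wave"] in LATE_WAVES:
        new_row["health"] = LATE_TO_COMMON[row["health"]]
    return new_row


harmonised = [harmonise_health(r) for r in rows]
```

In this sketch only the `health` field is rewritten. The dummy is a separate field that is never touched, so it keeps whatever value it had before the recode.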
A similar problem has occurred in my Socio-Economic Group variable. This one is more easily fixed: for individuals in group 4, some have the code 4 and some have 40 (the same applies to groups 5-13, which also appear as 50-130). This is easily fixed (I think), but again I was wondering whether recoding would have any bearing on my dummies?
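A minimal sketch of that fix, again in plain Python purely for illustration. It assumes that 13 is the highest valid group code, so any code above 13 that is a multiple of 10 is taken to be a group number multiplied by ten. The function name `harmonise_seg` and the example codes are hypothetical.

```python
MAX_GROUP = 13  # assumption: valid group codes run up to 13


def harmonise_seg(code):
    """Collapse codes such as 40 or 130 back to their group numbers 4 or 13."""
    if code > MAX_GROUP and code % 10 == 0:
        return code // 10
    return code  # already a plain group code (this includes group 10)


codes = [4, 40, 10, 100, 13, 130]  # hypothetical stored values
cleaned = [harmonise_seg(c) for c in codes]
```

Checking that the code is above 13 before dividing keeps a genuine group 10 from being turned into group 1.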
I apologise for the various sub-questions within this question and would welcome any advice.
(I am working on my undergraduate dissertation)
Thank you