  • recoding variables and the subsequent effect on my data

    I have appended 5 waves of data into one dataset. I had already created dummy variables in each wave separately. One problem I have is that in the first 3 waves the answer options were different from those in the latter 2 waves. This is best explained by an example:

    Respondents were asked to assess their health. In the first 3 waves the options were (with codes in parentheses) excellent(1), good(2), fair(3) and poor(4), whereas in the latter two waves the options were excellent(1), very good(2), good(3), fair(4) and poor(5). In the latter two waves I changed those who answered very good(2) to good(3). I realise that this may not be ideal and I welcome any better suggestions. However, my problem now is that in the health variable of my final dataset some 3s represent fair (in waves 0-3) and some represent good (in waves 4 and 5). The same problem applies to 4s, with some meaning poor and some fair, and I have an extra 5. As the dummy variables were created before appending the data, the dummies are correct for each individual. Is it necessary to recode the health variable so everything matches? (I imagine it is.) I assume that I use if wave=4: recode health 3=2 (etc). Will doing this affect the dummy variables in any way? (I imagine it won't.)

    A similar problem has occurred in my Socio Economic Group variable. This one is more easily fixed: for individuals in group 4, some have the code 4 and some have 40 (the same applies to 5-13=50/130). This is easily fixed (I think), but again I was wondering whether my recoding would have any bearing on my dummies.

    I apologise for the various sub questions within this question and would welcome any advice.
    (I am working on my undergraduate dissertation)

    Thank you

  • #2
    If I understand your questions, they're as follows:

    1. The response pattern for "health" changed from a 4-category to a 5-category Likert scale between the third and fourth wave. I am going to analyze my data by treating "health" as a 4-category Likert for all 5 waves, by collapsing "very good" and "good" into the same category. Are there better alternatives?
    2. I collapsed the original "health" variable and now my numeric categories are ambiguous. Can I salvage this using the "wave" variable?
    3. Will this affect previously generated dummy variables?
    4. My socioeconomic group variable has some trailing zeroes I need to get rid of.
    5. Will this affect previously generated dummy variables?

    I am going to skip your first question because I do not know enough about your data to answer it in anything but the most approximate sense (i.e. you are going to lose some information that way, you can't be completely sure that your two scales mean the same thing to your respondents even though they appear much the same, but if you want to compare across waves you're going to have to deal with it somehow).

    2. Yes. (I would suggest taking this as a lesson to generate new variables rather than modifying original ones, until you're pretty sure you've got your dataset where you want it.)
    Code:
    recode health (3=2) (4=3) (5=4) if wave>3 & wave<., gen(newhlth) copyrest
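    To see what a recode like this does before running it on the real data, it can be tried on a few made-up rows. This is only a sketch: the values below are hypothetical, and it assumes the later waves are numbered 4 and 5 and that "very good" has already been folded into code 3 there, as described in the question.

```stata
* Toy data (hypothetical values): health as currently stored, by wave
clear
input wave health
1 1
2 2
3 3
3 4
4 1
4 3
5 4
5 5
end

* Shift codes down in the later waves only. copyrest copies the
* earlier waves' values into newhlth instead of leaving them missing.
recode health (3=2) (4=3) (5=4) if wave > 3 & wave < ., gen(newhlth) copyrest

* Inspect the mapping in the later waves and confirm the range is 1-4
tabulate health newhlth if wave > 3
assert inrange(newhlth, 1, 4) if !missing(newhlth)
```

    The original health variable is left untouched, so the old and new codings can be cross-tabulated by wave at any point.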
    3 and 5. No. Stata is not Excel: a variable you generate does not update itself when you later change its parent variable(s). Stata can also generate dummy variables automatically from categorical variables, using the -xi- command. These do not update themselves either, with one exception: if you use -xi:- as a prefix to a regression command and specify i.varname in the model, Stata regenerates the dummies each time you run such a command. So if you change the values of myvar between xi: regressions, the existing dummies for i.myvar are overwritten the next time you run an xi: regression using i.myvar. Have a look through Stata's help files for "xi" and "fvvarlist" for more detail on this exception.
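    The point about dummies not updating can be checked on a small example. The sketch below uses made-up data and hypothetical variable names (hwork, is_code3); it works on a clone of health so the original stays intact.

```stata
* Toy data (hypothetical values)
clear
input wave health
1 1
1 3
4 3
4 4
end

* Work on a copy so the original variable is preserved
clonevar hwork = health

* Dummy built from the current coding: 1 where the code is 3
generate byte is_code3 = (hwork == 3)

* Recode the copy in place for the later waves only
recode hwork (3=2) (4=3) (5=4) if wave > 3 & wave < .

* The dummy still reflects the OLD coding: it has not been updated
assert is_code3 == 1 if wave == 4 & hwork == 2
assert is_code3 == 0 if wave == 4 & hwork == 3

* If a dummy matching the new coding is wanted, rebuild it explicitly
drop is_code3
generate byte is_code3 = (hwork == 3)
```

    Whether rebuilding is wanted depends on what the dummy is meant to capture. Dummies created within each wave before appending already carry each wave's own meaning, and recoding the parent variable leaves them as they were.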

    4. After you confirm that all your "10"s are really "1"s, see here: http://www.stata.com/support/faqs/da...railing-zeros/
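    As a rough illustration of the trailing-zero fix, here is a sketch on made-up data. The variable names seg and seg_clean are hypothetical, and it assumes the valid groups run from 1 to 13 (inferred from the question), so any multiple of 10 above 13 can only be an inflated code.

```stata
* Toy data (hypothetical values): a mix of clean and inflated codes
clear
input seg
4
40
5
50
13
130
10
end

* Keep the original and fix a copy. Only multiples of 10 above 13
* are divided; missing values fail the mod() test and are skipped.
generate seg_clean = seg
replace seg_clean = seg / 10 if seg > 13 & mod(seg, 10) == 0

* A stored 10 could be group 10 or an inflated group 1, so it is
* deliberately left alone and needs checking by hand.
tabulate seg seg_clean
assert inrange(seg_clean, 1, 13) if !missing(seg_clean)
```

    The ambiguous 10s are exactly the cases worth confirming against the original wave files before changing anything.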

    Hope that helps.
    Best regards,
    Jen Marino
    Research Fellow
    Gynaecology Research Centre
    University of Melbourne and Royal Women's Hospital
