  • Skip Patterns and Binary Variables

    Hello everyone,

    I am currently working with a data set based on a questionnaire. I would like to recode my race variable as a binary variable: white (1) versus nonwhite (0). However, the variable has a skip pattern and entries for "refused," "don't know," and missing data. I'm not sure of the best way to code the binary variable. I appreciate any suggestions!

    race

            sex |      Freq.     Percent        Cum.
    ------------+-----------------------------------
        skipped |        914       34.05       34.05
            man |        772       28.76       62.82
          woman |        647       24.11       86.92
        refused |         11        0.41       87.33
     don't know |          2        0.07       87.41
        missing |        338       12.59      100.00
    ------------+-----------------------------------
          Total |      2,684      100.00


  • #2
    Showing the tabulation results for a different variable from the one you are asking about isn't really helpful. It gives a few hints about how the data is organized, but not enough to go on.

    Please show an example of your data, using the -dataex- command. (If you are not using Stata version 15.1, run -ssc install dataex- to install the command.) Read -help dataex- for instructions. If you do that, you are likely to get a timely and helpful response.


    • #3
      I apologize for not double-checking my question. After running dataex, here is the information concerning my independent variable of interest. With the missing values and skip patterns, I'm not sure how to create a binary variable: (1) white versus (0) nonwhite. Thanks!
      input int fhlprace
      1
      2
      9
      2
      9
      2
      9
      2
      9
      2
      0
      9
      1
      2
      9
      9
      9
      9
      9
      9
      9
      2
      9
      2
      2
      2
      2
      9
      9
      2
      2
      9
      9
      9
      1
      9
      2
      9
      9
      9
      9
      0
      1
      9
      2
      2
      0
      9
      0
      0
      9
      0
      9
      0
      9
      9
      0
      9
      2
      9
      9
      9
      2
      9
      2
      2
      1
      2
      9
      9
      2
      2
      2
      2
      2
      9
      2
      2
      2
      9
      2
      9
      2
      1
      2
      9
      9
      2
      2
      9
      2
      9
      1
      9
      9
      9
      1
      0
      1
      1
      end
      label values fhlprace fhlprace
      label def fhlprace 0 "skipped", modify
      label def fhlprace 1 "white, not hispanic", modify
      label def fhlprace 2 "black, not hispanic", modify
      label def fhlprace 9 "missing", modify


      • #4
        Code:
        gen is_white = 1.fhlprace
        is a start. I'm assuming that by "white" you mean "white, not Hispanic." (If not, your question is ill-posed, because a Hispanic white respondent would have no valid response to this question and would probably end up coded as 9 "missing," which you would be unable to distinguish from other causes of a "missing" response.)

        Now, there is also the question of how you want to handle those who responded "skipped" or "missing." The code shown above codes them as 0. But if you prefer to code them as missing, add the following:

        Code:
        replace is_white = . if inlist(fhlprace, 0, 9)
        Finally, if you are going to use the variable fhlprace for later analyses, you should undo the encoding of "skipped" and "missing" as 0 and 9: those are errors waiting to happen. Unlike some other software packages, Stata has no provision for treating certain numbers as codes for missing values. You have to use an actual missing value, or Stata will treat the 0 and 9 as real numbers and mess up calculations and tabulations as a result. So you will want to change those to Stata missing values. See -help mvdecode- for the various possibilities Stata affords you.
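
        As a minimal sketch of one way this might look: the example below uses a few made-up observations in place of the real data, and the choice of .a for "skipped" and .b for "missing" is only an assumption for illustration. It recodes the 0 and 9 codes to extended missing values first, so the indicator can then be built without listing those codes again.

        ```stata
        * Toy data standing in for the real variable (values are illustrative only)
        clear
        input int fhlprace
        1
        2
        9
        0
        2
        1
        end
        label define fhlprace 0 "skipped" 1 "white, not hispanic" 2 "black, not hispanic" 9 "missing"
        label values fhlprace fhlprace

        * Turn the numeric codes into extended missing values
        * (.a and .b are arbitrary choices; any of .a-.z would do)
        mvdecode fhlprace, mv(0=.a \ 9=.b)
        label define fhlprace .a "skipped" .b "missing", add

        * 1 = white, 0 = any other valid response, missing where fhlprace is missing
        generate byte is_white = (fhlprace == 1) if !missing(fhlprace)

        * Check the result, including the missing categories
        tabulate fhlprace is_white, missing
        ```

        Keeping "skipped" and "missing" as separate extended missing values means the two reasons for nonresponse stay distinguishable in tabulations, while Stata still excludes both from calculations.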


        • #5
          Thank you very much, Clyde. This is extremely helpful. Skip patterns/logic are tough to get around.

          Best,
          Tom
