  • Generating a new binary dependent variable from a categorical variable

    I'm trying to generate a new binary variable for my DV, using the GSS2010 data, but I can't figure out how to get 1 for exciting and 0 for routine or dull. The variable is "life" and its outcomes are "exciting," "routine," and "dull." I want to generate a new variable called "excitinglife" where "exciting" = 1 and "routine" or "dull" = 0. I also want to exclude missing values. I've run into many error messages, in particular a type mismatch on the life variable. What commands should I be using to create this dummy variable?

  • #2
    For more context, here's what I tried most recently, and the errors I'm getting:

    . tab life

    is life exciting or dull |      Freq.     Percent        Cum.
    -------------------------+-----------------------------------
                    exciting |        639       50.47       50.47
                     routine |        561       44.31       94.79
                        dull |         66        5.21      100.00
    -------------------------+-----------------------------------
                       Total |      1,266      100.00

    . gen excitinglife = 1 if life==exciting if !missing(life)
    exciting not found
    r(111);

    . gen excitinglife = 1 if (life==exciting) if !missing(life)
    exciting not found
    r(111);

    . gen excitinglife = 1 if (life=="exciting") if !missing(life)
    type mismatch
    r(109);

    • #3
      On the assumption that your variable life is a string variable:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str8 life
      "exciting"
      "routine"
      "dull"
      ""  
      end
      
      gen byte wanted = (life == "exciting") if !missing(life)
      Note: If life is in fact a value-labeled numeric variable, then this code will fail. To avoid guesswork leading to both of us wasting our time, in the future please show example data when requesting help with code. And always use the -dataex- command to do that, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      Added: Crossed with #2. Given the failure of the third line of code she tried, I can conclude that variable life is, in fact, not a string variable in her data set. But, as we still have no example data, we don't know what the value label attached to it is. She can find that out by running -des life-. But, it is possible, though more complicated, to solve this even without that information.
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte life
      2
      3
      1
      .
      end
      label values life life
      label def life 1 "dull", modify
      label def life 2 "exciting", modify
      label def life 3 "routine", modify
      
      gen byte wanted = (life == "exciting":`:val label life') if !missing(life)
      This handles the case where life is a value-labeled numeric variable and the name of the value label is unknown.
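
      To see what that expression does, it may help to take it apart. This is only a sketch, assuming the example data above is in memory; the local macro name lbl is purely illustrative and not something from the original data.
      Code:
      * lbl is a hypothetical macro name, used here only for illustration
      * store the name of the value label attached to life
      local lbl : value label life
      display "`lbl'"

      * "exciting":`lbl' evaluates to the numeric code labeled "exciting"
      count if life == "exciting":`lbl'
      The extended macro function returns the name of the value label, and the "text":labelname construct looks up the numeric code that the label maps to that text. If life had no value label attached, the macro would be empty and the expression would not be valid, so this approach assumes a label is present.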

      Of course, since Jennifer herself can find out the name of the value label on variable life, by running -des life-, she can do it a bit more simply. If her value label, as in my example, is named life:
      Code:
      gen byte wanted = (life == "exciting":life) if !missing(life)
      Remember that what appears after the colon should be the name of the value label, not the name of the variable. In this example those are the same, as is often the case in practice. But they can be different, and when they are, the value label name, not the variable name, must follow the colon.
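
      A quick way to check the result, sketched under the assumption that wanted was created by one of the commands above (the macro name lbl is again just illustrative):
      Code:
      * show the storage type of life and the name of its value label
      describe life

      * list the code-to-text mapping of whatever label is attached to life
      local lbl : value label life
      label list `lbl'

      * cross-tabulate the original against the new variable, missings included
      tabulate life wanted, missing

      * wanted should be missing exactly where life is missing
      assert missing(wanted) == missing(life)
      If the cross-tabulation shows 1 only in the "exciting" row and 0 in the other two, and the assert passes silently, the dummy was built as intended.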
      Last edited by Clyde Schechter; 21 Apr 2024, 12:11.
