  • Pool binary Variables

    Hey there,

    I have the following problem: I have 5 dummy variables (1/0) that I want to combine into a single new variable. The new variable should be 1 for every observation in which at least one of the dummy variables equals one. So far I have generated a new variable that is set to 1 whenever any of the other variables equals 1. The problem I am now facing is that if, for example, two of the five dummy variables equal one in the same observation, my new variable counts that observation only once. As a result, the number of observations equal to 1 in the new variable is smaller than the total you get by adding up the ones in the five dummy variables with a calculator.

    Here is the code I have used so far:

    Code:
    gen NEWVAR = VAR1 ==1 | VAR2 ==1 | VAR3 ==1 | VAR4 ==1 | VAR5 ==1

    This may not be that hard to solve, but as a greenhorn I'm not quite sure how to deal with this problem. Thank you for your answer!
    Best regards

  • #2
    I find your question very unclear but here is a guess:
    Code:
    gen byte NEWVAR=max(VAR1,VAR2,VAR3,VAR4,VAR5)==1
    Note that max() ignores missing values unless all of VAR1-VAR5 are missing; in that case max() returns missing and NEWVAR is set to 0.

    if this is not what you want, please read the FAQ and clarify what you are looking for (e.g., example data with the desired output - all in code blocks)
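
    To illustrate the missing-value behavior mentioned above, here is a minimal sketch on made-up toy data (the values and the name NEWVAR2 are hypothetical, and only three dummies are used to keep it short). The second gen shows one possible way to keep the result missing when every dummy is missing, in case that is preferred over 0:

    ```stata
    * Hypothetical toy data, not from the original post
    clear
    input VAR1 VAR2 VAR3
    1 1 0
    0 0 0
    . 1 .
    . . .
    end

    * max() skips missing arguments; it returns missing only if all are missing
    gen byte NEWVAR = max(VAR1, VAR2, VAR3) == 1

    * Assumed variant: leave the result missing when all dummies are missing
    gen byte NEWVAR2 = max(VAR1, VAR2, VAR3) == 1 if !missing(max(VAR1, VAR2, VAR3))

    list
    ```

    In the last observation NEWVAR is 0 while NEWVAR2 stays missing; which of the two is appropriate depends on how missing dummies should be interpreted.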

    • #3
      Hi Leo,

      Yes, if you want a "new variable should show all the observations as a 1 if one of the Dummy Variables is equal to one", then your new variable will be an indicator variable (usually coded as 1/0). Your new variable does not "consider one of the observations to be one"; it only takes one of your dummy variables being == 1 for your new variable to be == 1, no matter how many of the others are also == 1.

      If knowing how many of your dummies are == 1 is important, I suggest code such as:

      Code:
      tab1 VAR1 VAR2 VAR3 VAR4 VAR5, m
      *check the 5 tables, all VAR* should only be 0/1 and no missing values (.) should be present
      
      gen amountof1 = VAR1 + VAR2 + VAR3 + VAR4 + VAR5
      gen newvar = 1 if amountof1 >=1
      replace newvar = 0 if amountof1 == 0

      This creates a new variable (amountof1) that sums all VAR*, showing how many of them are 1 in each observation. It only works properly if the VAR* variables are coded 0/1 and have no missing values. It also creates your "newvar" and assigns 1 to it if at least one of the VAR* is == 1, and 0 if none is == 1.
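
      As a rough sketch of why the two counts in the original question differ, here is a toy example with made-up data (the values and the names n_ones and anyone are hypothetical). It uses egen's rowtotal() and anymatch() as an alternative to summing by hand, and assumes VAR1-VAR5 are stored next to each other in the dataset so the varlist VAR1-VAR5 works:

      ```stata
      * Hypothetical toy data, not from the original post
      clear
      input VAR1 VAR2 VAR3 VAR4 VAR5
      1 1 0 0 0
      0 0 0 0 0
      0 0 1 0 0
      1 0 0 1 1
      end

      * Number of dummies equal to 1 per observation (rowtotal() treats missing as 0)
      egen n_ones = rowtotal(VAR1-VAR5)

      * 1 if at least one dummy equals 1, otherwise 0
      egen byte anyone = anymatch(VAR1-VAR5), values(1)

      * The expression from the first post yields the same indicator
      gen byte NEWVAR = VAR1 == 1 | VAR2 == 1 | VAR3 == 1 | VAR4 == 1 | VAR5 == 1
      assert NEWVAR == anyone

      * Observations with n_ones > 1 account for the gap between the two counts
      tabulate n_ones anyone
      ```

      Here the five dummies contain six ones in total, but only three observations have at least one dummy equal to 1, so the indicator counts three. The difference comes entirely from observations where several dummies are 1 at once.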
