  • Multiple conditions within loop

    Hello, Stata Users!

    My question might seem too simple, but I am struggling to generate new variables subject to 2 conditions. When I run the following code, I get variables that satisfy only the first condition, i.e. Stata does not assign the value 0 for the second condition, but I need both conditions to be applied:


    foreach i of numlist 2/11 {
    if b4_`i'==2 & b4_`i-1'==2 {
    gen sis_bro`i'=1}
    else if b4_`i'==2 & b4_`i-1'==1 {
    replace sis_bro`i'=0}}
    }

    PS: I am generating new dummy variables within families where 1=female having a sister; 0=female having a brother. I already have b4_* variables indicating the gender of kids within families.

    Would appreciate any help!

  • #2
    You are confusing the -if- command (which you are, incorrectly, using) with the -if- condition (which is what you actually need here). This is a natural instinct for people who are experienced with other programming languages that do not have this distinction. The code you need is:

    Code:
    foreach i of numlist 2/11 {
        gen sis_bro`i'=1 if b4_`i'==2 & b4_`=`i'-1'==2
        replace sis_bro`i'=0 if b4_`i'==2 & b4_`=`i'-1'==1
    }
    The -if- condition, shown in my code here, is used to condition the execution of a command on relationships among the variables in the data. The -if- command which you used in #1 is used to condition execution of a block of commands on a global condition of the Stata environment. In particular, in an -if- command you cannot refer to variables as a whole. Since b4_`i' is a whole variable, and cannot be interpreted as such in the -if- command, Stata defaults to interpreting the reference to b4_`i' as the value of b4_`i' in the first observation. So what your code did is determine whether or not b4_`i'[1] == 2 & b4_`=`i'-1'[1] == 2, and then applied the resulting decision to the entire data set.

    Read -help ifcmd- for more information.
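
    To make the distinction concrete, here is a minimal sketch on a made-up three-observation data set. The names x, flag_cmd and flag_qual are hypothetical and chosen only for illustration; they are not from the original question.

    ```stata
    clear
    input x
    1
    2
    2
    end

    // -if- command: x is read as x[1] only, so the decision is global.
    // Here x[1] == 1, so the condition is false and the block never runs.
    if x == 2 {
        gen flag_cmd = 1
    }
    capture confirm variable flag_cmd
    display _rc    // nonzero return code: flag_cmd was never created

    // -if- condition: the test is applied observation by observation.
    gen flag_qual = 1 if x == 2
    list
    ```

    In the listing, flag_qual should be missing in the first observation and 1 in the other two, whereas the -if- command made a single all-or-nothing decision based on the first observation alone.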



    • #3
      Dear Clyde,

      Thank you for your reply and for clarifying the distinction between the -if- command and the -if- condition.
      Below, I have attached pictures showing what the data look like. In photo 1), there is a set of variables b4_*, giving the sex of the kids within a family in ascending order, i.e. starting with the youngest kid. What I want is to generate a set of corresponding variables for female kids indicating whether the next-youngest sibling is a sister or a brother. The code you posted above yielded the result depicted in photo 2). I am sure I made a mistake with my brackets or something else above, but I have been struggling to work out exactly what it is. Ideally, the newly generated variables would be missing if the kid is male, 1 if a female kid's next sibling is female, and 0 if a female kid's next sibling is male. So there will be no values for the youngest kids; the new variables start from b4_2, for example (from the first row) sis_bro2 = 0 (female kid has a male sibling), . (kid is male), 0 (female kid has a male sibling) and so forth.
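
      The rule described above can be sketched on a made-up data set of three families with only three b4_* variables, so the loop runs over 2/3 rather than 2/11. This is an illustration under assumptions, not a confirmed solution: it assumes that 1 codes male and 2 codes female, and the local macro j is introduced here only for readability.

      ```stata
      clear
      input b4_1 b4_2 b4_3
      1 2 1
      2 2 2
      1 1 2
      end

      forvalues i = 2/3 {
          local j = `i' - 1
          // 1 if the next-youngest sibling is female, 0 if male.
          // The -if- condition leaves the result missing when kid i is
          // male or when the sibling's value is neither 1 nor 2.
          gen sis_bro`i' = (b4_`j' == 2) if b4_`i' == 2 & inlist(b4_`j', 1, 2)
      }
      list
      ```

      With these toy values, the first family gets sis_bro2 = 0 and a missing sis_bro3, the second gets 1 for both, and the third gets a missing sis_bro2 and sis_bro3 = 0.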
      Last edited by Farogat WIUT; 18 Nov 2020, 03:54.



      • #4
        Hello All,

        Please could you help me?

        Could you help me identify what's not working in the code below, please? What I'm trying to do is find the second value after the minimum in a set of 7 variables, while also considering conditions regarding other variables. When executing this code, it often happens that the two variables created have the same values, and the condition that the second variable should be greater than the first is not met.

        Here is the code:

        gen Var1 =.y
        forvalues i=1/7 {
        replace Var1 = ACHB`i'C if ACHB`i'C < Var1 & ~missing(ACHB`i'C) & RCI30_`i'== 1
        }

        gen Var2 =.y
        forvalues i=1/7 {
        replace Var2 = ACHB`i'C if (ACHB`i'C > Var1 & ACHB`i'C < Var2) & ~missing(ACHB`i'C) & (RCI30_`i'== 1)
        }

        Thank you for your assistance.



        • #5
          There is nothing obviously wrong with this code. I suspect the problem is with your data. I tried the code on a toy data set containing variables that might be similar to yours. Even with 250,000 observations, the only instances where Var1 < Var2 did not hold were those observations where all 7 values of RCI30_* are 0. These are cases where no value is ever selected for Var1 or Var2, so both variables remain equal at .y.

          Code:
          . clear*
          
          . set obs 250000
          Number of observations (_N) was 0, now 250,000.
          
          . set seed 1234
          
          . forvalues i = 1/7 {
            2.         gen ACHB`i'C = runiformint(0, 20)
            3.         gen RCI30_`i' = runiformint(0, 1)
            4. }
          
          .
          .
          .
          . gen Var1 =.y
          (250,000 missing values generated)
          
          . forvalues i=1/7 {
            2. replace Var1 = ACHB`i'C if ACHB`i'C < Var1 & ~missing(ACHB`i'C) & RCI30_`i'== 1
            3. }
          (124,728 real changes made)
          (92,607 real changes made)
          (70,366 real changes made)
          (55,910 real changes made)
          (45,701 real changes made)
          (38,168 real changes made)
          (32,547 real changes made)
          
          .
          . gen Var2 =.y
          (250,000 missing values generated)
          
          . forvalues i=1/7 {
            2. replace Var2 = ACHB`i'C if (ACHB`i'C > Var1 & ACHB`i'C < Var2) & ~missing(ACHB`i'C) & (RCI30_`i'== 1)
            3. }
          (86,349 real changes made)
          (70,628 real changes made)
          (57,534 real changes made)
          (48,508 real changes made)
          (40,969 real changes made)
          (35,252 real changes made)
          (30,678 real changes made)
          
          .
          . egen max_RCI30 = rowmax(RCI30*)
          
          . assert max_RCI30 == 0 if !(Var1 < Var2)
          
          .
          So, one possibility is that these are the only cases in your data where Var1 < Var2 fails. If so, I don't know what you would want to do about that, as the result seems correct as it is.

          Or, perhaps you are using floating point values for the ACHB*C variables, rather than integer values as in my example. In that situation, it may be that the lowest and second lowest values of the ACHB*C variables (restricted to those with the corresponding RCI30 == 1) are almost exactly the same, differing only in far-out decimal places. In that case Var1 < Var2 actually is true, but because you are looking at the data in a -list- or -browse- context, you may be seeing a smaller number of decimal places that does not show the difference, even though it is there.
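
          A minimal sketch of how that can happen, using two hypothetical variables a and b (not from your data) that differ only in the last binary digits:

          ```stata
          clear
          set obs 1
          gen double a = 0.3
          gen double b = 0.1 + 0.2    // not exactly 0.3 in double precision

          list a b                    // default display format shows both as .3
          assert a < b                // yet the strict inequality holds

          format a b %21.0g           // widen the display format
          list a b                    // the difference is now visible
          ```

          In your data, the analogous check would be to apply a wider display format to Var1 and Var2 and list the observations where they look equal.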

          All of that said, I wouldn't use this approach to the problem in the first place. Here's how I would do it:

          Code:
          //    PREPARE A NEW FRAME FOR CALCULATION
          gen `c(obs_t)' obs_no = _n
          frame put obs_no ACHB*C RCI30_*, into(working)
          
          frame change working
          reshape long ACHB@C RCI30_, i(obs_no)
          drop _j    // NOT NEEDED
          drop if RCI30_ != 1    // EXCLUDED FROM SEARCH FOR VAR1, VAR2
          duplicates drop // ELIMINATE TIED VALUES OF ACHBC
          
          by obs_no (ACHBC), sort: gen Var1 = ACHBC[1] // LOWEST ACHBC VALUE
          by obs_no (ACHBC): gen Var2 = ACHBC[2] // SECOND LOWEST ACHBC VALUE
          by obs_no (ACHBC): keep if _n == 1
          
          
          //    RETRIEVE THE RESULTS
          frame change default
          frlink 1:1 obs_no, frame(working)
          frget Var*, from(working)
          drop working
          frame drop working
          
          assert Var1 < Var2 if !missing(Var1, Var2)
          Added: If my advice here does not resolve your problem, please be sure to post back with example data that demonstrates the difficulties you are having. Use the -dataex- command to do that. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
          Last edited by Clyde Schechter; 10 Apr 2024, 12:16.



          • #6
            Thank you, Mr Schechter, for your valuable help. I checked my data set, and I can see that where Var1 = Var2, the 7 values of RCI30_* are not necessarily all equal to 0 or missing.

            Please see some observations from my dataset:

            ACHB1C | ACHB2C | ACHB3C | ACHB4C | ACHB5C | ACHB6C | ACHB7C | RCI30_1 | RCI30_2 | RCI30_3 | RCI30_4 | RCI30_5 | RCI30_6 | RCI30_7
            23,4 | 27,6 | 33,4 | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable
            23,3 | 26,3 | 30,4 | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable
            22,9 | 25,2 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            18 years or younger | 19,6 | 23,6 | 28,8 | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Biological child | Not applicable | Not applicable | Not applicable
            30,4 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Stepchild | Stepchild | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Stepchild | Stepchild | Stepchild | Stepchild | Not applicable | Not applicable | Not applicable
            24,4 | 27,4 | 28,5 | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Stepchild | Not applicable | Not applicable | Not applicable
            24,8 | 27,1 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            25,1 | 26,9 | 28,3 | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            22,1 | 23,8 | 24,7 | 26,4 | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Biological child | Not applicable | Not applicable | Not applicable
            Knows nothing about the child | Knows nothing about the child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            22,3 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            25,5 | 24,8 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            29,1 | 31,6 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            28,2 | 32,4 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            Not applicable | Not applicable | 29,9 | Not applicable | Not applicable | Not applicable | Not applicable | Stepchild | Stepchild | Biological child | Not applicable | Not applicable | Not applicable | Not applicable
            34,1 | 35,4 | 38,8 | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable
            25,7 | 30,3 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            28,3 | 32,6 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable
            29,8 | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Biological child | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable



            • #7
              Sorry, I'll paste the data again



              • #8
                AGEparite is the same as Var


                • #9
                  Also, please note that the ACHB*C variables represent the mother's age at each birth, so they cannot be almost exactly the same except in the case of twin births, and we don't have any such cases.



                  • #10
                    Sorry if my English is not very good; I am not a native English speaker.
