  • foreach loop ttest, new

    Dear Statalist Community,

    I am having difficulty with the commands for a foreach loop to run multiple t-tests. I keep getting an error that variable var is not found.
    Here is my code:

    global myvars1 balance_ind fullness_ind tif_ind gan_ind earthy_ind sweet_ind ///
    sour_ind bitter_ind mouthfeel_ind others_ind aftertaste_ind balance_grp ///
    fullness_grp tif_grp gan_grp earthy_grp sweet_grp sour_grp bitter_grp ///
    mouthfeel_grp others_grp aftertaste_grp aftertaste5_grp aftertaste10_grp ///
    aftertaste15_grp aftertaste20_grp gan5_grp gan10_grp gan15_grp gan20_grp ///
    aftertaste3_grp aftertaste8_grp aftertaste13_grp aftertaste18_grp aftertaste23_grp ///
    gan3_grp gan8_grp gan13_grp gan18_grp gan23_grp mf3_grp mf8_grp mf13_grp mf18_grp mf23_grp
    display "$myvars1"
    foreach var in $myvars1 {
    ttest var, by(spring)
    matrix means=nullmat(means), r(table)
    }

    I have also run the same command with "" around $myvars1 ("$myvars1"), which resulted in the same error. [variable var not found
    r(111)] I was wondering if anyone had any advice? Thank you!

    Best,

    Hannah

  • #2
    your syntax is incorrect; if you look at the help file, you will see that you need to start:
    Code:
    foreach var of varlist $myvars1 {
    etc.

    as an aside, I note that generally using global macros is not a good idea; is there some reason you are using a global instead of a local?

    • #3
      Hi Rich, thank you for your response! Unfortunately I still get the same error after correcting the syntax. I am using global so I can run different commands without losing the varlist. Is there a way to do this with local? Thank you!

      • #4
        The code has errors in several places. It's not entirely clear which of them is provoking this error message, but when you fix that you will just stumble into the others in turn.

        Let's leave the local-global issue aside for now. Here are some corrections to your code:

        Code:
        global myvars1 balance_ind fullness_ind tif_ind gan_ind earthy_ind sweet_ind ///
        sour_ind bitter_ind mouthfeel_ind others_ind aftertaste_ind balance_grp ///
        fullness_grp tif_grp gan_grp earthy_grp sweet_grp sour_grp bitter_grp ///
        mouthfeel_grp others_grp aftertaste_grp aftertaste5_grp aftertaste10_grp ///
        aftertaste15_grp aftertaste20_grp gan5_grp gan10_grp gan15_grp gan20_grp ///
        aftertaste3_grp aftertaste8_grp aftertaste13_grp aftertaste18_grp aftertaste23_grp ///
        gan3_grp gan8_grp gan13_grp gan18_grp gan23_grp mf3_grp mf8_grp mf13_grp mf18_grp mf23_grp
        
        foreach var of varlist $myvars1 {
            ttest `var', by(spring)
            // DO SOMETHING ELSE HERE
        }
        My best guess is that the code was tripping up because you had var without the surrounding macro-dereferencing quotes in your -ttest- command. That caused Stata to try to perform a t-test on a variable named var, rather than on a variable named balance_ind, or fullness_ind, etc. Those quotes are not optional. Pay attention also to the quote that precedes var in the -ttest- command. It is not an ordinary quote. It is the special downsloping quote (the backtick) that, on a US keyboard, is found to the left of the 1! key.
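
A minimal sketch of that difference, assuming the auto dataset that ships with Stata (its variables price and mpg stand in for the tea-rating variables in this thread):

```stata
sysuse auto, clear

foreach var of varlist price mpg {
    // without the macro quotes, "var" is just the three letters v-a-r
    display "unquoted: var"
    // with the macro quotes, Stata substitutes the current variable name
    display "dereferenced: `var'"
    summarize `var'
}
```

Typing summarize var inside the loop instead would make Stata look for a variable literally named var, which is the r(111) error reported in #1.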

        The use of -foreach var in $myvars1 {- is permissible. But it's less desirable. By using -foreach var of varlist $myvars1-, Stata will first check the contents of global macro myvars1 and verify that each entry there is in fact the name of a variable. If it finds something that isn't, it will stop and give you a specific error message. You can then fix the code defining myvars1. Good coding practice is to have your code fail as early after a problem arises as possible, leaving you that much less to undo as you fix it.
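
The early-failure point can be sketched as follows, again assuming the auto dataset; notavariable is a made-up name that deliberately does not exist in it:

```stata
sysuse auto, clear

// hypothetical list containing one name that is not a variable in auto
local badlist price mpg notavariable

// "in" treats the entries as plain words, so the loop starts and
// only fails when a command inside it reaches the bad name
capture noisily {
    foreach v in `badlist' {
        summarize `v'
    }
}
display "return code with in: " _rc

// "of varlist" checks the whole list first, so nothing in the body runs
capture noisily {
    foreach v of varlist `badlist' {
        summarize `v'
    }
}
display "return code with of varlist: " _rc
```

With -in-, price and mpg are summarized before the error appears; with -of varlist-, the loop stops before doing any work.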

        Now, I put // DO SOMETHING ELSE HERE inside the loop because your -matrix means- command makes no sense. It makes no sense because -ttest- doesn't leave behind any r(table). So your matrix means is just going to be empty at the end of the loop. Since r(table) doesn't actually exist, I don't know what you're actually trying to accomplish here. If all you want to do is calculate the means of all these variables, disaggregated by the dichotomous variable spring, there are better ways to do that. But I'm not going to go there since I'm not sure that's what you really want to do. Moreover, even if you want to accumulate those means, putting them into a matrix may not be a particularly useful way to do it. So if you explain what you're actually trying to calculate and what you plan to do with it once you have it, you can probably get better code.
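
To see what -ttest- does leave behind, -return list- can be run right after it. A short sketch, assuming the auto dataset with foreign playing the role of spring:

```stata
sysuse auto, clear

ttest mpg, by(foreign)

// show every stored result; they are scalars such as r(mu_1), r(N_1), r(t), r(p)
return list

display "group 1 mean: " r(mu_1) "   group 2 mean: " r(mu_2)
display "two-sided p-value: " r(p)
```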

        Now let's discuss local and global macros. Global macros are inherently unsafe as a coding practice and should be used only as a last resort, when no alternative is available. Holding a list of variables is not one of those situations. I have been using Stata daily since 1994 and in that entire time have found a need to use global macros only twice. That's how uncommon it is.

        From what you wrote, I'm inferring that you prefer global macros so that you can run your code one line at a time without losing the variables in the varlist. My response to that is that you shouldn't be running code one line at a time like that. Place your local macros in the code close to where they are first used. Then run the whole block from the definition of the local macro to the place where it is used. If there is a lot of intervening code that is time-consuming to run repeatedly, you can comment it out while you are testing your code, and then un-comment it once you're satisfied.

        If that seems like a big rigmarole, let me just tell you that if you even once get bitten by a bug caused by a clash between global macro names in different programs, you will find it an experience that you never want to repeat as long as you live because it is so frustrating to chase that down and fix it. That angst and the time you lose then will outweigh a lifetime of the slight inconvenience associated with using local macros. Trust me, I learned that the hard way (in another language that had a similar trap, before I started using Stata).
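
To make the clash concrete, here is a small sketch. The program name helper and the macro name vars are both hypothetical, chosen only for illustration; run it as one block from a do-file:

```stata
capture program drop helper
program define helper
    // the program sets its own macros, unaware of the caller's
    global vars "something else"
    local  vars "something else"
end

global vars price mpg
local  vars price mpg

helper

// the global was silently overwritten by the program
display "global after the call: $vars"
// the caller's local is untouched, because locals are private to each program
display "local after the call:  `vars'"

macro drop vars
```

After the call, the global holds "something else" while the local still holds "price mpg".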

        • #5
          Dear Clyde,

          Thank you for all your help! I have switched the code over to local; thank you for your helpful warning on the difficulties of global! What I am trying to do is test whether the mean ranking of each 'spring' tea attribute is statistically different from the mean ranking of these attributes for the 'monsoon season' tea, and then export that information to an Excel sheet. Would the code be similar to something like this?

          foreach var of varlist `myvars1' {
          ttest `var', by(spring)
          matrix ttest= (r(mu_1), r(N_1), r(mu_2), r(N_2), etc., r(p))
          matrix rownames ttest= `var'
          matrix colnames ttest= mean1 N1 mean2 N2 etc., pscore
          mat2txt, matrix(ttest) sav(""/Users/hannahkitchel/Desktop/RA with Sean/setmeans.xls"), append
          }

          • #6
            That is one way to do it, assuming mat2txt (not an official Stata command, and I'm not familiar with it) exports to .xls files.

            Another way that might be simpler, and avoids having to store anything in a matrix as an intermediary, would be to use -putexcel- after your -ttest- command. The -putexcel- approach has the drawback that you would have to track and update which rows or columns of the spreadsheet you are writing to each time through the loop.
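
A rough sketch of the -putexcel- route with a row counter, assuming the Stata 14 or later syntax (-putexcel set- followed by one cell per command), the auto dataset in place of the tea data, and a hypothetical output file name:

```stata
sysuse auto, clear

// stand-ins for the variables in myvars1
local myvars price mpg weight

// setmeans_sketch.xlsx is a hypothetical file name
putexcel set setmeans_sketch.xlsx, replace
putexcel A1 = "variable"
putexcel B1 = "mu1"
putexcel C1 = "mu2"
putexcel D1 = "p"

// first data row, just below the header
local row = 2
foreach var of varlist `myvars' {
    quietly ttest `var', by(foreign)
    // copy the results to locals before any other command can touch r()
    local mu1 = r(mu_1)
    local mu2 = r(mu_2)
    local p   = r(p)
    putexcel A`row' = "`var'"
    putexcel B`row' = `mu1'
    putexcel C`row' = `mu2'
    putexcel D`row' = `p'
    // advance so the next variable gets its own row
    local ++row
}
```

The row counter is the bookkeeping mentioned above: forget to increment it and every variable overwrites the same row.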

            Yet another way would be to set up a -postfile- and -post- the -ttest- results during each iteration of the loop, then -export excel- the posted results after the loop is done. Something like this:

            Code:
            capture postutil clear
            tempfile results
            postfile handle str32 variable float(n1 mu1 n2 mu2 t p) using `results'
            
            foreach var of varlist `myvars1' {
                ttest `var', by(spring)
                post handle ("`var'") (`r(N_1)') (`r(mu_1)') (`r(N_2)') (`r(mu_2)') (`r(t)') (`r(p)')
            }
            postclose handle
            
            use `results', clear
            export excel using setmeans.xls, firstrow(variables) replace
            Added: Not tested, beware of typos, etc.
            Last edited by Clyde Schechter; 01 May 2018, 10:02.

            • #7
              How can Clyde Schechter's code above be adjusted such that the exported results only report 3 decimal places for mu1 mu2 t and p?

              • #8
                #7 seems best answered by applying some switch in MS Excel or other software that works with such files to affect what Stata calls display format. Sorry, but I have no idea what that might be.

                Wanting to round to multiples of 0.001 is fraught, as computers work in binary, not decimal.
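
A few lines that illustrate the binary issue; the numbers are arbitrary and nothing here depends on the thread's data:

```stata
// 0.1 has no exact binary representation, so the stored value is only close to it
display %21.18f 0.1

// a float holds fewer binary digits than a double, so the two versions of 0.1 differ
assert float(0.1) != 0.1

// round() returns the nearest representable number, not an exact decimal
display %21.18f round(0.12345, 0.001)
```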

                • #9
                  Change the line:

                  post handle ("`var'") (`r(N_1)') (`r(mu_1)') (`r(N_2)') (`r(mu_2)') (`r(t)') (`r(p)')
                  to

                  Code:
                  post handle ("`var'") (`r(N_1)') (`:di %9.3f r(mu_1)') (`r(N_2)') (`:di %9.3f r(mu_2)') (`:di %9.3f r(t)') (`:di %9.3f r(p)')

                  • #10
                    Although Andrew Musau answered the question nicely and gave what you asked for, watch out. For example, with his solution P-values less than 0.0005 will be shown as 0.000.

                    That is what you asked for, but downstream of this someone -- supervisor, examiner, reviewer -- may want to see more significant figures. Then at worst you have to run the Stata code all over again.
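
One way to keep that option open, sketched under the assumption that three decimals are needed only for presentation: store the unrounded values and change the display format instead. The p-value below is made up, and whether a Stata display format carries over into the Excel file is a separate question not settled here:

```stata
clear
set obs 1

// hypothetical p-value, stored in full precision
generate double p = 0.00012345

// the format changes only how Stata shows the value
format p %9.3f
list p

// the stored value is untouched, so more digits can be shown later
assert p > 0 & p < 0.0005
format p %12.8f
list p
```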

                    • #11
                      Amazing! Thanks so much, Andrew, you made my day. And Nick, you are right. One needs to be cautious. May I also ask how the code can be changed if I wanted to run t-tests for 3 groups and have four extra columns in the Excel file with values for n3, mu3, t3 and p3 for the extra group? That is, compare group 1 means with group 2, and group 1 with group 3? I tried the following, which does not work. I obviously need to append results but can't figure out how.

                      Code:
                      gen group2= if damage=0
                      replace group2=1 if damage==1
                      gen group3= if damage=0
                      replace group3=1 if damage==2
                      Code:
                      capture postutil clear
                      tempfile results
                      postfile handle str32 variable float (n1 mu1 mu2 t p n3 mu3 t3 p3) using `results'
                      
                      foreach var of varlist `myvars1' {
                               ttest `var', by (group2)
                               post handle ("`var'") (`r(N_1') (`r(mu_1)') (`r(N_2') (`r(mu_2)') (`r(t)') (`r(p)')
                               ttest `var', by (group3)
                               post handle ("`var'") (`r(N_3') (`r(mu_3)') (`r(mu_2)') (`r(t3)') (`r(p3)')
                      }
                      postclose handle
                      use `results', clear expor excel using setmeans.xls firstrow(variables) replace
                      Last edited by Fathima Salih; 13 Apr 2021, 07:23.

                      • #12
                        Maybe something like this (not tested):

                        Code:
                        capture postutil clear
                        tempfile results
                        postfile handle str32 variable float(n1 mu1 n2 mu2 t p n3 mu3 n4 mu4 t2 p2) using `results'
                        
                        foreach var of varlist `myvars1' {
                                 ttest `var', by(group2)
                                 // save the first comparison before the second ttest overwrites r()
                                 local N1 = `r(N_1)'
                                 local mu1 = `r(mu_1)'
                                 local N2 = `r(N_2)'
                                 local mu2 = `r(mu_2)'
                                 local t = `r(t)'
                                 local p = `r(p)'
                                 ttest `var', by(group3)
                                 post handle ("`var'") (`N1') (`mu1') (`N2') (`mu2') (`t') (`p') (`r(N_1)') (`r(mu_1)') (`r(N_2)') (`r(mu_2)') (`r(t)') (`r(p)')
                        }
                        postclose handle
                        use `results', clear
                        export excel using setmeans.xls, firstrow(variables) replace
                        Last edited by Andrew Musau; 13 Apr 2021, 08:33.

                        • #13
                          Yes, let me try out something similar. Many thanks.

                          • #14
                            Could this code store the variable label instead of the variable name in the "variable" column?
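
One possible approach, not taken from any reply in this thread: fetch the label with the -: variable label- extended macro function and post that string. The sketch assumes the auto dataset in place of the tea data, and widens the string column to str80 because variable labels can be up to 80 characters long:

```stata
sysuse auto, clear

// stand-ins for the variables in myvars1
local myvars price mpg weight

capture postutil clear
tempfile results
postfile handle str80 variable float(n1 mu1 n2 mu2 t p) using `results'

foreach var of varlist `myvars' {
    // pick up the variable label
    local lbl : variable label `var'
    // fall back to the variable name if there is no label
    if `"`lbl'"' == "" local lbl `var'
    quietly ttest `var', by(foreign)
    post handle (`"`lbl'"') (r(N_1)) (r(mu_1)) (r(N_2)) (r(mu_2)) (r(t)) (r(p))
}
postclose handle

use `results', clear
list variable mu1 mu2 p
```

The compound double quotes around the label guard against labels that themselves contain quotation marks.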
