  • Why is My Stata for-loop Code Wrong?

    I just want to compute descriptive statistics of the categorical time-invariant variables in the attached longitudinal data file.
    For the sake of efficiency, I used the following for-loop code to compute them.
    However, something is wrong, and I can't find where the mistake is.

    Thanks for your suggestions and help!

    use data.dta,clear
    local a male esl_pgm bil_pgm everspec hispanic
    foreach var of a {
    preserve
    bys id: keep if _n==1
    sum `var'
    restore
    }
    Last edited by smith Jason; 12 Apr 2022, 17:34.

    • #3
      There is no Stata loop that goes -foreach var of a-, where a is a local macro containing a list of variables. You have three ways you can iterate over a local macro that contains a list of variables:

      Code:
      foreach var of local a { ...
      
      OR
      
      foreach var in `a' { ...
      
      OR
      
      foreach var of varlist `a' {
      Now, once you fix that you will get past your syntax error message.
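
A minimal sketch of the three forms side by side, assuming a made-up toy data set (only the variable names come from this thread; the values and the counters n1, n2, n3 are invented for illustration):

```stata
* Toy data: values are hypothetical, not from the attached file
clear
input id male hispanic
1 1 0
1 1 0
2 0 1
2 0 1
3 1 1
end

local a male hispanic

* Form 1: name the macro, without the quote marks
local n1 = 0
foreach var of local a {
    display "of local:   `var'"
    local ++n1
}

* Form 2: expand the macro into a generic list
local n2 = 0
foreach var in `a' {
    display "in:         `var'"
    local ++n2
}

* Form 3: expand the macro and let Stata check each element is a variable
local n3 = 0
foreach var of varlist `a' {
    display "of varlist: `var'"
    local ++n3
}
```

All three visit each variable once. The -of varlist- form additionally fails early if a name in the list is not a variable in the data.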

      But I have a concern about the code inside the loop as well. It seems that you are repeatedly preserving the data set, reducing it to the first observation of each id, summarizing those, and then restoring and repeating for the next variable. All of that disk thrashing, sorting, and deleting is unnecessary. For that matter, you don't need to iterate one variable at a time: you can summarize them all in one fell swoop.

      Code:
      preserve
      by id, sort: keep if _n == 1
      summ `a'
      restore
      And if your data set is really large, you can even avoid the one preserve-restore cycle by doing:

      Code:
      by id, sort: gen byte include_me = (_n == 1)
      summ `a' if include_me
      Added: I have absolutely no idea how this turned into three posts, one of which is an incomplete version.
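
As a rough check that the two approaches agree, here is a sketch on a small invented panel (the data values, the wave variable, and the n_preserve and n_flag macros are assumptions for illustration). Because the variables are time-invariant, it does not matter which row per id is kept.

```stata
* Toy panel: three ids with a varying number of waves (hypothetical values)
clear
input id wave male
1 1 1
1 2 1
1 3 1
2 1 0
2 2 0
3 1 1
end

* Approach 1: temporarily reduce the data to one row per id
preserve
by id, sort: keep if _n == 1
quietly summarize male
local n_preserve = r(N)
restore

* Approach 2: flag one row per id and leave the data untouched
by id, sort: gen byte include_me = (_n == 1)
quietly summarize male if include_me
local n_flag = r(N)

display "N with preserve: `n_preserve'   N with flag: `n_flag'"
```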


      • #4
        Thank you for your response. However, what I need is the frequency and percent table instead of mean and std.


        • #5
          Then instead of -summ- use -tab1-.

          By the way, in the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data. While attaching a .dta file to your post overcomes these limitations, some Forum members will not download attachments from people they do not know.


          • #6
            Thank you for your suggestion! I will pay attention to it next time.


            • #7
              I used the code below to compute them, and it works.
              However, the results are displayed several times, and I don't know why.

              local a male esl_pgm bil_pgm everspec hispanic
              foreach var of local a {
              preserve
              by id, sort: keep if _n == 1
              tab1 `a'
              restore
              }

              Thank you for your help!


              • #8
                The results display several times because that's what you told Stata to do. Look carefully at the various code solutions proposed in #3. When you use a -foreach var...- construct, inside the loop you -summ `var'- (or, as we later learned, -tab1 `var'-), which then processes one variable at a time. But you used -tab1 `a'-, so each time through the loop you tabulated everything in the entire list.
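
To make the difference concrete, here is a small sketch on invented data (the values, the first flag, and the macro k are assumptions for illustration). Inside the loop, `var' holds one name per pass, while `a' always holds the whole list.

```stata
* Toy data: values are hypothetical
clear
input id male hispanic
1 1 0
1 1 0
2 0 1
2 0 1
3 1 1
end

by id, sort: gen byte first = (_n == 1)

local a male hispanic
local k : word count `a'

foreach var of local a {
    * `var' changes on each pass; `a' does not
    display "pass for `var': the whole list is still `a'"
    * One table per pass, so `k' tables in total
    tab1 `var' if first
    * Writing -tab1 `a' if first- here instead would print `k' tables
    * on every pass, that is `k' x `k' tables overall
}
```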


                • #9
                  Thank you! Then, how can I just display the frequency table one time in Stata code?


                  • #10
                    Code:
                    local a male esl_pgm bil_pgm everspec hispanic
                    foreach var of local a {
                    preserve
                    bys id: keep if _n == 1
                    tab1 `var'
                    restore
                    }


                    • #11
                      It still displayed the one-way tabulation for each variable five times.
                      Thank you!


                      • #12
                        Originally posted by Jared Greathouse
                        Code:
                        local a male esl_pgm bil_pgm everspec hispanic
                        foreach var of local a {
                        preserve
                        bys id: keep if _n == 1
                        tab1 `var'
                        restore
                        }
                        Thank you! It works now.


                        • #13
                          So wait, I'm confused: do you want to make one table per variable, or one table that does all of these at once?


                          If you're using a loop, you necessarily want to do one particular action repeatedly. In this case, that's 5 times.

                          As Nick Cox sometimes points out, I think we may have a slight XY problem, where we're more concerned about "using a loop" to do Y instead of just the best way to do Y. Let me summarize, if I may. You begin by saying
                          I just want to compute descriptive statistics of the categorical time-invariant variables in the attached longitudinal data file.
                          My original guess was just to use sum, but then you say
                          I need [] the frequency and percent table instead of mean and std.
                          and okay, fair enough. I'm not at my computer right now, so I'm sorry I can't immediately test this, but when I Google "tab1 stata", I see
                          tab1 produces a one-way tabulation for each variable specified in varlist.
                          So with this in mind, I guess my question is: why isn't the solution just -tab1 male esl_pgm bil_pgm everspec hispanic-? The loop seems to be a little... extreme, no?

                          I'm sorry if I'm being confusing, but I guess I'm confused about what you'd like for the result to be, because if I understand you well, you don't need a loop for this at all, right?
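
A loop-free version might look like the sketch below, which is untested against the attached file and uses invented toy data. It swaps in -egen ... tag()- to mark one observation per id, as an alternative to the _n == 1 flag from #3; the variable name first is an assumption.

```stata
* Toy data: values are hypothetical
clear
input id male hispanic
1 1 0
1 1 0
2 0 1
2 0 1
3 1 1
end

local a male hispanic

* tag() marks exactly one observation in each id group with 1, the rest with 0
egen first = tag(id)

* One call, one frequency table per variable, each based on one row per id
tab1 `a' if first
```

Each table then counts persons rather than person-waves, and nothing is dropped from the data in memory.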


                          • #14
                            Making one table that does all of these at once is what I want.


                            • #15
                              In fact, I want to create the classic Table 1 used in publications. If you google "table1 in Stata", you will find a lot of information about it.
                              But my Stata version is Stata 16 rather than Stata 17.0.
