
  • Error while using _N in forvalues loop

    I want to use _N in place of the total count of observations in a loop like this:

    forvalues i = 1(1)_N {

    }
    instead of directly putting the number. But I'm getting an "Invalid syntax" error. Can anyone help me with this or suggest any alternative?

  • #2
    You want to force Stata to evaluate the expression:

    Code:
    forvalues i = 1(1)`=_N' {
        // loop body goes here
    }
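
    A minimal sketch of why this works, assuming Stata's shipped auto dataset as stand-in data (any dataset in memory would do): `=_N' is replaced by the value of the expression before forvalues parses its range, so the loop sees a literal number rather than the name _N.

    ```stata
    sysuse auto, clear            // example data shipped with Stata
    display `=_N'                 // the macro is expanded to a number before display runs
    local total 0
    forvalues i = 1(1)`=_N' {
        local total = `total' + 1 // count the iterations
    }
    display `total'               // same value as _N: one iteration per observation
    ```

    Note that the expression is evaluated once, when the forvalues line is read, so adding or dropping observations inside the loop would not change the upper limit.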


    • #3
      Hi Rohit,

      The following should work:

      Code:
      local tot = _N
      
      forvalues i=1(1)`tot' {
      dis("`i'")
      
      }
      Best,
      Rhys


      • #4
        Thanks a lot, Andrew and Rhys. Both the solutions work.


        • #5
          Thank you, Andrew and Rhys. Will this work with a bysort command, meaning that a loop is run for each group of observations in a larger dataset with groups identified by some group variable?


          • #6
            Backing up here, the implied loop is a loop over observations from 1 to _N (or its evaluation). That's rarely needed in Stata, however. Such a loop is usually automatic as in many examples such as

            Code:
            generate ln_price = ln(price)
            So, please expand #5 to give more details of the problem you're trying to solve with concrete details on real(istic) data and the code you have so far.
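
            On the bysort question in #5, a small sketch may help. Under by:, both _n and _N are interpreted within each group, so no explicit loop over observations is needed. The data and the variable names id, visit, seq and nvisit below are made up purely for illustration.

            ```stata
            clear
            input id visit
            1 1
            1 2
            1 3
            2 1
            2 2
            end
            sort id visit
            by id: generate seq    = _n   // position of the record within its group
            by id: generate nvisit = _N   // number of records in that group
            list, sepby(id)
            ```

            Here nvisit holds the group size on every row of the group, which is the "minor" _N within each subgroup rather than the total number of observations.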


            • #7
              Hi, why not count the total observations first and then loop up to r(N)? The result has to be evaluated inside a macro expansion:

              Code:
              count
              forval i = 1/`=r(N)' {
                  commands
              }


              • #8
                Thank you, Nick
                So far I do not have any code, I am trying to figure out a solution to this problem:
                I have a dataset with up to 5 observations on each of a large number of patients. I need to collapse these to one record per patient to be able to merge it with some other datasets. Reshaping from long to wide would complicate matters. Many of the observations are dichotomous 1/0 (or .) and may be collapsed by egen max() or egen total(), but a few of the variables have several levels coded as integers. I would like to preserve these individual multilevel observations despite collapsing to one record. I was thinking of creating new variables mirroring the values but as strings instead of integers. And then concatenating the original observations into a single observation per patient containing all the original content as a string. By sorting on Patient ID I might run a loop for each patient from 1 to _N, concatenating the "tostring'ed" content of the variable by using a simple plus sign. I just don't know if it will work - all posts on the subject here concern the "major" _N, not the _N within a smaller subgroup of records.
                I acknowledge that this is not a very exact question, and that the above solution is somewhat "Rube Goldberg"-like. Perhaps there is a simpler and smarter way?
                Thank you for any help, and please forgive me
                Hans


                • #9
                  It would be nice to know which variable you are collapsing by. Unless you want a count of a certain outcome, I don't understand the idea of collapsing individual data in what appear to be repeated measures.


                  • #10
                    Thank you, Fredrick. It's not really repeated measures, but rather samples collected at different occasions from the same patient. The individual observations are categorical data (e.g. "cancer", "not cancer", "doubtful", "slightly suspicious", "highly suspicious" etc). To analyze my data, I need to collapse all observations to one record per patient, but at the same time preserve some of the information from each of the original observations for a later, more detailed analysis.


                    • #11
                      If you want one row per person and keep the information, you have to reshape wide. There is no other way to comply with both conditions.
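
                      A minimal sketch of such a reshape, using made-up data. The names ID, recno and Var are placeholders for the patient identifier, the record number within patient, and the variable of interest; they are assumptions, not taken from the real dataset.

                      ```stata
                      clear
                      input ID recno Var
                      1 1 3
                      1 2 1
                      2 1 2
                      2 2 2
                      2 3 4
                      end
                      // one row per ID; Var is spread over Var1, Var2, Var3
                      reshape wide Var, i(ID) j(recno)
                      list
                      ```

                      Patients with fewer records than the maximum simply get missing values in the trailing Var columns.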
                      ---------------------------------
                      Maarten L. Buis
                      University of Konstanz
                      Department of history and sociology
                      box 40
                      78457 Konstanz
                      Germany
                      http://www.maartenbuis.nl
                      ---------------------------------


                      • #12
                        I am afraid so, Maarten. But I am still struggling. Thank you for your advice.


                        • #13
                          Problem solved.

                          ID = patient ID
                          recno = record number (1-5 for each patient)
                          Var = interesting variable with up to 5 observations per patient in long format

                          The up to 5 records per patient must be collapsed to one record while preserving the information encoded in each of the patient's observations for later use.

                          Solution:

                          Code:
                          tostring Var, gen(strVar)
                          sort ID recno
                          by ID: gen aggregatedVar=strVar if _n==1
                          by ID: replace aggregatedVar=strVar[_n]+" "+strVar[_n-1] if _n>1
                          by ID: replace aggregatedVar=strVar[_n]+" "+aggregatedVar[_n-1] if _n>2
                          by ID: replace aggregatedVar=aggregatedVar[_N]
                          Simple and robust.

                          Thank you all for your help.


                          • #14
                            Even simpler:

                            Code:
                            tostring Var, gen(strVar)
                            sort ID recno
                            by ID: gen aggregatedVar=strVar if _n==1
                            by ID: replace aggregatedVar=strVar[_n]+" "+aggregatedVar[_n-1] if _n>1
                            by ID: replace aggregatedVar=aggregatedVar[_N]


                            • #15
                              If you want to retain the numeric character of the original variable, you could also do something like:

                              Code:
                              sort ID recno
                              forval i = 1/5 {
                                  by ID: gen Var_`i' = Var[`i']
                              }
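
                              A sketch of how this could be followed through to one record per patient, using made-up data with at most 3 records per patient (hence 1/3 rather than 1/5). Keeping the first row of each group and dropping recno and Var afterwards is an assumption about the desired end result.

                              ```stata
                              clear
                              input ID recno Var
                              1 1 3
                              1 2 1
                              2 1 2
                              2 2 2
                              2 3 4
                              end
                              sort ID recno
                              forval i = 1/3 {
                                  // under by:, the subscript counts within the group;
                                  // it yields missing when the group has fewer records
                                  by ID: gen Var_`i' = Var[`i']
                              }
                              by ID: keep if _n == 1    // keep one row per patient
                              drop recno Var
                              list
                              ```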
                              Last edited by Hemanshu Kumar; 24 Jun 2024, 02:54.
