  • Multiple varlists and foreach

    Hi all,
    I'm trying to produce a few scatterplots with the foreach command. The idea is to have the first variable of varlist1 plotted against the first variable of varlist2, then the second variable of varlist1 against the second element of varlist2, and so on ... So I would like to end up with 4 plots (not 16).
    I couldn't find any hints on how to do this and tried it like this:
    local varlist1 A1 A2 A3 A4
    local varlist2 B1 B2 B3 B4
    foreach var of varlist1 & var2 of varlist2 {
    scatter `var' `var2'
    }

    Stata returns "invalid syntax". Any ideas on how to do this would be highly appreciated!

    Thanks, Tobias



  • #2
    Tobias,

    Try:

    foreach sfx in 1 2 3 4 {
    scatter A`sfx' B`sfx'
    }

    John


    • #3
      John has the right idea. In this example it could be just

      Code:
       
      forval j = 1/4 { 
           scatter A`j' B`j' , name(g`j')   // graph names must be valid Stata names, hence the g prefix
      }
      I added a name() option: otherwise each plot just disappears as the next is drawn.


      • #4
        Tobias,

        This, IMO, is a weakness of Stata compared to SAS (for example): the inability to create an array from a list of variables and index the array numerically. It works fine if your variables have numerical suffixes, as in your example and John's solution. However, if your variables do not have numeric suffixes, you have to resort to some tricks. Here is one workaround (I think I have seen other solutions on Statalist in the past but I don't remember what they are):

        Code:
        sysuse auto, clear
        
        local varlist1 "price mpg headroom trunk"
        local varlist2 "weight length turn displacement"
        
        forvalues i=1/4 {
          local var1 = word("`varlist1'", `i')
          local var2 = word("`varlist2'", `i')
          scatter `var1' `var2'
        }
        One could also use the unab command to populate the varlist1 and varlist2 macros if desired:

        Code:
        unab varlist1 : price-trunk
        unab varlist2 : weight-displacement
        This requires caution, however, to make sure that the number of variables in each list is the same and matches up in the desired fashion. In this example, for instance, there is an extra variable (rep78) in the first list that I don't really want.
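
        A minimal sketch of one way to guard against that kind of mismatch, assuming the same auto data: count the words in each list and stop if the counts differ before looping. The macro names n1 and n2 and the graph names pair1, pair2, ... are only illustrative.

        Code:
        sysuse auto, clear

        unab varlist1 : price mpg headroom trunk
        unab varlist2 : weight-displacement

        * stop with an error if the two lists differ in length
        local n1 : word count `varlist1'
        local n2 : word count `varlist2'
        assert `n1' == `n2'

        forvalues i = 1/`n1' {
          local var1 : word `i' of `varlist1'
          local var2 : word `i' of `varlist2'
          scatter `var1' `var2', name(pair`i', replace)
        }
        The check only catches a difference in length; whether the pairs line up as intended still has to be verified by eye.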

        Regards,
        Joe


        • #5
          Oddly, processing parallel lists was exactly what the now undocumented (really, non-documented) -for- command was designed to do, back in Stata 7 and a few versions earlier.

          It may be a case of toolkits rather than problems, but I don't see such problems often in practice.

          There are other solutions, but they won't please Joe that much. There was some discussion in http://www.stata-journal.com/sjpdf.h...iclenum=pr0009

          Here's one technique:

          Code:
          sysuse auto, clear
          
          tokenize "price mpg headroom trunk"
          local varlist2 "weight length turn displacement"
          
          forvalues i=1/4 {
               local var2 = word("`varlist2'", `i')
               scatter ``i'' `var2'
          }


          • #6
            Thanks everyone! Issue resolved. Since my actual variables do not all have numerical suffixes I used Joe's solution.
            Best, Tobias


            • #7
              Nick,

              "Toolkits" is a good word for what is needed (absent any official solution from Stata). You are right that this problem doesn't arise very often in practice, nor is it that difficult to solve, but when it does come up it would be good to have a readily available toolkit of solutions at hand without having to search the internet or re-invent the wheel. I don't know where or how one would maintain such a thing, but it might be worth a separate discussion.

              Regards,
              Joe


              • #8
                The benchmark for any improvement is that it must seem simpler than (mix of code and pseudocode)

                J is the length of each list

                forval j = 1/`J' {
                    local a : word `j' of `A'
                    local b : word `j' of `B'
                    <other selections>

                    <something involving a, b, ...>
                }


                • #9
                  Originally posted by Joe Canner

                  This, IMO, is a weakness of Stata compared to SAS (for example): the inability to create an array from a list of variables and index the array numerically. ......
                  Joe, arrays are an integral part of Stata, and e.g. the graphics subsystem would be completely impossible without them. You can perfectly well define an array of anything, and index it numerically, whether it is a number, string, point or graph. Unfortunately even the advanced Stata courses shy away from classes, arrays and plugins, preferring to deal with merge, append, and collapse.

                  Some 15 years ago Bill Gould posted an article on SAS-like arrays in Stata, which might also be of interest.

                  Best, Sergiy Radyakin


                  • #10
                    Sergiy (and Nick),

                    Personally, I don't find anything particularly "SAS-like" in the syntax: local x : word `i' of `array'

                    Granted, it is not that difficult to do, once you are exposed to it, but it is not that intuitive and is somewhat clumsy (especially when you get beyond one dimension). I don't know what is going on under the hood, but given that Stata already uses array notation to refer to observations and already uses matrices to great effect, it doesn't seem like it would be that much of a stretch to actually use array/matrix notation in this context.

                    Code:
                    array A[4] var_one var_two var_three var_four
                    forval j=1/4 {
                       replace A[`j']=0   // Stupid example
                    }
                    The problem, I suppose, is how to deal with conflicts with the existing notation for referring to specific observations (var[`i']) and specifying both a specific observation and a specific variable in an array.

                    All that said, I am not going to hold my breath waiting for this change, nor whine (too much) about the lack. My only point is that the workaround is not as intuitive and "SAS-like" as it's made out to be.

                    Regards,
                    Joe


                    • #11
                      Being SAS-like in detail is no more important for Stata than (I guess wildly) being Stata-like is important for SAS. That's not meant to be deprecatory or dismissive in any direction, but it's the way it is.


                      • #12
                        Agreed; Bill Gould said as much at the Stata Conference in New Orleans. On the other hand, if SAS does something in a way that is (arguably) more elegant or intuitive, there is nothing wrong with Stata adopting the idea, simply because it is a good idea, not because SAS does it. There are not many examples of where Stata falls short in this regard, but this is one of them, in my opinion.


                        • #13
                          I barely know SAS from SPSS, but I do know that lookup tables are extremely common in programming, and Stata doesn't really support them at the macro level. This problem comes up often for me; in some contexts, judicious variable naming does the job, otherwise I usually use Joe's solution or create pairings using -char- (to give one variable the characteristic [pair] equal to another). But none of these is as satisfactory as a proper array would be - allowing matrices to contain strings would solve the problem very simply (from the user perspective!).
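
                          A minimal sketch of the -char- pairing, assuming the auto data and a characteristic named pair (the name is arbitrary, and the variable pairings are only for illustration):

                          Code:
                          sysuse auto, clear

                          * attach to each y variable the name of its partner
                          char price[pair] weight
                          char mpg[pair] length
                          char headroom[pair] turn
                          char trunk[pair] displacement

                          foreach y of varlist price mpg headroom trunk {
                              * read the partner back from the characteristic
                              local x : char `y'[pair]
                              scatter `y' `x', name(`y'_`x', replace)
                          }
                          Characteristics are saved with the dataset, so the pairing survives a save and reload, for better or worse.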


                          • #14
                            Mata code:
                            Code:
                              A=tokens("var_one var_two var_three var_four")
                              for(j=1;j<=4;j++) {
                                A[1,j] // or use otherwise
                              }
                            Lookup tables in Stata are called associative arrays:

                            Code:
                              // create population lookup table
                              T=asarray_create()
                              asarray(T,"US",313.9)
                              asarray(T,"UK",63.23)
                              asarray(T,"UA",45.59)
                              // use
                              asarray(T,"US")
                            For example, Tobias could solve his problem with the help of:
                            Code:
                              G=asarray_create()
                              asarray(G,"price","weight")
                              asarray(G,"mpg","rep78")
                              asarray(G,"length","trunk")
                             
                              for(g=1;g<=asarray_elements(G);g++) {
                                y=asarray_keys(G)[g,1]
                                x=asarray(G,y)
                                stata(sprintf("twoway scatter %s %s",y,x))
                              }
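
                            A hedged alternative sketch for the case where the order of the plots matters: as far as I know, asarray_keys() does not promise to return keys in the order they were added, so two parallel string vectors built with tokens() may be safer. The vector names Y and X and the graph names m1, m2, ... are illustrative, and the auto data are assumed.

                            Code:
                              sysuse auto, clear
                              mata:
                              // two parallel row vectors of variable names
                              Y = tokens("price mpg headroom trunk")
                              X = tokens("weight length turn displacement")
                              // abort if the lists differ in length
                              assert(cols(Y) == cols(X))
                              for (j=1; j<=cols(Y); j++) {
                                stata(sprintf("twoway scatter %s %s, name(m%g, replace)", Y[j], X[j]))
                              }
                              end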


                            • #15
                              Originally posted by Joe Canner
                              Sergiy (and Nick),

                              Personally, I don't find anything particularly "SAS-like" in the syntax: local x : word `i' of `array'

                              Granted, it is not that difficult to do, once you are exposed to it, but it is not that intuitive and is somewhat clumsy (especially when you get beyond one dimension). I don't know what is going on under the hood, but given that Stata already uses array notation to refer to observations and already uses matrices to great effect, it doesn't seem like it would be that much of a stretch to actually use array/matrix notation in this context.

                              Code:
                              array A[4] var_one var_two var_three var_four
                              forval j=1/4 {
                                  replace A[`j']=0 // Stupid example
                              }
                              The problem, I suppose, is how to deal with conflicts with the existing notation for referring to specific observations (var[`i']) and specifying both a specific observation and a specific variable in an array.
                              Sergiy mentioned above that Stata does have arrays in the context of class programming. It does, and they can actually be used outside of formal Stata classes. Joe's pseudo-code example above can be accomplished in Stata like this:

                              Code:
                              .A = { "var_one", "var_two", "var_three", "var_four" }
                              forval j=1/4 {
                                  replace `.A[`j']' = 0
                              }
                              See [P] classman for more details -- search within it for 'array'.

                              Note that .A must exist as an array before individual elements can be assigned to it. That is, if you wish to assign things one at a time to it, the syntax would be

                              Code:
                              .A = {}
                              .A[1] = "var_one"
                              .A[2] = "var_two"
                              ...
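
                              As a rough sketch only, the same array syntax could be applied to the original question by keeping the two lists in two class arrays and indexing both with the same counter. The names .Y and .X, the graph names c1, c2, ..., and the use of the auto data are assumptions for illustration.

                              Code:
                              sysuse auto, clear

                              .Y = { "price", "mpg", "headroom", "trunk" }
                              .X = { "weight", "length", "turn", "displacement" }
                              forval j=1/4 {
                                  * element j of each array supplies one pair of variables
                                  scatter `.Y[`j']' `.X[`j']', name(c`j', replace)
                              }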
