  • Foreach code is not limiting to varlist

    Hello,

    I am sorry if this is something simple I am missing, and thank you in advance for your thoughts. I have a data set with 50+ variables, and I want to destring 20 of those variables and do not want to do anything to the other variables. When I destring, I also need to ignore(".,"). So I am attempting to write a simple foreach loop and limit this to the 20 variable columns. I have tried multiple different ways of writing this code, but every way I write the foreach loop it cycles through all variables in my data set (which I do not want). Here are some examples of the foreach code attempts I have made. (the varlist is u1 - u20). For some reason every way I attempt this Stata runs through every column and does the operation which is in brackets. I think I must be missing something here, but I just don't see it. Thank you again for your attention here.

    foreach var in varlist u1 u2 u3 u4 u5 u6 u7 u8 u9 u10 u11 u12 u13 u14 u15 u16 u17 u18 u19 u20 {
    destring, replace ignore(".,")
    }


    I have also tried....

    foreach v in u1-u20 {
    destring, replace ignore(".,")
    }

    And also tried....


    foreach v in varlist u1-u20 {
    destring, replace ignore(".,")
    }


    I also tried to use a local macro:

    local vars u1 u2 u3

    foreach var of local vars {
    destring, replace ignore(".,")
    }

  • #2
    The problem is you're not referencing `var' in the destring command, so it defaults to destringing every variable on every iteration of the loop. In any event, you don't need a loop:

    Code:
    destring u1-u20, replace ignore(".,")
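
    If a loop is still wanted, a minimal sketch of the corrected form follows. The toy data (20 string variables plus a hypothetical variable named other) and the macro name v are assumptions for illustration only; the point is that `v' must appear in the destring call.

    ```stata
    * Hypothetical toy data: u1-u20 as strings, plus one unrelated string variable
    clear
    set obs 2
    forvalues i = 1/20 {
        generate str5 u`i' = "1,5"
    }
    generate str3 other = "a.b"

    * Each pass destrings only the variable currently held in `v'
    foreach v of varlist u1-u20 {
        destring `v', replace ignore(".,")
    }

    * other should still be a string; u1-u20 should now be numeric
    describe
    ```

    Note that ignore(".,") strips periods and commas before conversion, so "1,5" is read as 15.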



    • #3
      Is it the same when you use "foreach var of varlist" instead of "foreach var in varlist"?



      • #4
        Everyone's right here. destring will loop for you. Otherwise it's a choice between

        Code:
        foreach v of varlist ...
        or

        Code:
        foreach v of local ...
        or

        Code:
        ...
        or

        Code:
        foreach v in ....
        See https://journals.sagepub.com/doi/pdf...36867X20976340 for a basic tutorial that may help. in and of are distinct keywords and not interchangeable.



        • #5
          -foreach whatever of- must always be followed by one of the keywords -local-, -global-, -varlist-, -newlist-, or -numlist-. Those keywords must then be followed by the name of a local or global macro, a list of variable names in the active data set, a list of names that can be used to create new variables in the active data set, or a valid numlist, respectively. Anything else will precipitate a syntax error. In the particular case of -varlist-, Stata will expand wildcards in the varlist before iterating the loop. (Similarly, abbreviated numlists will be expanded.)

          -foreach whatever in- can be followed by anything. If that anything happens to be a list of variable names in your data set, that's fine, and things will proceed pretty much as if you had used -foreach whatever of varlist-. But if that anything is a list of variable names abbreviated by wildcards, those will not be expanded, and if the subsequent code involves commands that require `whatever' to be the actual name of a variable, you will get errors at that point in the code. Also, in this circumstance, if what follows -in- contains something other than the name of an actual variable (e.g. the keyword varlist), that, too, will trigger an error when you hit the command(s) that expect `whatever' to be a variable name (unless you have a variable in your data set whose name is varlist).

          So -foreach var in u1 u2 u3 u4 u5 u6 u7 u8 u9 u10 u11 u12 u13 u14 u15 u16 u17 u18 u19 u20 {- is OK.
          -foreach var of varlist u1-u20 {- is evidently simpler and easier to both type and understand. -foreach var in u1-u20 {- will lead to an error in any command that needs `var' to be a variable name, because u1-u20 is not a variable name and does not get expanded as if it were using wildcard abbreviation.
          And -foreach var in varlist u1 u2 etc.- will also lead to errors in that situation, unless varlist is actually the name of a variable in your data set.
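
          To make the difference concrete, here is a small sketch. The three toy variables and the macro name v are assumptions for illustration; nothing is destrung, the loops only display what `v' holds on each pass.

          ```stata
          * Hypothetical toy data: three string variables
          clear
          set obs 3
          generate str1 u1 = "1"
          generate str1 u2 = "2"
          generate str1 u3 = "3"

          * -of varlist-: u1-u3 is expanded to u1 u2 u3 before the loop starts,
          * so this displays three lines, one variable name each
          foreach v of varlist u1-u3 {
              display "`v'"
          }

          * -in-: the list is taken literally, so the loop runs once
          * and `v' holds the single token u1-u3
          foreach v in u1-u3 {
              display "`v'"
          }
          ```

          With -in-, any command inside the loop that expects `v' to be one variable name receives u1-u3 instead.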

          Added: Crossed with #4.



          • #6
            Instead of foreach, I suggest using forval, since the variable names are numbered in sequence:

            forval i=1/20 {
            destring u`i', replace
            }
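
            As a sketch of how this behaves when other variables sit between the u variables, the example below builds a hypothetical data set in the order u1 r1 d1 u2 r2 d2 u3 r3 d3 and adds the ignore(".,") option from the original question. The variable values are made up for illustration.

            ```stata
            * Hypothetical toy data with u, r and d variables interleaved
            clear
            set obs 2
            forvalues i = 1/3 {
                generate str5 u`i' = "1,5"
                generate str1 r`i' = "x"
                generate str3 d`i' = "2.0"
            }

            * Only u1, u2, u3 are named, so r* and d* are never touched
            forvalues i = 1/3 {
                destring u`i', replace ignore(".,")
            }

            * u1-u3 should now be numeric; r1-r3 and d1-d3 remain strings
            describe
            ```

            Because the variable name is built from the index, the position of the u variables in the data set does not matter.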



            • #7
              Hello, I wanted to provide a response here after reading through the suggestions here. I also wanted to say thank you for your time and let you know that there is no more follow up required here but I just wanted to leave a response for all of you. It seems that the solution that Thein put forth works the best for my specific data set. I have a feeling that there may be something else going on here though as it just seems that the other proposed suggestions and some of those which I did try should work correctly. I was not actually getting errors as the issue, but the loop was running through more variables than I wanted it to (basically every variable in my data set).

              -Clyde, Nick, Ali, and Selim - unfortunately I seem to have this issue where I loop through all of my variables and attempt to destring regardless of which variation of foreach var in or foreach var in varlist I use (including those suggested above). When I try the variations of foreach var of varlist ... it seems that the loop keeps going to a point where I have to just stop the loop. I am not sure why that is happening. When I try to reference `var' as proposed by Ali I seem to get an invalid syntax error.

              The idea of doing this with destring u1...u20 alone would also work but something that I should have specified better from the beginning is that I have variables in between my u1-u20 variables .. the data set is actually this pattern u1 r1 d1 u2 r2 d2....., so when I just destring u1-u20.. once again tries to destring every variable between those as well. Of course I could just type out destring u1 u2 u3.... but I suppose part of this inquiry was to prepare for the possible scenario where I have u1 -u100 mixed in among other variables.

              In any case, thank you again for your help and time.



              • #8
                the data set is actually this pattern u1 r1 d1 u2 r2 d2....., so when I just destring u1-u20.. once again tries to destring every variable between those as well. Of course I could just type out destring u1 u2 u3.... but I suppose part of this inquiry was to prepare for the possible scenario where I have u1 -u100 mixed in among other variables.
                In addition to Thein Zaw's solution, which is easily adapted to any number of consecutively numbered variables, there are two other approaches you could take:

                1. You can use the -order- command to change the ordering of the variables so that the u variables are a contiguous block. E.g. -order u*, last-. Of course, if you need the variables in the order you have them now, this wouldn't be a good idea.

                2. If u1 through u100 are the only variables that begin with u, you can do -destring u*, replace-.
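
                Before relying on the wildcard, it may help to check what u* actually matches. The sketch below uses -unab- to expand the wildcard into a local macro; the toy data, the extra variable unit, and the macro name ulist are assumptions for illustration.

                ```stata
                * Hypothetical toy data: unit also begins with u
                clear
                input str5 u1 str1 r1 str5 u2 str4 unit
                "1,200" "a" "3.400" "kg"
                end

                * Expand the wildcard into a local macro and inspect it
                unab ulist : u*
                display "`ulist'"

                * If the list contains only the intended variables, the wildcard is safe;
                * here it also picks up unit, so naming the variables explicitly is safer
                destring u1 u2, replace ignore(".,")
                ```

                In this example the display line shows u1 u2 unit, which is the signal that u* is broader than intended.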

                More generally, in Stata, explicit looping is often unnecessary. When thinking about writing a loop, first ask yourself if it could be done with -by- or -runby-. Then consider whether there is a suitable specific command that iterates over a varlist, as -destring- does; there are many such commands in Stata.
                Last edited by Clyde Schechter; 27 May 2021, 23:08.



                • #9
                  Code:
                  destring u*, replace
                  will work regardless of whether the u* are in order.

                  Here's a demonstration:


                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input str1 u3 float frog str1 u2 float(toad newt) str1 u1
                  "3" 42 "2" 3.14159 2.718282 "1"
                  end
                  
                  . destring u*, replace
                  u3: all characters numeric; replaced as byte
                  u2: all characters numeric; replaced as byte
                  u1: all characters numeric; replaced as byte
                  
                  . list
                  
                       +------------------------------------------+
                       | u3   frog   u2      toad       newt   u1 |
                       |------------------------------------------|
                    1. |  3     42    2   3.14159   2.718282    1 |
                       +------------------------------------------+



                  • #10
                    Hello, Thank you for the follow up comments Nick and Clyde... useful information to have!
