
  • Problem connecting levelsof, local() to foreach: losing double quotes?

    I am trying to save a list of names and then loop through those names, using the code below. However, 1 or maybe 2 things seem to be going wrong.

    Code:
    levelsof names, local(names)
    di `names'
    
    foreach lname of local `names' {
        di "this loop is working"
    }
    After the first (levelsof) line, output looks (correctly) like this:
    `"ABBITT, CHANCE M"' `"ABBOTT-HALL, RILEY"' `"ABDALLA, JOSEPH M"' `"ABEL, KEITH J"'...

    After the second (di `names') I get this, where it looks like the double quotes have been lost (colors added for emphasis) and the names have been merged:
    ABBITT, CHANCE MABBOTT-HALL, RILEYABDALLA, JOSEPH MABEL, KEITH JACKERMAN...

    Though maybe that's just poor output, and the names ARE being maintained, because my loop gets this error:

    _"ABBITT, CHANCE M invalid name

    So, problematic, because the loop is not running. However, at least it IS referencing the 1st name of the sequence I hope I'm saving (`"ABBITT, CHANCE M"').

    So maybe the sequence is fine and I'm merely doing something wrong with the loop? I did try foreach lname of local "`names'" {} with no greater success.
    Thanks!

  • #2
    Code:
    help quotes##double
    Code:
    local names1 `" `"ABBITT, CHANCE M"' `"ABBOTT-HALL, RILEY"' `"ABDALLA, JOSEPH M"' `"ABEL, KEITH J"' "'
    local names2  `" "ABBITT, CHANCE M" "ABBOTT-HALL, RILEY" "ABDALLA, JOSEPH M" "ABEL, KEITH J" "'
    display `" `names1' "'
    display `" `names2' "'
    Res.:

    Code:
     display `" `names1' "'
      `"ABBITT, CHANCE M"' `"ABBOTT-HALL, RILEY"' `"ABDALLA, JOSEPH M"' `"ABEL, KEITH J"'  
    
    . 
    . display `" `names2' "'
      "ABBITT, CHANCE M" "ABBOTT-HALL, RILEY" "ABDALLA, JOSEPH M" "ABEL, KEITH J"
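To tie this back to #1, here is a minimal sketch with a shortened, two-name list standing in for what levelsof stores (the shortened list is an assumption for illustration). Without outer quotes, display receives each name as a separate string and prints them back to back, which matches the merged output in #1. The names are still intact in the local.

```stata
* Shortened stand-in for the list that levelsof stores in the local
local names `" `"ABBITT, CHANCE M"' `"ABEL, KEITH J"' "'

* Expands to two separate quoted strings, which display prints back to back
display `names'

* Wrapping the reference in compound double quotes passes one string instead
display `"`names'"'
```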
    Last edited by Andrew Musau; 22 Feb 2024, 11:33.



    • #3
      Code:
      foreach lname of local `names'
      is not typically what any code needs.

      Code:
      foreach lname of local names
      is almost always natural.
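A minimal sketch of the difference, using four of the names from #1 as toy data (the input block is an assumption for illustration). foreach ... of local expects the name of the macro, so writing `names' hands it the macro's contents instead, which appears to be where the "invalid name" error in #1 comes from.

```stata
clear
input str100 names
"ABBITT, CHANCE M"
"ABBOTT-HALL, RILEY"
"ABDALLA, JOSEPH M"
"ABEL, KEITH J"
end

levelsof names, local(names)

* Give foreach the macro NAME (names), not its contents (`names')
foreach lname of local names {
    * Compound double quotes keep each name together as one string
    display `"`lname'"'
}
```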

      That said, I first wrote levelsof as levels and had to push a bit to get it adopted. Now there is a melancholy pleasure in seeing it used in circumstances where I wouldn't use it at all. Something like


      Code:
      egen long id = group(names), label 
      
      su id, meanonly 
      
      forval g = 1/`r(max)' { 
      
      }
      is often much easier to work with. See also Method 1 [NB!] of Stata | FAQ: Making foreach go through all values of a variable



      • #4
        Thanks Andrew and Nick.

        Andrew -- I could specify the locals myself, yes, but there are actually 2,331 names in this list. So I need the list to be created by a command, hence my use of levelsof.

        Nick -- I am afraid that I need a list of strings not coded values, or at least, as I'm currently planning my work flow I need that.

        Background: My overarching goal is to loop through each unique name listed within a variable that lists hundreds of thousands of combinations of names and IDs. You can see a fake data example of what the original, combined-names-and-ids variable looks like here: https://www.statalist.org/forums/for...at-extractions

        I called it "FAKE DATA EXAMPLE", and Andrew and others helped me figure out how to create a new dataset with a variable that lists each unique name in a row. Great -- it's 2,331 names.

        The reason I want the unique name list is that I have a set of commands I want to run on all observations containing EACH name in the original dataset. My plan was to run code like the following, where uniquenames is the variable holding the list of 2,331 unique names and names is the variable holding combinations of names and IDs in the original dataset:

        Code:
        use  "DatasetOfUniqueNames.dta", clear
        levelsof uniquenames, local(unames)
        
        use  "OriginalDataset.dta", clear
        foreach lname of local unames {
            if strpos(names, "`unames'") > 0 {
                 *set of commands I want to run on each observation that mentions a given name, to be done for each of the 2,331 names.
            }
        }
        Last edited by Leah Bevis; 22 Feb 2024, 12:38.



        • #5
          Whoops, I realized that I'm writing syntax that's impossible there, but the concept is right. Would actually look like:
          Code:
          use  "DatasetOfUniqueNames.dta", clear
          levelsof uniquenames, local(unames)
          
          use  "OriginalDataset.dta", clear
          foreach lname of local unames {
               COMMAND1 if strpos(names, "`lname'") > 0 
               COMMAND2 if strpos(names, "`lname'") > 0 
               COMMAND3 if strpos(names, "`lname'") > 0 
          }
          Last edited by Leah Bevis; 22 Feb 2024, 14:31.



          • #6
            In similar applications, I have found the options in levelsof very useful for creating inputs for Stata's regular expression functions. I strip off the quotes using -clean- and separate the names using pipes.

            Code:
            clear
            input str100 uniquenames
            "ABBITT, CHANCE M"
            "ABBOTT-HALL, RILEY" 
            "ABDALLA, JOSEPH M" 
            "ABEL, KEITH J"
            end
            
            levelsof uniquenames, local(unames) sep(|) clean
            if ustrregexm("JOHN, P. CHANCE", "(`unames')"){
                display "found JOHN" 
            }
            if ustrregexm("ABEL, KEITH J", "(`unames')"){
                display "found ABEL" 
            }
            Res.:

            Code:
            . levelsof uniquenames, local(unames) sep(|) clean
            ABBITT, CHANCE M|ABBOTT-HALL, RILEY|ABDALLA, JOSEPH M|ABEL, KEITH J
            
            . 
            . if ustrregexm("JOHN, P. CHANCE", "(`unames')"){
            . 
            .     display "found JOHN" 
            . 
            . }
            
            . 
            . if ustrregexm("ABEL, KEITH J", "(`unames')"){
            . 
            .     display "found ABEL" 
            found ABEL
            . 
            . }
            
            .



            • #7
              The comments in #4 don't undermine the suggestion in #3 to loop over integers. The point of the label option to egen, group() is that the names can be looked up as value labels within the loop.
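A sketch of that lookup, assuming toy data with four of the names from #1. The macro name thisname is hypothetical, and count stands in for whatever commands the real loop would run.

```stata
clear
input str100 names
"ABBITT, CHANCE M"
"ABBOTT-HALL, RILEY"
"ABDALLA, JOSEPH M"
"ABEL, KEITH J"
end

* Integer id 1, 2, ... with the original strings attached as value labels
egen long id = group(names), label

summarize id, meanonly

forvalues g = 1/`r(max)' {
    * Look up the string that belongs to integer code `g'
    local thisname : label (id) `g'
    display `"group `g': `thisname'"'

    * Placeholder for the real per-name commands
    count if id == `g'
}
```

Inside the loop the name is available as a string in `thisname' whenever it is needed, while the if conditions use the integer code.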

              I sense that the code as a whole is (much) more complicated than you're showing us and that you're just asking about what is stumping you. That's fine in itself, but it's also true that it is hard to advise when the inside of the loop is a black box. It's not even obvious that you need to loop at all.

