  • Storing values to a list and expand the list in each iteration

    I would like to store certain values in a list and drop observations whose value of a certain variable matches any value in that list. Here is the code that I am working on:


    Code:
    local files : dir "C:\Users\E.YILMAZ\Desktop\www.thingiverse.com\Strata_Data\splitted_data" files "*.dta"
    cd "C:\Users\splitted_data"
    list = []
    foreach file in `files' {
        cd "C:\Users\splitted_data"
        use  `file', clear
        for each x in list {
           drop if id == `x'
           }
        keep if  over_threshold == 1
        add_new_values_to_list
        save "new_`file'", replace
    }
    Could you please help me on this matter?

  • #2
    Well, you have buried the most important part of what you do, so it becomes impossible to answer your question. The whole key is what you write as "add_new_values_to_list", but you say nothing about how those values are created. So I'll just give you some generic advice.

    First, don't start your list with []. Start it as an empty string. Those brackets will be interpreted by Stata as among the values of `x' and that will surely trigger syntax errors. Also, don't name it list: that is the name of a Stata command and of the macro list functions, so it could be particularly confusing in the code you are going to need to use.

    Next, expanding the contents of a local macro is quite easy. So overall your code will look something like this:

    Code:
    local files: dir whatever
    local mylist  // INITIALIZED AS EMPTY STRING
    cd C:/Users/Splitted_data // NO NEED TO PUT THIS INSIDE THE LOOP; JUST DO IT ONCE
    
    foreach file of local files {
        use `file', clear
        foreach x of local mylist {
            drop if id == `x'
        }
        keep if over_threshold == 1
    
        // INSERT CODE HERE TO DETERMINE WHAT VALUES TO ADD TO THE LIST
        // FOR ILLUSTRATION I WILL ASSUME YOU ARE ADDING X1, X2, X3, and X4
        
        // ADD THEM TO THE LIST
        local mylist `mylist' X1 X2 X3 X4
    
        // REMOVE DUPLICATES FROM LIST
        local mylist: list uniq mylist
       
        save "new_`file'", replace
    }
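
    As a minimal, self-contained illustration of the two macro operations used above (appending to a local macro and removing duplicates), here is a sketch with made-up values. The numbers are placeholders only, not anything from your data:

    ```stata
    * start with an empty local macro
    local mylist
    * append three hypothetical values
    local mylist `mylist' 3 1 2
    * append two more, one of which is already in the list
    local mylist `mylist' 2 4
    * remove duplicates, keeping the first occurrence of each value
    local mylist : list uniq mylist
    * should display: 3 1 2 4
    display "`mylist'"
    ```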



    • #3
      Thank you very much for your advice Clyde, I wanted to add new ids to the list, so I believe
      Code:
      local mylist `mylist' id
      would do the trick



      • #4
        No, that will not do what you want. What that will do is add the character string "id" to local macro mylist. But it is clear from the rest of the code that what you actually want in mylist is some actual numerical values from variable id. You still give no indication of how you will choose which values of id you want to include. But to illustrate the general approach, if you want to include all the values of id that are still surviving at that point in the code, you would do that as:

        Code:
        levelsof id, local(remaining_ids)
        local mylist `mylist' `remaining_ids'
        If you only want to include certain ids that meet some condition you have in mind, add an -if condition- clause to the -levelsof- command.
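
        A small sketch of that variant, run on a made-up three-observation dataset so it can be tried on its own. The condition over_threshold == 1 merely stands in for whatever rule you actually have in mind, and the macro name selected_ids is hypothetical:

        ```stata
        * toy data, for illustration only
        clear
        input id over_threshold
        1 1
        2 0
        3 1
        end

        local mylist
        * collect only the ids of observations that satisfy the condition
        levelsof id if over_threshold == 1, local(selected_ids)
        * append them to the running list
        local mylist `mylist' `selected_ids'
        * should display: 1 3
        display "`mylist'"
        ```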



        • #5
          Indeed, I would like to add all the ids remaining after keep if over_threshold == 1, but for some reason it doesn't add all the ids to the list. The full code is here:

          Code:
          local files: dir whatever
          local mylist  // INITIALIZED AS EMPTY STRING
          cd C:/Users/Splitted_data // NO NEED TO PUT THIS INSIDE THE LOOP; JUST DO IT ONCE
          
          foreach file of local files {
              use `file', clear
              foreach x of local mylist {
                  drop if id == `x'
              }
              keep if over_threshold == 1
          
              // INSERT CODE HERE TO DETERMINE WHAT VALUES TO ADD TO THE LIST
              // FOR ILLUSTRATION I WILL ASSUME YOU ARE ADDING X1, X2, X3, and X4
              
              // ADD THEM TO THE LIST
              levelsof id, local(remaining_ids)
              local mylist `mylist' `remaining_ids'
          
              // REMOVE DUPLICATES FROM LIST
              local mylist: list uniq mylist
              
             
              save "new_`file'", replace
          }
          Last edited by Erdem Yilmaz; 09 Aug 2020, 13:22.



          • #6
            Somehow the code got mangled when you posted it. Please try again and make sure that what shows up after you post has appropriate line breaks and indentation so it is readable.



            • #7
              Sorry, should be ok now



              • #8
                Thanks. I can't see what the problem is. So let's do some diagnostic work:

                Code:
                local mylist  // INITIALIZED AS EMPTY STRING
                cd C:/Users/Splitted_data // NO NEED TO PUT THIS INSIDE THE LOOP; JUST DO IT ONCE
                
                foreach file of local files {
                    use `file', clear
                    foreach x of local mylist {
                        drop if id == `x'
                    }
                    keep if over_threshold == 1
                
                    // INSERT CODE HERE TO DETERMINE WHAT VALUES TO ADD TO THE LIST
                    // FOR ILLUSTRATION I WILL ASSUME YOU ARE ADDING X1, X2, X3, and X4
                   
                    // ADD THEM TO THE LIST
                    display `"`mylist'"'
                    levelsof id, local(remaining_ids)
                    local mylist `mylist' `remaining_ids'
                    display `"`mylist'"'
                
                    // REMOVE DUPLICATES FROM LIST
                    local mylist: list uniq mylist
                    display `"`mylist'"'
                   
                  
                    save "new_`file'", replace
                }
                Try running it as shown above, and review the output so you can see where the failure to add all the desired ids is coming, and what is being omitted. Then post back with summary information, and show an excerpt of the output that illustrates the problem. (Please don't post the output for the full loop over 35 files!)



                • #9
                  At the moment, the code drops observations before the keep command. To be more precise, in the first iteration only the keep command has an effect, because the list is still empty. The list is filled with ids only after the keep command, and those ids are then used to drop observations in the next file.
                  This is probably not intended.
                  The corrected code could look like the code below.
                  An example of your dataset would also be useful to see whether the code can be simplified further.
                  Code:
                  local mylist  // INITIALIZED AS EMPTY STRING
                  cd C:/Users/Splitted_data // NO NEED TO PUT THIS INSIDE THE LOOP; JUST DO IT ONCE
                  
                  foreach file of local files {
                      use `file', clear
                  
                      keep if over_threshold == 1
                  
                      // INSERT CODE HERE TO DETERMINE WHAT VALUES TO ADD TO THE LIST
                      // FOR ILLUSTRATION I WILL ASSUME YOU ARE ADDING X1, X2, X3, and X4
                     
                      // ADD THEM TO THE LIST
                      display `"`mylist'"'
                      levelsof id, local(remaining_ids)
                      local mylist `mylist' `remaining_ids'
                      display `"`mylist'"'
                  
                      // REMOVE DUPLICATES FROM LIST
                      local mylist: list uniq mylist
                      display `"`mylist'"'
                     
                      // Moved this loop from the beginning to here
                      foreach x of local mylist {
                          drop if id == `x'
                      }
                    
                      save "new_`file'", replace
                  }



                  • #10
                    Perhaps #9 is right. The effect of the code in #8 would be to keep the first file intact, except for those observations with over_threshold != 1. The remaining values of id after that would go into the list. For the second file, those observations where id matches anything that was retained in the first file are dropped, along with any others having over_threshold != 1. Then the remaining values of id are added to the list, and so on. So each file in turn is purged of observations where id matches any value of a surviving id in any of the previous files. That seems like a perfectly reasonable, sensible thing one might want to do. Only the O.P. knows if that was the intent.
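
                    To make that behavior concrete, here is a rough sketch that runs the logic of #8 on two tiny made-up datasets held in temporary files. The data values and the macro names f1 and f2 are assumptions for illustration only, and nothing is saved permanently:

                    ```stata
                    * first hypothetical file: ids 1 and 3 are over the threshold
                    clear
                    input id over_threshold
                    1 1
                    2 0
                    3 1
                    end
                    tempfile f1
                    save "`f1'"

                    * second hypothetical file: id 1 reappears, id 4 is new
                    clear
                    input id over_threshold
                    1 1
                    4 1
                    5 0
                    end
                    tempfile f2
                    save "`f2'"

                    local mylist
                    foreach f in f1 f2 {
                        use "``f''", clear
                        * drop ids that survived in any earlier file
                        foreach x of local mylist {
                            drop if id == `x'
                        }
                        keep if over_threshold == 1
                        * add the ids that survive in this file to the list
                        levelsof id, local(remaining_ids)
                        local mylist `mylist' `remaining_ids'
                        local mylist : list uniq mylist
                        display "`mylist'"
                    }
                    ```

                    With these toy data, the first pass keeps ids 1 and 3, and the second pass drops id 1 (already in the list) and keeps only id 4, so the list ends as 1 3 4.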
