  • Using Stata to delete files from folders

    Dear All,

    This may be an odd request, but I am trying to find out whether the following can be done in Stata. I will try to be as descriptive as possible.

    I do not have example data, as I am only trying to understand the general approach.

    Suppose I have a CSV file that contains a string variable called ID, which uniquely identifies each row. The file lists the IDs of all people whose documents (each with the ID number in its file name) may be stored in either Folder 1 or Folder 2. Both folders sit inside Folder 0.

    I would like to know whether I can write code that picks each ID from the CSV, looks for that person's file in either of the two folders, and, if found, deletes it. Once the deletions are done, it should display all the IDs whose files were removed.


  • #2
    Sure, it can be done in Stata. It would look something like this (fs is a community-contributed command; if it is not installed, ssc install fs should fetch it):
    Code:
    local folder1 "your/path/here"
    local folder2 "your/path/here"
    
    import delimited "path/to/csv_file.csv", case(preserve) // keep "ID" uppercase; the default lowercases variable names
    levelsof ID, local(ids)
    foreach id of local ids {
        local id_is_erased 0
        foreach folder in "`folder1'" "`folder2'" {
            fs "`folder'/*`id'*"
            if "`r(files)'" != "" {
                local id_is_erased 1
                foreach file in `r(files)' {
                    local erase_list `"`erase_list' "`folder'/`file'""'
                    erase "`folder'/`file'"
                }
            }
        }
        if `id_is_erased' == 1 local ids_erased "`ids_erased' `id'"
    }
    
    di "The following files were erased:"
    foreach file of local erase_list {
        di "`file'"
    }
    
    di "Files were removed with the following id\'s:"
    foreach id of local ids_erased {
        di "`id'"
    }
    Last edited by Wouter Wakker; 26 Oct 2020, 11:16.


    • #3
      This looks really nice! I do have a follow-up question, though.

      What if the file name contains more than just the ID? How would I match it? I was thinking of using strpos, but I am not sure how that would fit into this loop. If you have a better suggestion, feel free to let me know.


      • #4
        The line
        Code:
        fs "`folder'/*`id'*"
        matches anything that contains the id. The "*" is a wildcard which means anything (or nothing). Is that what you want?
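
        To check what a pattern would match before deleting anything, a minimal sketch along these lines may help. It uses Stata's built-in dir extended macro function instead of fs; the folder path and the ID 10001 are placeholders for illustration, not values confirmed in this thread.
        Code:
        // placeholder folder and ID, for illustration only
        local folder "./folder1"
        local id 10001
        
        // list every file in the folder whose name contains the ID
        local matches : dir "`folder'" files "*`id'*"
        foreach file of local matches {
            display "`file'"
        }
        Because the pattern matches any name that contains the ID as a substring, an ID such as 10001 would also match a file named 110001.pdf, so it is worth inspecting the list before erasing.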
        Last edited by Wouter Wakker; 26 Oct 2020, 13:16.


        • #5
          Yes, this is it. I will give it a try and let you know later. Thank you so much.


          • #6
            Wouter Wakker, I hope you are well. I tried running your code, but nothing gets deleted. This is the code I used:

            Code:
            cd ".../Test"
            
            local folder1 "./folder1"
            local folder2 "./folder2"
            
            import delimited "./remove_from_blank.csv", case(preserve) encoding(UTF-8) clear
            levelsof ID, local(ids)
            foreach id of local ids {
                local id_is_erased 0
                foreach folder in "`folder1'" "`folder2'" {
                    fs "`folder'/*`id'*"
                    if "`r(files)'" != "" {
                        local id_is_erased 1
                        foreach file in `r(files)' {
                            local erase_list `"`erase_list' "`folder'/`file'""'
                            erase "`folder'/`file'"
                        }
                    }
                }
                if `id_is_erased' == 1
                local ids_erased "`ids_erased' `id'"
            }
            
            di "The following files were erased:"
            foreach file of local erase_list {
                di "`file'"
            }
            
            di "Files were removed with the following id\'s:"
            foreach id of local ids_erased {
                di "`id'"
            }
            I think the problem may be in the following 2 lines of code:

            Code:
                            local erase_list `"`erase_list' "`folder'/`file'""'
                            erase "`folder'/`file'"
            It would be really helpful if you could let me know how to fix it. I have attached a folder with example files and the .csv file for reference at the following link:

            https://drive.google.com/file/d/1b7I...ew?usp=sharing


            • #7
              Sorry, the mistake was here:
              Code:
              if "`r(files)'" != ""
              which should be
              Code:
              if `"`r(files)'"' != ""
              Because of this mistake, nothing inside the if block was executed.

              Another thing: I see that you made changes to this line:
              Code:
              if `id_is_erased' == 1 local ids_erased "`ids_erased' `id'"
              It was okay to put it on one line as I originally posted in #2, since only one command is being executed. If you want to make it a multiline statement, you should use braces {}.
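
              A minimal sketch of both points, assuming (as the fix above suggests) that r(files) holds file names wrapped in double quotes. The local files below is a made-up stand-in for r(files), not actual output from fs, and the ID 1234 is a placeholder.
              Code:
              // stand-in for r(files): two file names, each in double quotes
              local files `""1234.txt" "1111.txt""'
              
              // compound double quotes stop the inner quotes from ending the string early
              if `"`files'"' != "" {
                  display "at least one file matched"
              }
              
              // multiline form of the second if statement, using braces
              local id_is_erased 1
              if `id_is_erased' == 1 {
                  local ids_erased "`ids_erased' 1234"
              }
              display "`ids_erased'"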
              Last edited by Wouter Wakker; 27 Oct 2020, 03:07.


              • #8
                Thank you for the insight. However, I am not sure why, but the code still does not delete any files.

                I noticed that when I display the local folder1 or folder2, it shows as empty. I used the following code and got the following output:

                Code:
                local folder1 "C:/Users/Fahad Mirza/Desktop/Test/folder1"
                local folder2 "C:/Users/Fahad Mirza/Desktop/Test/folder2"
                
                import delimited "C:\Users\Fahad Mirza\Desktop\Test\remove_from_blank.csv", case(preserve) encoding(UTF-8) clear
                levelsof ID, local(ids)
                foreach id of local ids {
                    local id_is_erased 0
                    foreach folder in "`folder1'" "`folder2'" {
                        fs "`folder'/*`id'*"
                        if `"`r(files)'"' != "" {
                            local id_is_erased 1
                            foreach file in `r(files)' {
                                 local erase_list `"`erase_list' "`folder'/`file'""'
                                erase "`folder'/`file'"
                            }
                        }
                    }
                    if `id_is_erased' == 1 local ids_erased "`ids_erased' `id'"
                
                }
                
                di "The following files were erased:"
                foreach file of local erase_list {
                    di "`file'"
                }
                
                di "Files were removed with the following id\'s:"
                foreach id of local ids_erased {
                    di "`id'"
                }
                
                
                
                
                . do "C:\Users\FAHADM~1\AppData\Local\Temp\STD3a30_000000.tmp"
                
                . local folder1 "C:\Users\Fahad Mirza\Desktop\Test\folder1"
                
                . local folder2 "C:\Users\Fahad Mirza\Desktop\Test\folder2"
                
                . 
                . import delimited "C:\Users\Fahad Mirza\Desktop\Test\remove_from_blank.csv", case(preserve) encoding(UTF-8) clear
                (1 var, 5 obs)
                
                . levelsof ID, local(ids)
                10001 10002 10003 10004 10005
                
                . foreach id of local ids {
                  2.     local id_is_erased 0
                  3.     foreach folder in "`folder1'" "`folder2'" {
                  4.         fs "`folder'/*`id'*"
                  5.         if `"`r(files)'"' != "" {
                  6.             local id_is_erased 1
                  7.             foreach file in `r(files)' {
                  8.                  local erase_list `"`erase_list' "`folder'/`file'""'
                  9.                 erase "`folder'/`file'"
                 10.             }
                 11.         }
                 12.     }
                 13.     if `id_is_erased' == 1 local ids_erased "`ids_erased' `id'"
                 14. 
                . }
                
                . 
                . di "The following files were erased:"
                The following files were erased:
                
                . foreach file of local erase_list {
                  2.     di "`file'"
                  3. }
                
                . 
                . di "Files were removed with the following id\'s:"
                Files were removed with the following id\'s:
                
                . foreach id of local ids_erased {
                  2.     di "`id'"
                  3. }
                
                . 
                end of do-file
                Is it because the folder locals are not working?


                • #9
                  Wouter Wakker, I was also wondering whether the command rmfiles, by Lars Angquist, could be used to run a smaller loop and remove the files. Have you ever used it?


                  • #10
                    Just an update: I tried this loop and it seems to work. However, I would really appreciate your input on how I can display the list of deleted files, and on what else I can do to make this loop more general.

                    Code:
                    local folder1 "C:/Users/Fahad Mirza/Desktop/Test/folder1"
                    local folder2 "C:/Users/Fahad Mirza/Desktop/Test/folder2"
                    
                    levelsof ID, local(ids)
                    foreach id of local ids {
                        foreach folder in "`folder1'" "`folder2'" {
                            rmfiles, folder("`folder'") match("*`id'*")
                        }
                    }
                    Last edited by Fahad Mirza; 27 Oct 2020, 05:04. Reason: Forgot to add quotation marks around *`id'*


                    • #11
                      I'm not familiar with rmfiles, so maybe someone else can comment on that.
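
                      For displaying the deleted files without relying on rmfiles, one possible sketch uses only built-in commands: the dir extended macro function to find matches and erase to delete them. It assumes the ID variable is already in memory and reuses the folder paths from #10; whether rmfiles itself returns the names it deletes is not established in this thread.
                      Code:
                      local folder1 "C:/Users/Fahad Mirza/Desktop/Test/folder1"
                      local folder2 "C:/Users/Fahad Mirza/Desktop/Test/folder2"
                      
                      // start with an empty list of erased files
                      local erased
                      levelsof ID, local(ids)
                      foreach id of local ids {
                          foreach folder in "`folder1'" "`folder2'" {
                              // collect the matching file names before deleting them
                              local matches : dir "`folder'" files "*`id'*"
                              foreach file of local matches {
                                  erase "`folder'/`file'"
                                  local erased `"`erased' "`folder'/`file'""'
                              }
                          }
                      }
                      
                      display "The following files were erased:"
                      foreach file of local erased {
                          display "`file'"
                      }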

                      Regarding the code in #2, the following works for me:
                      Code:
                      . // Setup
                      . clear
                      
                      . set obs 1
                      number of observations (_N) was 0, now 1
                      
                      . gen foo = .
                      (1 missing value generated)
                      
                      . 
                      . cd ".../Desktop/Test" // Set root directory
                      ...\Desktop\Test
                      
                      . mkdir Folder1
                      
                      . mkdir Folder2
                      
                      . 
                      . local folder1 "./Folder1"
                      
                      . local folder2 "./Folder2"
                      
                      . 
                      . // Create some documents
                      . export delimited "`folder1'/1234.txt"
                      file ./Folder1/1234.txt saved
                      
                      . export delimited "`folder2'/4321.txt"
                      file ./Folder2/4321.txt saved
                      
                      . export delimited "`folder2'/1111.txt"
                      file ./Folder2/1111.txt saved
                      
                      . 
                      . 
                      . local ids "1234 1111"
                      
                      . foreach id of local ids {
                        2.     local id_is_erased 0
                        3.     foreach folder in "`folder1'" "`folder2'" {
                        4.         fs "`folder'/*`id'*"
                        5.         if `"`r(files)'"' != "" {
                        6.             local id_is_erased 1
                        7.             foreach file in `r(files)' {
                        8.                  local erase_list `"`erase_list' "`folder'/`file'""'
                        9.                 erase "`folder'/`file'"
                       10.             }
                       11.         }
                       12.     }
                       13.     if `id_is_erased' == 1 local ids_erased "`ids_erased' `id'"
                       14. 
                      . }
                      1234.txt
                      1111.txt
                      
                      . 
                      . di "The following files were erased:"
                      The following files were erased:
                      
                      . foreach file of local erase_list {
                        2.     di "`file'"
                        3. }
                      ./Folder1/1234.txt
                      ./Folder2/1111.txt
                      
                      . 
                      . di "Files were removed with the following id\'s:"
                      Files were removed with the following id\'s:
                      
                      . foreach id of local ids_erased {
                        2.     di "`id'"
                        3. }
                      1234
                      1111


                      • #12
                        Thank you so much. In my case, I needed to add a forward slash within the folder local before running fs (not sure why it wasn't working otherwise), but after that it worked. rmfiles worked too, so I am glad you helped out with this code. Thanks once again!
