
  • Looping over subfolders

    Hello,

    I am a beginner in Stata and have been through many threads on here but have been unable to find the exact help I'm looking for. I hope this is not a repetitive question.

    I have one big folder with 32 subfolders inside it. Each of the subfolders has this name format: "eic2015_01_dta", "eic2015_02_dta", and so on until 32.

    Within each subfolder, I have 2 dta files and 2 do files. The dta file names are formatted as such: "Tr_persona01.dta" and "Tr_vivienda01.dta", the number corresponding to the subfolder it is in. The do files are named "personas.do" and "viviendas.do". They are the same do files in every subfolder (this is just how it downloaded from the original source).

    For each subfolder, I am trying to do several operations. First, I want to run "personas.do" on the persona dta file and save the result as a new dta file. Then I want to run "viviendas.do" on the vivienda dta file and save that as a new dta file. Finally, I want to merge the two new dta files.

    My latest attempts have gone something like this (this is just the first part, where I run the do file on the dta file and save):

    Code:
    cd "."
    local subfolders : dir "." dirs "eic2015_*_dta"
    
    foreach folder in local subfolders {
        use "./`folder'/Tr_vivienda*.dta"
        do viviendas.do
        save "./`folder'/Tr_vivienda_`folder'.dta", replace
    }
    However, it keeps returning the error: file ./local/Tr_vivienda*.dta not found.

    I'm stuck. Am I supposed to be setting a cd that I'm not doing? Am I supposed to be doing a second loop within the loop for each file? How do I tell stata that the name for each dta file will depend on the subfolder it is in?

    Please be gentle, this is my first time using loops!

    Thank you in advance!

  • #2
    LiMaria Lopez, I was successful with this code, but it looks pretty similar to what you tried:

    cd "C:\....."

    local folderpath = "C:\......."

    foreach folder in local folderpath {

    ......

    }

    If you are going through subfolders within folders, you probably need a loop within a loop for the subfolders.


    • #3
      Thanks for the reply, Tom. Any idea what this loop within the loop should look like? Still not working for me.


      • #4
        I see that you cross-posted on Reddit https://www.reddit.com/r/stata/comme...ubdirectories/

        You are asked to tell us about cross-posting.

        I've posted occasionally on Reddit too. The total expertise here on Statalist is ... higher.

        The error message is that you can only use one Stata dataset at a time. You have a wildcard in your use command.

        Your task is quite complicated even for an experienced user, but I would tackle it more like this:

        Code:
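        forval j = 1/32 {
            * zero-pad the index: 1 -> 01, ..., 32 -> 32
            local J : di %02.0f `j'
            cd eic2015_`J'_dta
            foreach v in persona vivienda {
                use Tr_`v'`J', clear
                * runs personas.do, then viviendas.do
                do `v's
                * note: this overwrites the original Tr_ file
                save, replace
            }
            cd ..
        }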
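
        The zero-padding step is what makes the folder and file names line up, so it may help to see it on its own. This is only a small sketch that displays the names rather than opening anything; the range 1/3 is just for illustration.

        ```stata
        * Display the names the loop would build for the first three subfolders.
        forvalues j = 1/3 {
            * %02.0f pads the number to two digits with a leading zero
            local J : display %02.0f `j'
            display "eic2015_`J'_dta/Tr_persona`J'.dta"
        }
        ```

        This should display eic2015_01_dta/Tr_persona01.dta, then the same with 02 and 03. Checking the constructed names this way before adding use and save makes typos in the pattern easier to spot.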





        • #5
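          #4 isn't correct in its diagnosis. The error reported arises because in local is incorrect; it should be of local. With in, foreach loops over the literal words local and subfolders, which is why the file name in the error message starts with ./local/. But I think I fixed the next error that would have bitten.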
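
          For anyone who would rather stay close to the code in #1, here is a rough sketch with of local and without the wildcard in use. It assumes the working directory is the parent folder holding the 32 subfolders and that viviendas.do can be run from there; the clean_ prefix for the saved file is a made-up name, chosen so that the saved file does not match the Tr_vivienda*.dta pattern on a rerun.

          ```stata
          * Assumption: the current directory contains eic2015_01_dta ... eic2015_32_dta.
          local subfolders : dir "." dirs "eic2015_*_dta", respectcase

          foreach folder of local subfolders {
              * Resolve the wildcard to the actual file name(s) in this subfolder,
              * because use accepts only one dataset name at a time.
              local files : dir "`folder'" files "Tr_vivienda*.dta", respectcase
              foreach f of local files {
                  use "`folder'/`f'", clear
                  do "`folder'/viviendas.do"
                  * clean_ is a hypothetical prefix; pick any name you like
                  save "`folder'/clean_`f'", replace
              }
          }
          ```

          The respectcase option keeps the file names in their original case; without it, names are returned in lowercase on Windows. Whether the do-file itself expects a particular working directory is something only its contents can tell you.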


          • #6
            Hi!
            I have a related question about looping over folders. I have a main folder with many subfolders, each containing several yearly databases.
            I would like to loop over all the files in each subfolder, but my code doesn't work.
            The main folder is "C:/Users/survey/Y_2006_on".
            The subfolders all have names of the form XX_Y_2006_on, where only XX varies.


            I have tried two similar approaches; neither has worked.

            Approach 1:
            Code:
            global y_folder         "C:/Users/survey/Y_2006_on"
            global y_subfolders :    dir "${y_folder}" dirs "*_Y_2006_on"
            
            cd "${y_folder}"
            
            foreach subfolder of global y_subfolders {
                import delimited "${y_folder}/`subfolder'/*.csv", clear
                * some data cleaning *
            }
            When I try this, nothing happens at all.

            Approach 2:
            Code:
            global y_folder         "C:/Users/survey/Y_2006_on"
            global y_subfolders :    dir "${y_folder}" dirs "*"
            
            global databases  "*.csv"
            
            
            cd "${y_folder}"
            
            foreach db of global databases {
                import delimited "`db'", clear
                *some data cleaning*
            }
            Error: *.csv not found. The wildcard doesn't seem to be working.

            Any suggestions will be much appreciated!
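
            One possible way to structure this, offered as a sketch rather than a tested solution: import delimited takes a single file name, and Stata does not expand a wildcard stored in a macro, so in Approach 2 the loop sees only the literal text *.csv. The dir extended macro function can build the list of matching files for each subfolder instead. The path and pattern are taken from the post above; the macro names subfolders and csvfiles are invented here, and locals are used in place of globals purely as a matter of taste.

            ```stata
            * Path and subfolder pattern as given in the post.
            local y_folder "C:/Users/survey/Y_2006_on"
            local subfolders : dir "`y_folder'" dirs "*_Y_2006_on", respectcase

            * Quick check: if this list prints empty, the outer loop runs zero times.
            display `"`subfolders'"'

            foreach subfolder of local subfolders {
                * List the csv files in this subfolder, then import them one at a time
                local csvfiles : dir "`y_folder'/`subfolder'" files "*.csv", respectcase
                foreach csv of local csvfiles {
                    import delimited "`y_folder'/`subfolder'/`csv'", clear
                    * some data cleaning
                }
            }
            ```

            If the displayed list is empty, the loop body never executes, which would look like nothing happening at all. Note also that local macros vanish between separate runs, so the whole block needs to be executed in one go from the do-file editor.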
