  • Using dir for *.txt files, that are then saved as *.dta files

    Hi all,

    Fancy new digs!

    I am trying to execute a loop that pulls in all *.txt files in a folder and then saves each one in another folder as a *.dta file. I am having trouble with the code.

    This is what I have (more or less):

    **** start ado****

    cd "B:\OtherDatabase\folderA"
    local files : dir . files "*.txt"

    foreach f of local files {
    insheet using "`f'", delim("|") names clear //pulls in the .txt file
    * create a bunch of new variables, rename variables etc...
    cd "B:\OtherDatabase\folderB" //changes to a new folder
    save "`f'", replace //saves the file - but in this case, saves it as a *.txt file
    cd "B:\OtherDatabase\folderA" //back to the other folder to read in the next file
    }
    *

    **** end ado *****

    Ideally I would be able to have <save "`f'", replace> save a *.dta file but instead it saves a *.txt file because that is what is stored in `f'.

    So, the issue is I cannot figure out how to, on one hand read only the *.txt files in, while, on the other, save them as *.dta files, while preserving the name represented by the *.

    Any ideas?

    Thanks, in advance,

    Ben

    Ben Hoen
    Staff Research Associate
    Lawrence Berkeley National Laboratory
    Office: 845-758-1896
    Cell: 718-812-7589

  • #2
    Somebody asked exactly this just a few days ago, but regardless the key is something like

    Code:
    local newf : subinstr local f ".txt" ".dta", all
    save "`newf'"
    On a different note: those backslashes will get you sooner or later: http://www.stata-journal.com/sjpdf.h...iclenum=pr0042
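
    Putting the two points together, the loop from the question might look roughly like the sketch below. The folder paths are the ones from the original post, rewritten with forward slashes; the macro names indir, outdir and newf are illustrative only. Building full paths also removes the need to cd back and forth between folders.

    ```stata
    * Sketch only: paths come from the question, written with forward slashes
    local indir  "B:/OtherDatabase/folderA"
    local outdir "B:/OtherDatabase/folderB"

    * list the *.txt files in the input folder
    local files : dir "`indir'" files "*.txt"

    foreach f of local files {
        insheet using "`indir'/`f'", delim("|") names clear
        * ... create and rename variables here ...
        * swap the extension so that save writes a .dta file
        local newf : subinstr local f ".txt" ".dta", all
        save "`outdir'/`newf'", replace
    }
    ```

    Note that on Windows dir returns file names in lowercase unless the respectcase option is added.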

    • #3
      Thanks Nick!

      So here is that post, to close the loop: http://www.statalist.org/forums/foru...oop-on-insheet

      I will take a look at that, and what you wrote (and about the backslashes).

      Thanks,

      Ben

      • #4
        From my perspective this is an extremely useful discussion. Out of curiosity, how difficult would it be to write a command importing all text files to one dataset, equivalent to this R syntax? I should add that parallel execution is not that important from my point of view.
        Kind regards,
        Konrad
        Version: Stata/IC 13.1

        • #5
          You asked about writing a command: do you mean that, or do you just mean writing code that does it?

          At its simplest it's basically the same question as here, with the addition of append or merge or whatever else is appropriate.

          (A metacomment is that expecting experienced Stata users to be highly fluent in R and instantly understand large chunks of R code is no more likely to hit the target than the converse.)

          • #6
            Originally posted by Konrad Zdeb
            From my perspective this is an extremely useful discussion. Out of curiosity, how difficult would it be to write a command importing all text files to one dataset, equivalent to this R syntax? I should add that parallel execution is not that important from my point of view.
            I should probably plug my filelist program from SSC. It is similar to Nick Cox's fs (from SSC) and to the dir extended macro function but creates a dataset of file names instead. It also will scan directories recursively. The help file shows an example of how to get a list of all csv files within a directory (recursively), insheet them, and combine them all into a single Stata dataset:

            Code:
            * get the path and filename for all csv files within the current working directory
            filelist, dir(".") pat("*.csv") save("csv_datasets.dta")
            
            use "csv_datasets.dta", clear
            
            local obs = _N
            forvalues i=1/`obs' {
                use "csv_datasets.dta" in `i', clear
                local f = dirname + "/" + filename
                insheet using "`f'", clear
                tempfile save`i'
                save "`save`i''"
            }
            
            use "`save1'", clear
            forvalues i=2/`obs' {
            append using "`save`i''"
            }

            • #7
              My 2 cents. Assume you have created a local macro with the directory path called dir, that the source directory is the same as the destination directory, and that the text files are comma separated values files:

              Code:
              local fls : dir "`dir'" files "*.txt"
              foreach f in `fls' {
                  import delimited "`dir'/`f'"
                  local d = subinstr("`f'",".csv",".dta",.)
                  save "`dir'/`d'", replace
              }
              Alfonso Sanchez-Penalver

              • #8
                A correction to my prior message, to fit your example better and to fix a mistake:
                Code:
                cd "B:\OtherDatabase\folderA"
                local fls : dir . files "*.txt"
                foreach f in `fls' {
                    insheet using "`f'", delim("|") names clear
                    local f = subinstr("`f'",".txt",".dta",.)
                    save "`f'", replace
                }
                


                Alfonso Sanchez-Penalver

                • #9
                  I know I am being greedy right now, but how difficult would it be to automatically merge all the files into one dataset after saving them? The files have a common id. Also, my apologies for posting links to the chunks of R code.
                  Kind regards,
                  Konrad
                  Version: Stata/IC 13.1

                  • #10
                    merge merges files two at a time. You could write a loop merging files one by one. The difficulty in, or the objection to, that is that it assumes that all will go smoothly.

                    But people often code optimistically to do this. They may then find, say, that it works except that one variable is string in some files and numeric in others. So they then need to catch the difficult cases and fix them before merging.

                    By the way, I don't think that giving R code and asking for a translation is out of order in this forum. The practical point is only that the number of people here fluent in Stata and something else is always going to be much smaller than the number of people here fluent in Stata. Conversely, one person so fluent might be enough to answer.
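
                    For what it is worth, the optimistic loop could be sketched as follows. It assumes that the saved .dta files sit in the current directory, that each holds one observation per value of the identifier, and that the identifier is a variable named id with the same storage type in every file. The variable name and the 1:1 merge type are assumptions, not something stated in this thread.

                    ```stata
                    * Sketch only: assumes one observation per id in every file
                    local files : dir . files "*.dta"

                    local first 1
                    foreach f of local files {
                        if `first' {
                            * the first file becomes the master dataset
                            use "`f'", clear
                            local first 0
                        }
                        else {
                            * merge stops with an error if id is not unique
                            merge 1:1 id using "`f'", nogenerate
                        }
                    }
                    ```

                    If a non-key variable appears in more than one file, merge keeps the values already in memory by default, so clashes of that kind pass silently and need checking by hand.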

                    • #11
                      Thanks all for your insightful comments!
