  • Looping over csv files

    Hi everyone, I am trying to loop over a bunch of csv files, make some changes and save them as csv files.
    Currently I have this code:
    local files : dir "C:\Users\mydir" files "*.csv"

    cd "C:\Users\mydir"

    foreach file of local files {
    import delimited `file', clear
    //do something
    export delimited using "`file'", replace
    }

    It gives me an error which I believe is because the file names have spaces in them. Could you help me fix this, please?

    Also, if I wanted to save them under a different name, rather than overwriting, how could I do that? Would something like
    export delimited using "`file'_new" work?

    Many thanks for your help.

    Regards,
    Laura Cojocaru

  • #2
    Try

    Code:
    import delimited "`file'", clear
    The problem with your other code is, I guess, that it would lead to filenames like foo.csv_new.csv although I haven't tried it. At a guess, this is closer to what you seek:

    Code:
    foreach file of local files {
    import delimited "`file'", clear
    //do something
    local new : subinstr local file ".csv" "_new.csv", all 
    export delimited using "`new'", replace
    }



    • #3
      Try -import delimited `"`file'"', clear- at the top of the loop. It would also be safer to use compound double quotes around `file' in the -export delimited- command. Saving as "`file'_new" probably won't work because if `file' is "this_file.csv", then `file'_new will read as "this_file.csv_new", which is not what you want.

      The simplest way to save under a new name would be -export delimited using `"new_`file'"'-. If you really want the "new" to be at the end of the filename (but, obviously, before .csv), then you could do:

      Code:
      local newfilename: subinstr local file ".csv" "_new.csv"
      export delimited using `"`newfilename'"', replace



      • #4
        Many thanks for both your replies.

        I tried both options but I am still getting the same error as before when I was trying to simply replace the original file without changing the name. The error is: using required.

        local files : dir "mydir" files "*.csv"

        cd "mydir"

        foreach file of local files {
        import delimited `file', clear
        //do something
        local newfilename: subinstr local file ".csv" "_new.csv"
        export delimited using `"`newfilename'"', replace
        }


        Regards,
        Laura Cojocaru



        • #5
          Are you using an old version of Stata? (Or not the latest update of the current version?) At one point, -import delimited- did require -using- before the filename. But that is not the case in current Stata.

          That said, you still don't have any quotes around `file' in your -import delimited- statement. If the filename itself contains blank spaces or has ordinary quotes (") around it, you may be confusing the parser and getting an uninformative error message about what is, nevertheless, incorrect syntax. Try -import delimited `"`file'"', clear-, as suggested earlier by both Nick and me.
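
          For reference, the whole loop with the quoting fixed might look like the sketch below. The directory is the placeholder from the original post, written with forward slashes, and the optional -using- is spelled out so that older versions of -import delimited- should accept it as well.

          Code:
          cd "C:/Users/mydir"                     // placeholder directory from the original post
          local files : dir "." files "*.csv"     // list the csv files in the current directory
          foreach file of local files {
              // compound double quotes keep a filename with spaces together as one token
              import delimited using `"`file'"', clear
              // do something
              local newfilename : subinstr local file ".csv" "_new.csv"
              export delimited using `"`newfilename'"', replace
          }

          Note that a second run would also pick up the *_new.csv files created by the first, since they match the same pattern.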



          • #6
            I have Stata 14, although I have not updated it in a while.
            It now works great with the quotes. I was convinced it was about the export part.
            To tell the truth I do not understand what the outermost ` ' does.
            Thank you very much! This was very helpful!
            Regards,
            Laura Cojocaru



            • #7
              To tell the truth I do not understand what the outermost ` ' does.
              By using `"..."' instead of " ", you make it possible to refer to a string literal that may itself contain quotation marks. Some of the macros returned by Stata commands have quotes in them. And sometimes you have a string variable whose value(s) may contain quotes. So if I go:


              Code:
              if "`quote_of_the_day'" == "He said "That's right."" {
                   // do something
              }
              the parser is going to go ballistic. Look at the quotation mark in front of That's. How can the parser know whether that's the close of the quote that began with He, or the start of a new, embedded quote? There is no way to tell because, unlike paired braces or single quotes (` and '), the beginning and end versions of the double quote are the same character ("). Stata's solution to this dilemma is compound double quotes. You open with `" and then you close with "'. Stata "thinks" of `" as a single token, and likewise "'. The genius of it is that `" is different from "', so the dilemma doesn't arise:

              Code:
              if "`quote_of_the_day'" == `"He said "That's right.""' {
              Now the quotation mark in front of That's is clearly the start of an embedded quotation, because only "' can close what `" begins. Even better, you can nest compound double quotes within other compound double quotes:

              Code:
              if "`quote_of_the_day'" == `"He said `"That's right"'"' {
              is unambiguously parsed as `"That's right"' inside `"He said ..."'.

              Now, the use of `" "' instead of " " is, strictly speaking, only necessary if what goes in between contains quotation marks. Filenames ordinarily don't contain quotation marks. But a Stata macro that is a list of file names may come with the file names already wrapped in quotation marks--depending on how it was generated--you can't be sure ahead of time. So since there is the possibility that quotation marks will occur in the quoted text, it is better to use the compound double quotes than ordinary double quotes.
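
              To see this for yourself, one small sketch (assuming the current directory holds a few .csv files) is to display the list built by the -dir- extended function. As far as I know, it returns each filename wrapped in ordinary double quotes, which is why the -display- of the whole list needs compound double quotes:

              Code:
              local files : dir "." files "*.csv"
              // the list may itself contain quotation marks, so wrap it in `" "'
              display `"`files'"'
              foreach file of local files {
                  // show each element on its own line
                  display `"`file'"'
              }

              -foreach- strips the outer quotes from each element, so inside the loop `file' holds the bare filename, spaces and all.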





              • #8
                Thank you very much for this thorough explanation. It makes much more sense now.
                Regards,
                Laura Cojocaru



                • #9
                  Sorry to come back with another question.

                  I was wondering, if in addition to changing the names of the files, I also wanted to save it in a new directory, would that be possible?

                  In other words: export delimited using "newdirectory"`"`newfilename'"', replace
                  Thank you again.
                  Regards,
                  Laura Cojocaru



                  • #10
                    Yes, you can do that. But it's a little bit tricky. You need a separator between the directory name and the filename. In Windows, the conventional separator is a backslash, but that causes problems because \` is interpreted by Stata's parser as a literal ` character rather than as the start of a local macro reference. So use the forward slash (/) as the separator. If you are on a Mac, then the forward slash is the standard separator anyway. In any case, something like this:

                    Code:
                    export delimited using `"My Documents/This Project/New Files/`newfilename'"', replace
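
                    Put together with the loop from earlier in the thread, a sketch could look like this. The local macro outdir and the folder name "New Files" are only examples, not anything from your setup. -capture mkdir- tries to create the folder and suppresses the error if it already exists.

                    Code:
                    local outdir "New Files"                 // hypothetical output folder
                    capture mkdir `"`outdir'"'               // create it; ignore the error if it exists
                    local files : dir "." files "*.csv"
                    foreach file of local files {
                        import delimited using `"`file'"', clear
                        // do something
                        local newfilename : subinstr local file ".csv" "_new.csv"
                        // forward slash as separator, so the ` that follows still starts a macro
                        export delimited using `"`outdir'/`newfilename'"', replace
                    }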



                    • #11
                      It works great! Again, many thanks!
                      Regards,
                      Laura Cojocaru
