  • How to combine multiple .csv files contained in one folder?

    Hi,

    I would like to combine multiple .csv files contained in one folder. The file names change, and the number of files will increase over time (one file per day of trading data, with more trading days added in the future). I just need to stack them, since the variables in each file are the same. Each file is relatively small (less than 1 MB). Can someone help? Thanks.

    [Attached image: csv_files.JPG]

  • #2
    Gotta append them. Import them, save em as tempfiles and append them.


    • #3
      Code:
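      clear
      * List the matching .csv files in the current folder (adjust the pattern to your file names)
      local filenames: dir "." files "20220103-EOD_xcme*.csv"
      
      * Start from an empty tempfile that accumulates the stacked data
      tempfile building
      save `building', emptyok
      
      foreach f of local filenames {
          import delimited using `"`f'"', clear
          * Record which file each observation came from
          gen file = `"`f'"'
          append using `building'
          save `"`building'"', replace
      }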
      At the end of this, the contents of tempfile `building' will be in memory, and will consist of the contents of all those .csv files "stacked." You can then save the data to a permanent file if you wish.
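
      As a minimal sketch of that last step, assuming the loop above has just finished and the stacked data are in memory (the file name all_trading_days.dta is only a placeholder, not something from the original post):

      ```stata
      * Inspect the combined data before saving
      describe
      * Shrink storage types where this loses no information
      compress
      * Hypothetical output name; choose your own
      save "all_trading_days.dta", replace
      ```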

      Note: This code will work just fine if your assumption that all the files contain the same variables is true and the variables are of compatible data types in all the .csv files. This is an optimistic assumption and, in my experience, it is rarely true in large sets of files. It is likely that at some point there will be some variable that is numeric in most of the files but, for some odd reason, is a string in one or a few of them.

      For my part, I think it better not to try to put all the files together at once, but rather to save them as separate Stata data sets first and then use the -precombine- command (by Mark Chatfield, available from Stata Journal) to see whether they really are all compatible. If they are, then you can use the code above, except that in the -local filenames:...- command you replace .csv with .dta, and in the loop you replace -import delimited using- with -use-. If they're not, then you can work on cleaning up the discrepancies that -precombine- tells you about, and then combine them.
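
      The first half of that approach, saving each .csv as its own Stata data set, might look roughly like the sketch below. It assumes the files sit in the current working directory, that the pattern "*.csv" picks up only the files you want, and that each name ends in a lowercase ".csv"; none of these details come from the original question. The resulting .dta files can then be examined with -precombine-; see its help file for the exact syntax.

      ```stata
      * Assumed pattern: every .csv file in the current folder
      local csvfiles: dir "." files "*.csv"

      foreach f of local csvfiles {
          import delimited using `"`f'"', clear
          * Build the output name by swapping the .csv extension for .dta
          local dta = subinstr(`"`f'"', ".csv", ".dta", 1)
          save `"`dta'"', replace
      }
      ```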

      Note: if you don't take the advice in the preceding paragraph and at some point -append- gives you an error message about something incompatible, do not deceive yourself into "solving" the problem by using -append-'s -force- option. That does not fix the problem. It simply suppresses the error message and throws away the offending data. So you end up with an incomplete, and incorrect, data set in the end. Don't do that.
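
      What to do instead of -force- depends on the discrepancy, but for the common case of a variable that arrived as a string in one file, a rough sketch is shown here. The file name problem_file.dta and the variable name settle_price are hypothetical, chosen only for illustration.

      ```stata
      * Hypothetical file in which settle_price was imported as a string
      use "problem_file.dta", clear
      describe settle_price
      * Show the values that cannot be read as numbers (blank strings included)
      list settle_price if missing(real(settle_price))
      * After those values have been understood and corrected, convert the variable
      destring settle_price, replace
      save "problem_file.dta", replace
      ```

      If non-numeric characters remain, -destring- leaves the variable unchanged and says so, which is a safer failure than silently losing data.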

      Added: Crossed with #2.