Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • xlCreateBook() error when trying to open 97-2003 xls files

    Hello Statalisters,

    I have received around 1700 97-2003 excel files with a ton of data including some text information. I am trying to read them all into Stata before I clean and run some statistical analysis. But, Stata is unable to open them. The files have a .xls format with large amounts of text in them which I want to keep and convert into .dta files.

    On closer look, every time I manually open a file in Excel I get the error "Format does not match extension", I have to click on "Open anyway" and manually save it before the file can be imported into Stata. I can't possibly do this for 1700 files. I am using a windows OS and all files are saved locally on my drive.
    I additionally tried to write a bulk line of code on stata to save all files as csv or at least an xls that can be imported into stata but I have had zero luck. Every time I try to save them, I get a corrupted .csv or a .xls with the exact same error that pops up as " try xlCreateBook()" on Stata and the "format does not match extension" error when I manually try to open the file.

    For reference, under is the stata code I wrote.
    Code:
      cd "$dropbox\project8\data\2008-2009\A.Monthwise\dakota" local oldfiles : dir "." files "*.xls" , respectcase mac list _oldfiles  foreach fn of local oldfiles {   local oldfn `"`fn'"'   local res = ustrregexm(`"`oldfn'"', "_(.*)\.xls", .)   if `res'==1 {     local newfn = "new_" + ustrregexs(1) + ".dta"     di `"Renaming `oldfn' to `newfn'"'     copy `"./`oldfn'"' `"./new/`newfn'"', public        } }
    Is there a way I can fix this extension/format issue for 1700 files and convert everything in bulk, such that Stata can read it?

  • #2
    The Stata code from post #1, with line breaks added back in.
    Code:
    cd "$dropbox\project8\data\2008-2009\A.Monthwise\dakota" 
    local oldfiles : dir "." files "*.xls" , respectcase 
    mac list _oldfiles  
    foreach fn of local oldfiles {
       local oldfn `"`fn'"'   
       local res = ustrregexm(`"`oldfn'"', "_(.*)\.xls", .)   
       if `res'==1 {     
           local newfn = "new_" + ustrregexs(1) + ".dta"     
           di `"Renaming `oldfn' to `newfn'"'     
           copy `"./`oldfn'"' `"./new/`newfn'"', public        
       } 
    }
    This suggests that you are essentially renaming files that end in xls to instead end in .dat. That does nothing to change the contents of the files from Excel xls format to Stata dta format.

    It seems to me that what you are asking is how to import into Stata an Excel file that Excel has described with an error message.

    I think the first step is to find out what the Excel error message is telling you, and what Excel does to overcome the problem.

    One possibility - I just tested this - is that in fact your Excel files are actually in CSV format, but somehow were given an XLS suffix. They symptoms are exactly what you describe when opening your files in Excel. Try opening one of them in Notepad or some other simple text editor and see if this is the case. If it is, you can probably import your data using the import delimited command.

    Comment

    Working...
    X