Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problem when trying to import excel file into Stata

    Hi All

    I get the following error message when trying to import an excel datasheet (.xlsx) into Stata 13:

    ''Unable to load excel file. The central directory was not found in the archive (or you were trying to open not the last volume of a segmented archive.)''

    I've never encountered the above error message before. I tried saving the excel file with different extensions as well, which did not help. Does anyone have any idea as to what could be going wrong?

    Thanks

    /Amal

  • #2
    Are you able to share the file and/or how it was created?

    Comment


    • #3
      Yes - no problems in sharing the file. The excel file was created by our database manager. It's password protected for security purposes. I disabled the password protection, thinking that perhaps this might be causing the problem but it hasn't helped.

      I know others have had the same problem and posted the same on Statalist but no one has been able to provide any real solutions.

      Thanks

      /Amal

      Comment


      • #4
        Do you know how the DB manager created the file? And/or are they able to provide you with the data in a plain text format (e.g., csv, tsv, etc...)? It sounds like it is a malformed file issue where some of the subdirectories of the .xlsx file aren't in the normal positions. Do you know if there are any other permissions set on the file (e.g., hiding cells, restrictions on cells/cell ranges, etc...)?

        Comment


        • #5
          If the immediate objective is to get the data into Stata, rather than understand the particular problem, then perhaps you could open the offending workbook, select and copy the entire range of data, open a new workbook, and Paste Special... Values and Number Formats into the worksheet in the new workbook, then save the new workbook and attempt to read it into Stata instead of the original.

          That's the most extreme measure I can think of. A less drastic measure is to open the offending workbook, right-click on the worksheet tab, choose Move or Copy..., select "(new book)" as the destination, and select "create a copy". If you can read that copy into Stata, it suggests the problem lies in the construction of the original workbook, rather than the spreadsheet itself.

          One thought that comes to mind, in addition to the good suggestions from wbuchanan: Did the DB manager create this as a worksheet with a link to the database to allow dynamic updates to refresh the data? If so, you'd probably see "Refresh Data" not grayed out on the Data menu. Perhaps something like that befuddles Stata. (I don't put much faith in that explanation, I'd be interested in what wbuchanan thinks.)

          Comment


          • #6
            William Lisowski I hadn't even considered that as a possibility although it seems to make sense. I've not had any experience with using those types of files, but imagine that could make things a bit more difficult to hunt down. It may not show anything, but when things get to be a bit of a pain I always default to trying to find things with:

            Code:
            set tracedepth 2
            set trace on
            I don't usually have to go into a depth much further than that to get what I need, but maybe it will help to show where things are breaking.

            Comment


            • #7
              This kind of issue is usually caused by the program created the file not following the OpenXML standard for the Excel file. Micorsoft Excel program can be pretty tolerant, more than the XML parser we are using.

              William Lisowski provided a good work around and it is what I would use to deal with this sort of problems. Since the error is rooted in the internal XML parsing code, -set trace on- will not help.

              If
              Amal Khanolkar can post the Excel file, we are interested to see if we can fix the problem in Stata 14.

              Comment


              • #8
                Hi All

                Thanks for your many and useful suggestions. I finally managed to import the data into excel. However, I wasn't able to solve the original problem - I asked for the data to be re-sent as .csv file without password protection. The original excel file was password protected, so wondering if this was the issue. However, the original excel file didn't open in Stata even after I disabled the password protection.

                I can not post the data file here due to security and data handling issues at my institute.

                Thanks!

                /Amal

                Comment


                • #9
                  Just wanted to verify that I had the same error message as described above when using Stata SE 13.1 and Excel V16.16.1 on a Mac running High Sierra. My dataset was not password protected and was an xlsx file. Copying and using Paste Special for values only (what I tried before I saw this thread) did not work. Saving as an Excel file to the Desktop did not work (the file had been in Dropbox). I copied to the Desktop and saved as CSV; this was able to be imported. I then tried the work-around listed above:

                  If the immediate objective is to get the data into Stata, rather than understand the particular problem, then perhaps you could open the offending workbook, select and copy the entire range of data, open a new workbook, and Paste Special... Values and Number Formats into the worksheet in the new workbook, then save the new workbook and attempt to read it into Stata instead of the original.

                  Which also worked, and I was able to then put the new file back into the original directory and import it from there without problems.

                  Hopefully this will be fixed at some point.

                  Edit: Thanks to those who figured it out!!

                  Michelle

                  Comment

                  Working...
                  X