Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Syntax - looping to import multiple sheets of an .xlsx file ? (Stata 13)

    Hi all,

    I am really quite new to Stata, so thanks in advance for the help. I am trying to loop through an excel file that has 8 sheets, so I can modify it. Most stuff within the loop works fine, but the loop itself is not working. Here's the code for it below. I run this and the error says "unrecognized command: invalid command name ". (I am eventually going to try to nest this loop within another that goes through all files in a directory)

    Code:
    local sheets = `" "ag1" "ag2" "ag3" "ag4" "ag5" "ag6" "ag7" "ag8” "'
    foreach x of local sheets{
      import excel using /Users/andrea/Desktop/testapple.xlsx, sheet("`x'") firstrow
    
      //rename (C D E F G) y#, addnumber
      //rename y# ag("`x'")#, renumber(1995) sort
      //reshape long ag("`x'"), i(Cd) j(year)
      //gen crop = “apple” 
    
      save testapple_`x’.dta
    }

    I have also tried the below code snippet, which I took from this posting: https://www.statalist.org/forums/for...heets-to-stata.
    The error message is "option sheet() incorrectly specified" which is odd because I have it the same way as on the linked code.

    Code:
    foreach x in `" "ag1" "ag2" "ag3" "ag4" "ag5" "ag6" "ag7" "ag8" "'  {
      import excel using /Users/andrea/Desktop/testapple.xlsx, sheet("`x'") firstrow
      save file_`x’.dta
    }
    I was looking into the SSC xls2dta capabilities as well, since "allsheets" command will let me import and save all sheets of an excel file as .dta as once.. but i am under the impression that you are unable to modify the excel file before sending it off as .dta. Perhaps someone experienced could confirm or deny this.

    Thanks again for the help.

  • #2
    I can't explain your first problem, nor can I replicate it on my own setup when I replace the spreadsheet names and sheet names by files that exist in my directory. It works just fine for me.

    But your second problem is easily explained. When you write
    Code:
    foreach x in `" "ag1" "ag2" "ag3" "ag4" "ag5" "ag6" "ag7" "ag8" "' {
    `" "ag1" "ag2" "ag3" "ag4" "ag5" "ag6" "ag7" "ag8" "' is not recognized as 8 separate tokens to be iterated over. Instead it is treated as a single token, so your loop actually only executes once, and in that one run through, it tries to find a sheet called ""ag1" "ag2" "ag3" "ag4" "ag5" "ag6" "ag7" "ag8" "', which, needless to say, does not exist and is not a valid specification of the sheet() option..

    So, you can remove the outermost `" and "' from that construction, and then it will loop over the 8 worksheets, or you can go back to the -foreach x of local sheets- version, where sheets is the local macro defined in the first part of your post.

    Added: Let me add that, in general, when working with local macros in Stata, use quotation marks only when they are absolutely needed, which is to assure that a group of characters with embedded blanks is treated as a single token. Similarly avoid wrapping the entire macro in quotes (or extended quotes `"..."'). In some contexts it does no harm, but often the use of unnecessary quotes will result in what you intend to be a group of separate tokens being treated as a single token. The safer practice is to just avoid quotes in local macros except where they are absolutely necessary. Also, it is not necessary to use the = operator to define a local macro unless you want to evaluate the expression that follows and store the evaluated result in the local macro. When the macro you are defining is just a list of string tokens, that is unnecessary, and it can lead to confusion if you do it. Save that for things like -local whatever = 2+2- when you want the contents of whatever to be "4" and not "2+2".

    So,for example, the safer way to define local sheets would be just:

    Code:
    local sheets ag1 ag2 ag3 ag4 ag5 ag6 ag7 ag8
    Last edited by Clyde Schechter; 22 Jun 2017, 21:33.

    Comment


    • #3
      Hi, thanks for the quick and thorough reply! The stuff about macros is also extremely helpful to know.

      I implemented the new formatting and it seems to be working (i.e.- stata is recognizing that it is a loop) but now have another baffling error which is "
      worksheet ag1 not found" even though that is definitely an existing sheet in the file, and the worksheets are all in order too. I googled and tried a couple of things (resaving the .xslx file, restarting excel) but to no avail. Have you come across this issue?

      The new code is
      Code:
      foreach x in ag1 ag2 ag3 ag4 ag5 ag6 ag7 ag8 {
        import excel using /Users/andrea/Desktop/testapple.xlsx, sheet(" `x' ") firstrow
        save file_`x’.dta
      }

      Comment


      • #4
        Nevermind that last question, turns out I just put extra spaces.

        Comment


        • #5
          Originally posted by Andrea Wang View Post
          I was looking into the SSC xls2dta capabilities as well, since "allsheets" command will let me import and save all sheets of an excel file as .dta as once.. but i am under the impression that you are unable to modify the excel file before sending it off as .dta. Perhaps someone experienced could confirm or deny this.
          This is true. You cannot modify the sheets "on the fly". You can, however, execute commands on previously converted sheets. This could look like

          Code:
          xls2dta , allsheets : import excel using /Users/andrea/Desktop/testapple.xlsx, firstrow
          xls2dta : xeq ​rename (C D E F G) y#, addnumber ; rename y# ag("`x'")#, renumber(1995) sort ; reshape long ag("`x'"), i(Cd) j(year) ; gen crop = “apple”
          You can also put your commands into a do-file, say mydofile.do, and then

          Code:
          xls2dta , allsheets : import excel using /Users/andrea/Desktop/testapple.xlsx, firstrow
          xls2dta : do mydofile
          Best
          Daniel

          Comment


          • #6
            Hi Daniel,
            Ive been trying to use xls2dta but when trying to add firstrow as an option it doesnt work. It will work without it but not when I add firstrow. Any suggestions for this?

            This works to create multiple datasets


            xls2dta, allsheets : import excel using "my_data.xls"

            but when adding firstrow it does not

            xls2dta, allsheets : import excel using "my_data.xls", firstrow

            and I get r(198) as an error.

            any help on this? thanks a lot

            -Benjamin

            Comment


            • #7
              Benjamin

              I cannot replicate your problem.

              Code:
              // create Excel file with two sheets for demonstration
              sysuse auto , clear
              export excel price mpg using "my_data.xls" , sheet("Sheet_A") firstrow(variables)
              export excel weight turn using "my_data.xls" , sheet("Sheet_B") firstrow(variables)
              
              // convert all sheets in the Excel file, using the firstrow option
              xls2dta, allsheets : import excel using "my_data.xls", firstrow
              runs fine for me (Win 7; Stata 12.1 IC) and returns

              Code:
              . // create Excel file with two sheets for demonstration
              . sysuse auto , clear
              (1978 Automobile Data)
              
              . export excel price mpg using "my_data.xls" , sheet("Sheet_A") firstrow(variables)
              file my_data.xls saved
              
              . export excel weight turn using "my_data.xls" , sheet("Sheet_B") firstrow(variables)
              file my_data.xls saved
              
              . 
              . // convert all sheets in the Excel file, using the firstrow option
              . xls2dta, allsheets : import excel using "my_data.xls", firstrow
              file e:\adodev\my_data_1.dta saved
              file e:\adodev\my_data_2.dta saved
              Could you copy and paste the syntax (and output) that you get using CODE delimters (as I have done). Also, what do you get when you type

              Code:
              which xls2dta
              which xls2dta_work.class
              Best
              Daniel

              Comment


              • #8
                Hey Daniel,

                Thanks for quick response.

                So when I run the first one, it works fine and produces 11 separate dta files from the sheets.
                Code:
                xls2dta, allsheets : import excel using "my_data.xls", clear
                If i add "firstrow" and change nothing else first it wont work because there are already .dta files with the same name.
                Code:
                xls2dta, allsheets : import excel using "my_data.xls", firstrow clear
                Code:
                file C:\.......\mydata.dta already exists
                So then I delete the dta files originally created and rerun it again with firstrow and get this
                Code:
                . xls2dta, allsheets : import excel using "my_data.xls", firstrow clear
                (note: option clear ignored)
                r(198);
                So this error pops up and then no dta files are produced

                Hopefully this is enough information, seems odd it wouldnt be working.

                Cheers,
                Ben

                Comment


                • #9
                  I am sorry, I cannot reproduce this. The following still works fine for me

                  Code:
                  // version of Stata running (no personal information required)
                  version
                  
                  // version of xls2dta
                  which xls2dta
                  which xls2dta_work.class
                  
                  // create Excel file with two sheets for demonstration
                  sysuse auto , clear
                  export excel price mpg using "my_data.xls" , sheet("Sheet_A") firstrow(variables)
                  export excel weight turn using "my_data.xls" , sheet("Sheet_B") firstrow(variables)
                  
                  // convert all sheets in the Excel file, using the firstrow option
                  xls2dta, allsheets : import excel using "my_data.xls", firstrow clear
                  Please copy this complete code and run it on your computer. Caution: The code creates my_data.xls, my_data_1.dta, and my_data_2.dta in the current working directory, so be sure not to overwrite any important files on your side.

                  For me, the output is

                  Code:
                  . // version of Stata running (no personal information required)
                  . version
                  version 15.0
                  
                  .
                  . // version of xls2dta
                  . which xls2dta
                  (path omitted)
                  *! version 2.4.0 01dec2016 daniel klein
                  
                  . which xls2dta_work.class
                  (path omitted)
                  *! version 1.3.0 01dec2016 daniel klein
                  
                  .
                  . // create Excel file with two sheets for demonstration
                  . sysuse auto , clear
                  (1978 Automobile Data)
                  
                  . export excel price mpg using "my_data.xls" , sheet("Sheet_A") firstrow(variables)
                  file my_data.xls saved
                  
                  . export excel weight turn using "my_data.xls" , sheet("Sheet_B") firstrow(variables)
                  file my_data.xls saved
                  
                  .
                  . // convert all sheets in the Excel file, using the firstrow option
                  . xls2dta, allsheets : import excel using "my_data.xls", firstrow clear
                  (note: option clear ignored)
                  file e:\adodev\my_data_1.dta saved
                  file e:\adodev\my_data_2.dta saved
                  
                  .
                  end of do-file
                  If you get something different, please show the output that you get. If you get the same results, then there might be something wrong with the Excel files or import excel. You could then add option verbose to xls2dta or set trace on to see where the error arises.


                  Added in edit:

                  It might also help (a lot) if you copy and paste what you have typed, exactly, and what Stata did in response, exactly. Note that you use the (probably fake name) my_data.xls in the xls2dta command but then show that the (probably also fake) error message refers to mydata.dta (underscore is gone). Also, it seems very odd that you (say you) receive return code r(198) but no error message at all. Details are likely to be important here.

                  Best
                  Daniel
                  Last edited by daniel klein; 02 Jan 2020, 03:45.

                  Comment


                  • #10
                    Okay thanks Daniel, yes I changed the names of the files for confidentiality reasons. However there is no error message that appears only the r(198). I will try again, with your code. I am using Stata 12.0 perhaps that could affect it?
                    Thanks again,
                    Ben

                    Comment


                    • #11
                      Please note this section in the FAQ Advice you are asked to read before posting.

                      https://www.statalist.org/forums/help#version

                      11. What should I say about the version of Stata I use?

                      The current version of Stata is 16. Please specify if you are using an earlier version; otherwise, the answer to your question may refer to commands or features unavailable to you. Moreover, as bug fixes and new features are issued frequently by StataCorp, make sure that you update your Stata before posting a query, as your problem may already have been solved.
                      xls2dta requires Stata 12.1, so should not even run at all with 12.0.

                      Comment


                      • #12
                        Adding to Nick's important observation, Ben should update his version 12.0 to version 12.1 It is free of charge to do so. If he is working on an administered computer and cannot do this himself, he should ask his system administrator to do that for him.

                        Comment


                        • #13
                          I should clarify that, technically, xls2dta works with Stata 12. I had originally written and tested the command using Stata 12.1 and stated that version in the help-files. When I later re-installed Stata 12 on a new machine, I have added support for version 12 (with only minimal testing, though).

                          The reported problem is unlikely to be caused by running under Stata 12.0 and the code that I have given runs just fine in 12.1.

                          Nevertheless, Ben should update to 12.1.

                          Best
                          Daniel

                          Comment

                          Working...
                          X