Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • foreach loop problems with import excel sheet name

    Hello. When I run the first bit of code below (Stata/MP 15.1), the data imports properly:
    Code:
    import excel "C:\Users\Doug Hess\Downloads\dhs_nvra1.xlsx", sheet("January 2018") firstrow
    However, when I run the code below I get the statement "January not found r(111)."
    Code:
    foreach month in January February March April May June July August September October November December {
            clear
            import excel "C:\Users\Doug Hess\Downloads\dhs_nvra.xlsx", sheet("`month' 2018") firstrow
            gen year=2018
            gen month=`month'
    save "D:\Data\State Data\Iowa\Iowa DHS `month' 2018.dta", replace
            }
    Moreover, if I drop January in the list, I get a different statement "worksheet February 2018 not found r(601)."

    Why do the error statements differ and, in any case, what can I do to fix this loop? Thanks for any suggestions. I fear I'm missing something obvious.

  • #2
    I couldn't reproduce your problem using Stata 16.1 or newer (I no longer have access to Stata 15). In any case, I would not expect the code as written to fail, so it makes me think there are subtleties about the names of the sheets themselves (perhaps Unicode characters) that are causing the problem.

    You may consider this example to examine the characters across all sheet names in your Excel file. Note the use of -chartab- (SSC, by Robert Picard).

    Code:
    mata:
      B = xl()
      B.load_book("test.xlsx")
      B.get_sheets()
      sheetnames = B.get_sheets()
      B.close_book()
    end
    
    getmata sheetnames, replace
    list
    
    chartab sheetnames

    Comment


    • #3
      Where does the error occur at? When I try
      Code:
      
      loc placeholder a 
      
      
      
      import excel "E:\Class\Prog Eval\Presentation\New Microsoft Excel Worksheet.xlsx", sheet("`placeholder' test sheet") clear
      
      
      br
      the data at least imports... To get around issues of names, perhaps this code would help if the data did not import correctly?

      Comment


      • #4
        I agree with the previous two responses that the cause of your problem is obscure. It looks like it should work.

        That said, I suggest you try a different approach. Let Stata itself determine the names of the worksheets in the Excel file:

        Code:
        import excel using
        "C:\Users\Doug Hess\Downloads\dhs_nvra.xlsx", describe
        local n_sheets = r(N_worksheet)
        
        forvalues i = 1/`n_sheets' {
            local sheet`i' `r(worksheet_`i')
        }
        
        forvalues i = 1/`n_sheets' {
            import excel using "C:\Users\Doug Hess\Downloads\dhs_nvra.xlsx", clear sheet(`"`sheet`i''"') firstrow
            gen year = 2018
            gen month = subinstr(`"`sheet`i''"', " 2018", "", .)
            save `"D:\Data\State Data\Iowa\Iowa DHS `sheet`i''.dta"', replace 
        }
        The drawback to this approach is that if the Excel worksheet names are, in fact, contaminated with non-printing characters or padded with blanks that are not visually apparent, the resulting Stata data set names will have the same problem.

        That said, the error messages you are getting from the two tries you report in #1 are not the same. The first one, in fact, has nothing to do with the -import excel- command. -gen month = `month'- is a syntax error because macro expansion changes this to -gen month = January- on the first iteration. And that is a syntax error because you have no variable named January in the data set. To make Stata understand that you want to create a new "variable" (actually it's a constant) named month containing the string "January" you have to write -gen month = "`month'"-. The "" wrapping of `month' is obligatory.

        The second error, however, is explicitly about not finding that worksheet, so something relating to the -import excel- process is going wrong.

        This suggests to me that whatever the obscure problem (probably something relating to the names of the worksheets in the Excel file) with the -import-ing, it does not affect the January worksheet--because you would have gotten the "worksheet ... not found r(601)" message and never reached the -gen month = January- command that threw the "January not found r(111)" message.

        Comment


        • #5
          Thanks for all the help. There were stray spaces in some tab names and the month of February was spelled wrong! Gremlins in state government.

          Comment

          Working...
          X