Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • combining multiple excel file with multiples excel sheets

    Hi,
    I'm now trying to combine multiples excel sheets from multiple excel files. I want to get one excel file with respective sheets as combine. These excel files are in the same folder. The variables for each sheet of each excel file are the same.

    My code is below and the sample excels files and combined file at the attachment.

    I'm trying the below code, the result is not the way that I want, please suggest me.


    local xlsx : dir "$outputs\fb"files"*.xlsx"
    di `xlsx'



    import excel using "$outputs\fb\hfc_outputs_hvs_sup_22_1sep2020.xlsx" , desc
    return list
    local worksheets `r(N_worksheet)'
    display `worksheets'

    *set trace on

    forvalues i = 1 / `worksheets' {
    display " Sheet number `i'"
    local shtname`i' `r(worksheet_`i')'
    display "loading sheet: `r(worksheet_`i')'"

    tempfile ts`i'
    save `ts`i'', replace emptyok

    foreach f in `xlsx' {
    *set trace on

    import excel using "$outputs/fb/`f'", sheet("`shtname`i''") firstrow clear
    destring value, replace
    *set trace on
    display "file: `f' and sheet: `shtname`i''"


    append using `ts`i''
    save `ts`i'',replace


    }
    export excel using "$outputs/fb/com/combine_hfc_feedback.xlsx", sheet("`shtname`i''") sheetreplace firstrow(var)
    }

    Attached Files

  • #2
    Perhaps you can rework this. I tested this on my own simulated version of your problem. I didn't look at your attachments. My approach was to just list the excel files and go through each sheet turning them into a .dta file, then isolating the created .dta files and writing to an excel file in a different folder.

    You'll need filelist, which is from SCC and written by Robert Picard. I find filelist very useful for tasks such as these.

    Code:
    clear all
    
    // cd YOUR DIRECTORY WITH THE EXCEL FILES !!!
    
    filelist, pattern(*xlsx*)
    levelsof filename, local(excel_files)
    foreach x in `excel_files' {
        import excel using "`x'", desc
        forvalues z = 1/`r(N_worksheet)' {
            import excel using "`x'", sheet("`r(worksheet_`z')'") firstrow clear
            local name = strtoname("`r(worksheet_`z')'")
            save "`name'.dta", replace    
        }
    }
    
    clear
    filelist, pattern(*dta*)
    levelsof filename, local(stata_files)
    
    foreach f in `stata_files' {
        use "`f'", clear
        local fname = subinstr("`f'", ".dta", "", .)
        
        * YOU SHOULD WRITE YOUR COMBINED FILE TO A DIFFERENT FOLDER
        export excel using "../Output/Combined.xlsx", firstrow(varlabels) sheet("`fname'") sheetreplace
       erase "`f'"
    
    }
    
    exit
    Last edited by Justin Niakamal; 15 Oct 2020, 17:52. Reason: font

    Comment


    • #3
      Dear Justin Niakamal, the code you posted in #2 is very helpful, however I found that it cannot specify correct names of resulting .dta files. Could you point out where the problem lies for me? Thank you very much.

      Code:
      cd C:\Users\chen\Desktop\multiple
      
      sysuse auto
      export excel using auto.xlsx
      sysuse nlsw88, clear
      export excel using nlsw88.xlsx
      
      filelist, directory(C:\Users\chen\Desktop\multiple) pattern(*xlsx*)
      levelsof filename, local(excel_files)
      foreach x in `excel_files' {
          import excel using "`x'", desc
          forvalues z = 1/`r(N_worksheet)' {
              import excel using "`x'", sheet("`r(worksheet_`z')'") clear
              local name = strtoname("`r(worksheet_`z')'")
              save "`name'.dta", replace    
          }
      }
      Code:
      . do "C:\Users\chen\AppData\Local\Temp\STD3e2c_000000.tmp"
      
      . filelist, directory(C:\Users\chen\Desktop\multiple) pattern(*xlsx*)
      Number of files found = 2
      
      . levelsof filename, local(excel_files)
      `"auto.xlsx"' `"nlsw88.xlsx"'
      
      . foreach x in `excel_files' {
        2.     import excel using "`x'", desc
        3.     forvalues z = 1/`r(N_worksheet)' {
        4.         import excel using "`x'", sheet("`r(worksheet_`z')'") clear
        5.         local name = strtoname("`r(worksheet_`z')'")
        6.         save "`name'.dta", replace    
        7.     }
        8. }
      
           Sheet | Range
        ---------+---------
          Sheet1 | A1:L74
      (12 vars, 74 obs)
      (note: file .dta not found)
      file .dta saved
      
           Sheet | Range
        ---------+---------
          Sheet1 | A1:Q2246
      (17 vars, 2,246 obs)
      file .dta saved
      
      . 
      end of do-file

      Comment


      • #4
        It seems that there are two lines of code that should be exchanged order (Adding: the following modification is still wrong):

        Code:
        foreach x in `excel_files' {
            import excel using "`x'", desc
            forvalues z = 1/`r(N_worksheet)' {
                local name = strtoname("`r(worksheet_`z')'") //put this line before the next line
                import excel using "`x'", sheet("`r(worksheet_`z')'") clear
                save "`name'.dta", replace    
            }
        }
        Last edited by Chen Samulsion; 08 Nov 2021, 07:13. Reason: Codes still have problems

        Comment


        • #5
          Hi Chen,

          Try the following

          Code:
          clear all 
          version 16.1 
          
          sysuse auto, clear 
          export excel using auto.xlsx, sheet("Sheet1") sheetreplace 
          export excel using auto.xlsx, sheet("Sheet2") sheetreplace 
          
          sysuse nlsw88, clear
          export excel using nlsw88.xlsx, sheet("Sheet1") sheetreplace
          export excel using nlsw88.xlsx, sheet("Sheet2") sheetreplace
          
          filelist, directory("/Users/justinniakamal/Desktop/TEST") pattern("*.xlsx") norec
          
          levelsof filename, local(excel_files)
          
          foreach x in `excel_files' {
              import excel using "`x'", desc
              forvalues z = 1/`r(N_worksheet)' {
                  import excel using "`x'", sheet("`r(worksheet_`z')'") clear
                  local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
                  save "`name'.dta", replace    
              }
          }
          
          filelist, directory("/Users/justinniakamal/Desktop/TEST") pattern("*.dta") norec
          
          list
          
               +-----------------------------------------------------------------------+
               | dirname                              filename                   fsize |
               |-----------------------------------------------------------------------|
            1. | /Users/justinniakamal/Desktop/TEST   auto_xlsx_Sheet1.dta      12,157 |
            2. | /Users/justinniakamal/Desktop/TEST   auto_xlsx_Sheet2.dta      12,157 |
            3. | /Users/justinniakamal/Desktop/TEST   nlsw88_xlsx_Sheet1.dta   280,955 |
            4. | /Users/justinniakamal/Desktop/TEST   nlsw88_xlsx_Sheet2.dta   280,955 |
               +-----------------------------------------------------------------------+

          Comment


          • #6
            Thank you very much Justin Niakamal. The new code still runs unsuccessfully. I use Stata 16 now, so maybe this caused the problem?

            Code:
            . clear all 
            
            . 
            . cd C:/Users/chen/Desktop/TEST
            C:\Users\chen\Desktop\TEST
            
            . 
            . sysuse auto, clear 
            (1978 Automobile Data)
            
            . export excel using auto.xlsx, sheet("Sheet1") sheetreplace 
            file auto.xlsx saved
            
            . export excel using auto.xlsx, sheet("Sheet2") sheetreplace 
            file auto.xlsx saved
            
            . 
            . sysuse nlsw88, clear
            (NLSW, 1988 extract)
            
            . export excel using nlsw88.xlsx, sheet("Sheet1") sheetreplace
            file nlsw88.xlsx saved
            
            . export excel using nlsw88.xlsx, sheet("Sheet2") sheetreplace
            file nlsw88.xlsx saved
            
            . 
            . filelist, directory("C:/Users/chen/Desktop/TEST") pattern("*.xlsx") norec
            Number of files found = 2
            
            . 
            . levelsof filename, local(excel_files)
            `"auto.xlsx"' `"nlsw88.xlsx"'
            
            . 
            . foreach x in `excel_files' {
              2.     import excel using "`x'", desc
              3.     forvalues z = 1/`r(N_worksheet)' {
              4.         import excel using "`x'", sheet("`r(worksheet_`z')'") clear
              5.         local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
              6.         save "`name'.dta", replace    
              7.     }
              8. }
            
                 Sheet | Range
              ---------+---------
                Sheet1 | A1:L74
                Sheet2 | A1:L74
            (12 vars, 74 obs)
            (note: file auto_xlsx_.dta not found)
            file auto_xlsx_.dta saved
            (12 vars, 74 obs)
            file auto_xlsx_.dta saved
            
                 Sheet | Range
              ---------+---------
                Sheet1 | A1:Q2246
                Sheet2 | A1:Q2246
            (17 vars, 2,246 obs)
            (note: file nlsw88_xlsx_.dta not found)
            file nlsw88_xlsx_.dta saved
            (17 vars, 2,246 obs)
            file nlsw88_xlsx_.dta saved
            
            . 
            . filelist, directory("C:/Users/chen/Desktop/TEST") pattern("*.dta") norec
            Number of files found = 2
            
            . list
            
                 +---------------------------------------------------------+
                 | dirname                      filename             fsize |
                 |---------------------------------------------------------|
              1. | C:/Users/chen/Desktop/TEST   auto_xlsx_.dta      12,157 |
              2. | C:/Users/chen/Desktop/TEST   nlsw88_xlsx_.dta   280,955 |
                 +---------------------------------------------------------+

            Comment


            • #7
              And here are informations that reported by set trace on:

              Code:
              - forvalues z = 1/`r(N_worksheet)' {
              = forvalues z = 1/2 {
              - import excel using "`x'", sheet("`r(worksheet_`z')'") clear
              = import excel using "auto.xlsx", sheet("Sheet1") clear
              
              .......................................................................................
              
              - local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
              = local name = "auto_xlsx_"
              - save "`name'.dta", replace
              = save "auto_xlsx_.dta", replace
              file auto_xlsx_.dta saved
              - }
              - import excel using "`x'", sheet("`r(worksheet_`z')'") clear
              = import excel using "auto.xlsx", sheet("") clear
              
              .......................................................................................
              
              - local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
              = local name = "auto_xlsx_"
              - save "`name'.dta", replace
              = save "auto_xlsx_.dta", replace
              file auto_xlsx_.dta saved
              - }
              - }
              - import excel using "`x'", desc
              = import excel using "nlsw88.xlsx", desc
              Last edited by Chen Samulsion; 08 Nov 2021, 08:57.

              Comment


              • #8
                Also, see xls2dta from SSC.

                Comment


                • #9
                  Dear daniel klein, thank you very much. I use xls2dta (basic syntax and append syntax) and it gives me desirable result! But I am still puzzled about what's wrong with above codes in 3#, 6#.

                  Comment


                  • #10
                    Chen Samulsion , after running the code of #6, have the worksheets of auto.xlsx and nlsw88.xlsx been properly named as "Sheet 1" or "Sheet 2"? If yes, I cannot imagine why the extraction of worksheet names would fail.

                    Comment


                    • #11
                      I don't know. I reported results in 6# & 7#, and I use Stata 16.
                      What I should get are 4 .dta files: auto_xlsx_Sheet1.dta, auto_xlsx_Sheet2.dta, nlsw88_xlsx_Sheet1.dta, nlsw88_xlsx_Sheet2.dta
                      What I actually get are 2 (misnamed) .dta files: auto_xlsx_.dta, nlsw88_xlsx_.dta

                      Comment


                      • #12
                        Edit: My diagnosis was completely wrong. Well, except for

                        In #6, [...]

                        Code:
                        local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
                        could be simplified to

                        Code:
                        local name = strtoname("`x'") + "_" + strtoname("`rowksheet_`z''")

                        Other than that, I have no idea what is going on. The code works for me without any modifications

                        Code:
                        . filelist, directory("E:/adodev") pattern("*.dta") norec
                        Number of files found = 4
                        
                        . list
                        
                             +----------------------------------------------+
                             | dirname     filename                   fsize |
                             |----------------------------------------------|
                          1. | [...]       auto_xlsx_Sheet1.dta      12,157 |
                          2. | [...]       auto_xlsx_Sheet2.dta      12,157 |
                          3. | [...]       nlsw88_xlsx_Sheet1.dta   280,955 |
                          4. | [...]       nlsw88_xlsx_Sheet2.dta   280,955 |
                             +----------------------------------------------+
                        Last edited by daniel klein; 08 Nov 2021, 23:00.

                        Comment


                        • #13
                          Problems of Stata version? I report some results here:

                          Stata 15:
                          Code:
                          . do "delete for anonymous reason\AppData\Local\Temp\STD20ec_000000.tmp"
                          
                          . cd delete for anonymous reason/TEST
                          delete for anonymous reason\TEST
                          
                          . 
                          . filelist, directory("delete for anonymous reason/TEST") pattern("*.xlsx") norec
                          Number of files found = 2
                          
                          . 
                          . levelsof filename, local(excel_files)
                          `"auto.xlsx"' `"nlsw88.xlsx"'
                          
                          . 
                          . foreach x in `excel_files' {
                            2.     import excel using "`x'", desc
                            3.     forvalues z = 1/`r(N_worksheet)' {
                            4.         import excel using "`x'", sheet("`r(worksheet_`z')'") clear
                            5.         local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
                            6.         save "`name'.dta", replace    
                            7.     }
                            8. }
                          
                               Sheet | Range
                            ---------+---------
                              Sheet1 | A1:L74
                              Sheet2 | A1:L74
                          (note: file auto_xlsx_Sheet1.dta not found)
                          file auto_xlsx_Sheet1.dta saved
                          (note: file auto_xlsx_Sheet2.dta not found)
                          file auto_xlsx_Sheet2.dta saved
                          
                               Sheet | Range
                            ---------+---------
                              Sheet1 | A1:Q2246
                              Sheet2 | A1:Q2246
                          (note: file nlsw88_xlsx_Sheet1.dta not found)
                          file nlsw88_xlsx_Sheet1.dta saved
                          (note: file nlsw88_xlsx_Sheet2.dta not found)
                          file nlsw88_xlsx_Sheet2.dta saved
                          
                          . 
                          . filelist, directory("delete for anonymous reason/TEST") pattern("*.dta") norec
                          Number of files found = 4
                          
                          . 
                          . list
                          
                               +-------------------------------------------------------------+
                               | dirname                    filename                   fsize |
                               |-------------------------------------------------------------|
                            1. | .../TEST                   auto_xlsx_Sheet1.dta      12,157 |
                            2. | .../TEST                   auto_xlsx_Sheet2.dta      12,157 |
                            3. | .../TEST                   nlsw88_xlsx_Sheet1.dta   280,955 |
                            4. | .../TEST                   nlsw88_xlsx_Sheet2.dta   280,955 |
                               +-------------------------------------------------------------+
                          
                          . 
                          end of do-file
                          Stata 16:
                          Code:
                          . do "delete for anonymous reason\AppData\Local\Temp\STD2b50_000000.tmp"
                          
                          . cd delete for anonymous reason/TEST
                          delete for anonymous reason\TEST
                          
                          . 
                          . filelist, directory("delete for anonymous reason/TEST") pattern("*.xlsx") norec
                          Number of files found = 2
                          
                          . 
                          . levelsof filename, local(excel_files)
                          `"auto.xlsx"' `"nlsw88.xlsx"'
                          
                          . 
                          . foreach x in `excel_files' {
                            2.     import excel using "`x'", desc
                            3.     forvalues z = 1/`r(N_worksheet)' {
                            4.         import excel using "`x'", sheet("`r(worksheet_`z')'") clear
                            5.         local name = "`=strtoname("`x'")+ "_" + strtoname("`r(worksheet_`z')'")'"
                            6.         save "`name'.dta", replace    
                            7.     }
                            8. }
                          
                               Sheet | Range
                            ---------+---------
                              Sheet1 | A1:L74
                              Sheet2 | A1:L74
                          (12 vars, 74 obs)
                          (note: file auto_xlsx_.dta not found)
                          file auto_xlsx_.dta saved
                          (12 vars, 74 obs)
                          file auto_xlsx_.dta saved
                          
                               Sheet | Range
                            ---------+---------
                              Sheet1 | A1:Q2246
                              Sheet2 | A1:Q2246
                          (17 vars, 2,246 obs)
                          (note: file nlsw88_xlsx_.dta not found)
                          file nlsw88_xlsx_.dta saved
                          (17 vars, 2,246 obs)
                          file nlsw88_xlsx_.dta saved
                          
                          . 
                          . filelist, directory("delete for anonymous reason/TEST") pattern("*.dta") norec
                          Number of files found = 2
                          
                          . 
                          . list
                          
                               +-------------------------------------------------------+
                               | dirname                    filename             fsize |
                               |-------------------------------------------------------|
                            1. | .../TEST                   auto_xlsx_.dta      12,157 |
                            2. | .../TEST                   nlsw88_xlsx_.dta   280,955 |
                               +-------------------------------------------------------+
                          
                          . 
                          end of do-file
                          
                          
                          .

                          Comment


                          • #14
                            Make sure your Stata 16 is up to date.

                            Code:
                            . update query
                            (contacting http://www.stata.com)
                            
                            Update status
                                Last check for updates:  09 Nov 2021
                                New update available:    none         (as of 09 Nov 2021)
                                Current update level:    02 Nov 2021  (what's new)
                            
                            Possible actions
                            
                                Do nothing; all files are up to date.

                            From

                            Code:
                            help whatsnwe16
                            -------- update 30sep2019 --------
                            [...]
                            7. From the release of Stata 16, import excel filename unintentionally returned two undocumented results, r(N) and r(k). Thus, all r() results returned by import excel filename, describe were removed. Now import excel filename returns nothing, preserving the r() results from import excel filename, describe.

                            Comment


                            • #15
                              #14. I know the reasons, thank you very much daniel!

                              Comment

                              Working...
                              X