Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Import data from google spreadsheet by Stata

    Hi ,
    I'm now trying to import data from google spreadsheets. How can I proceed?

  • #2
    Hi Thein,
    some time ago I started looking into how to import data from google spreadsheets. If the file is public, i.e. you have a shareable link, then you can use the copy command. Back then I coded a simple program for this:
    Code:
    capture program drop gsheet
    program define gsheet
        syntax anything , key(string) id(string)
        
        local url "https://docs.google.com/spreadsheets/d/`key'/export?gid=`id'&format=csv"
        
        copy "`url'" `anything', replace
        
        noi disp `"saved in `anything'"'
        
        import delim using `anything', clear
    end
    
    **** key comes from url of shareable link
    ** Example:
    ** https://docs.google.com/spreadsheets/d/18l-T2xk03s8FEgNdbEP9gONlU6CvUSl6SlKJJpGfFKU/edit?usp=sharing
    **** id is the spreadsheet id and can be at the end when in the edit mode in google sheets
    
    gsheet "`c(pwd)'\\test.csv" , key("18l-T2xk03s8FEgNdbEP9gONlU6CvUSl6SlKJJpGfFKU") id("0")
    gsheet "`c(pwd)'\\ipejd219\Google Drive\Research\StataCode\test2.csv" , key("18l-T2xk03s8FEgNdbEP9gONlU6CvUSl6SlKJJpGfFKU") id("1504070058")
    The example above is simple and works, but is a bit fiddly. It requires knowledge of the spreadsheet and it needs to be shareable. In theory it is possible to access google sheets which are private, however this would require more coding. You would need to authenticate yourself with the Google Sheets API and then access the sheets folder. This is done via Java. As far as I understood it, it is not too difficult and do-able.

    In general, if you are a Java coder, there would be lots of potential writing a more sophisticated routine. For example obtaining sheets by their name, changing contents, adding sheets or entire documents. I always wanted to have a look into it, but never found the time...

    Comment


    • #3
      Just found the link to the Google Spreadsheet API if someone is interest: https://developers.google.com/sheets...uides/concepts

      Comment


      • #4
        Code:
        tempfile sheetsFile
        copy "https://docs.google.com/spreadsheets/d/<spreadsheetKey>/export?format=xlsx" `sheetsFile', replace
        import excel `sheetsFile', clear firstrow
        For those interested, the code above downloads the spreadsheet as an xlsx file and imports it into Stata. Replace spreadsheetkey as per Jan Ditzen's post.

        Comment


        • #5
          Jesse Wursten JanDitzen

          Not sure if this warrants its own thread, but it’s closely related...

          This might be a bit of a long shot, but I’m curious if you know of a way to get a list of the files in a Google Drive folder using Stata if you have a sharable link to the folder? I’ve done this using Python, but wonder if getting a list of files from a Google Drive folder using Stata is within the realm of possibilities. I’ve tried a couple of things, but with no luck.

          Comment


          • #6
            I think you'd need to leverage the Google Drive API to do that. AFAIK Stata has no native way of accessing APIs, although you can call curl or powershell from Stata through the shell (see help shell). If you have working python code you could try to run that directly from Stata (see help python) and then send the output to Stata through the Python-Stata Interface (see https://www.stata.com/python/api16/).

            Comment


            • #7
              That's what I suspected. Thanks, Jesse.

              Comment


              • #8
                Can you mount the Google Drive folder locally?

                Comment


                • #9
                  I'm trying this approach and getting an error. This code comes from Jesse Wursten above:
                  HTML Code:
                  tempfile sheetsFile
                  copy "https://docs.google.com/spreadsheets/d/<spreadsheetkey>/edit?usp=sharing/export?format=xlsx" `sheetsFile', replace
                  import excel `sheetsFile', clear firstrow
                  where <spreadsheetkey> is the address of my spreadsheet.
                  The first two lines run without an error message, but the last line returns the following:
                  file C:/<path to my temp directory>/Temp/ST_d28_000001.tmp could not be loaded
                  The issue seems to be that the second line doesn't return a valid xlsx file. I can't open the result in Excel, either.
                  Last edited by paulvonhippel; 27 Jan 2021, 11:58.

                  Comment


                  • #10
                    Is the file you try to access shared for everyone with the link?

                    Comment


                    • #11
                      Found a way to work around the above:

                      Step1: Publish your spreadsheet as Excel to access all tabs
                      Go on your spreadsheet then click File >> Publish to the web >> Link >> Entire Document | Microsoft Excel(.xlsx) >> Publish
                      Step2: Create an Excel copy of the spreadsheet in memory.
                      Code:
                      tempfile gsheet
                      copy "published spreadsheet link from above" `gsheet', replace

                      Step3: Call the Excel copy created in memory.
                      Code:
                      import excel using `gsheet', first sheet("tabname") clear
                      Last edited by Narcisse Mutabaruka; 30 Sep 2021, 12:03.

                      Comment


                      • #12
                        I tested all the above commends but none of them work for importing google spreadsheet into STATA. Please guide how to import google spreadsheet into STATA.

                        Comment

                        Working...
                        X