Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Downloading data from Public API from json format to csv file or txt file so that I can read data in stata on macOS

    I have to download the following data from Business Dynamics Statistics ( which is public data provided by census bureau ) from the following API which returns the data in json format.


    Does anyone have any idea how I can download the data in stata readable format from the following link ?

    Code:
    https://api.census.gov/data/timeseries/bds?get=NAME,YEAR,FIRM,ESTAB,EAGE,EMPSZES,EMPSZFI,ESTABS_ENTRY,ESTABS_EXIT,FAGE,FIRMDEATH_EMP,FIRMDEATH_ESTABS,FIRMDEATH_FIRMS,JOB_CREATION,JOB_CREATION_BIRTHS,JOB_CREATION_RATE,JOB_DESTRUCTION,JOB_DESTRUCTION_DEATHS,NET_JOB_CREATION,REALLOCATION_RATE,NET_JOB_CREATION_RATE,EMP&for=county:*&in=state:*&time=from+2000+to+2008&for=NAICS:*

  • #2
    As far as I can tell from the documentation, there is not a way to modify the link to download the data as a .csv (or anything but json really). My typical workflow with the census API is to download the data with a python script and convert to a .csv from there with the json and csv packages - although it looks like this might be even easier with a few lines of pandas.

    In Stata, you might try the user contributed -insheetjson- command from ssc.

    Comment


    • #3
      can you kindly tell me what python command I can use so that I can download in csv format. Honestly, I'd be able to use python on my computer too if I get the command. Unfortunately, I am getting lost finding the command appropriate for the above API link. If you kindly guide me a little from your experience

      Comment


      • #4
        Again, I think the -insheetjson- command will work. Typically, I prefer to only post Stata native solutions here, but since you asked directly:

        Code:
        import requests
        import json
        import csv
        
        URL = "https://api.census.gov/data/timeseries/bds?get=NAME,YEAR,FIRM,ESTAB,EAGE,EMPSZES,EMPSZFI,ESTABS_ENTRY,ESTABS_EXIT,FAGE,FIRMDEATH_EMP,FIRMDEATH_ESTABS,FIRMDEATH_FIRMS,JOB_CREATION,JOB_CREATION_BIRTHS,JOB_CREATION_RATE,JOB_DESTRUCTION,JOB_DESTRUCTION_DEATHS,NET_JOB_CREATION,REALLOCATION_RATE,NET_JOB_CREATION_RATE,EMP&for=county:*&in=state:*&time=from+2000+to+2008&for=NAICS:*"
        
        response = requests.get(URL)
        json_string = response.content
        json_data = json.loads(json_string)
        with open("result.csv", "w") as f:
            writer = csv.writer(f)
            writer.writerows(json_data)
        Edit: managed to break the URL while editing this post. This should now have the correct URL.
        Last edited by Daniel Schaefer; 28 Dec 2022, 21:53.

        Comment


        • #5
          I afraid I can't get insheet json to work after all. Trying to load from the URL:

          Code:
          insheetjson using "https://api.census.gov/data/timeseries/bds?get=NAME,YEAR,FIRM,ESTAB,EAGE,EMPSZES,EMPSZFI,ESTABS_ENTRY,ESTABS_EXIT,FAGE,FIRMDEATH_EMP,FIRMDEATH_ESTABS,FIRMDEATH_FIRMS,JOB_CREATION,JOB_CREATION_BIRTHS,JOB_CREATION_RATE,JOB_DESTRUCTION,JOB_DESTRUCTION_DEATHS,NET_JOB_CREATION,REALLOCATION_RATE,NET_JOB_CREATION_RATE,EMP&for=county:*&in=state:*&time=from+2000+to+2008&for=NAICS:*"
          gives this error:

          Error -691, Unable to open URL: I/O error

          and when I try to load from the file system:

          Code:
          insheetjson using "raw_json.json"
          I get this error:

          Unable to open URL: I/O request refused by operating system

          EDIT: The little python script in #4 works though. I've got the data on my file system.
          Last edited by Daniel Schaefer; 28 Dec 2022, 22:31.

          Comment


          • #6
            Originally posted by Daniel Schaefer View Post
            Again, I think the -insheetjson- command will work. Typically, I prefer to only post Stata native solutions here, but since you asked directly:

            Code:
            import requests
            import json
            import csv
            
            URL = "https://api.census.gov/data/timeseries/bds?get=NAME,YEAR,FIRM,ESTAB,EAGE,EMPSZES,EMPSZFI,ESTABS_ENTRY,ESTABS_EXIT,FAGE,FIRMDEATH_EMP,FIRMDEATH_ESTABS,FIRMDEATH_FIRMS,JOB_CREATION,JOB_CREATION_BIRTHS,JOB_CREATION_RATE,JOB_DESTRUCTION,JOB_DESTRUCTION_DEATHS,NET_JOB_CREATION,REALLOCATION_RATE,NET_JOB_CREATION_RATE,EMP&for=county:*&in=state:*&time=from+2000+to+2008&for=NAICS:*"
            
            response = requests.get(URL)
            json_string = response.content
            json_data = json.loads(json_string)
            with open("result.csv", "w") as f:
            writer = csv.writer(f)
            writer.writerows(json_data)
            Edit: managed to break the URL while editing this post. This should now have the correct URL.
            super helpful and now I can relate ! Thanks so much ! grateful to youfor patently guiding me through this process

            Comment

            Working...
            X