Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Inserting results into SQL Server table

    I am very new to STATA and would appreciate some help in modifying a file to insert data into SQL Server. We currently have a DO file that processes population data to develop estimates for each year to 2020. The DO file processes 2000 and 2010 data then works with an annual estimate file. Towards the end of the DO file it creates a DTA file for each year of created data and then exports that to an Excel file. I am looking to have the data inserted into a table in SQL Server using the same final column names created in the DTA file and carried into the Excel file.

    I have created an ODBC connection to the database and have this, ( odbc insert LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER, as("LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER") dsn("POP_ESTIMATES") table("POP_COUNTS") block ), as the insert statement. First I would appreciate a confirmation that I have the insert statement written correctly. More importantly I need to know where I would insert this in the DO file. Below is the code, from the end of the DO file, that is used to create the final DTA files that are then exported in to Excel. Any help in how I should change it to use the ODBC insert, instead of the Excel export, or the DTA file creation, would be a great help.

    Thanks,
    Chris


    //Imports a text crosswalk of log record numbers with associated mcds
    import delimited logrecno_mcd.csv, clear
    //Format crosswalk to enable merge
    drop if logrecno == .
    rename census_name name
    drop if name == "County subdivisions not defined"
    replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " town"
    replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " city"
    replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " Town"
    replace name = "Manchester" if substr(name, 1, 10) == "Manchester"
    expand 1638
    sort logrecno
    egen id = seq(), f(1) t(1638)
    merge m:1 id using Demographic_Codes_mcd
    keep logrecno name COLUMN_ID
    save logrecno_mcd, replace

    //Merge mcd data with mcd to logrecno crosswalk
    merge m:m name COLUMN_ID using mcd_`c'_`y'
    keep if _merge == 3

    //Reformat data and variable names to OUR specifications
    gen LOGRECNO = string(logrecno, "%07.0f")
    drop logrecno
    rename count_`y' CCR_ESTIMATE
    gen ADJ_ESTIMATE = CCR_ESTIMATE + (CCR_ESTIMATE * mpe)
    rename mpe MPE
    rename stdev STD_DEV
    rename cilower CI_LOWER
    rename ciupper CI_UPPER
    gen ACS_CENSUS_DATE = string(year) + "0401"

    //Keep only the data requested by OUR
    keep LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER count_2010
    order LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER
    sort ACS_CENSUS_DATE LOGRECNO COLUMN_ID

    save mcd_`c'_`y', replace
    clear
    }
    }

    //Create 2010 data in the same format as the 2011-2020 data
    foreach c in "Barnstable" "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
    "Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
    use mcd_`c'_2011, clear
    replace CCR_ESTIMATE = count_2010
    replace CI_LOWER = CCR_ESTIMATE
    replace CI_UPPER = CCR_ESTIMATE
    replace ADJ_ESTIMATE = CCR_ESTIMATE
    drop count_2010
    replace ACS_CENSUS_DATE = "20100401"
    foreach e in MPE STD_DEV {
    replace `e' = 0
    }
    save mcd_`c'_2010, replace
    }

    //Drop 2010 data in 2011-2020 datasets
    foreach c in "Barnstable" "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
    "Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
    foreach y in "2011" "2012" "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" {
    use mcd_`c'_`y', clear
    drop count_2010
    save mcd_`c'_`y', replace
    }
    }

    foreach y in "2010" "2011" "2012" "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" {
    use mcd_Barnstable_`y'
    save 00_10_20_mcd_`y', replace
    foreach c in "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
    "Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
    append using mcd_`c'_`y'
    save 00_10_20_mcd_`y', replace
    }
    export excel using "D:\PopEstimates\2016_pop_est_files_4_25_run\2016_ Output_Files\Communities\00_10_20_mcd_`y'.xlsx", firstrow(variables) replace
    }

  • #2
    Originally posted by Chris Willard View Post
    First I would appreciate a confirmation that I have the insert statement written correctly.
    I doubt that it does any harm absent a typographical error, but if you're not changing the column (variable) names on-the-fly, then you don't need the as() option in your odbc insert command. Otherwise it looks kosher.

    More importantly I need to know where I would insert this in the DO file.
    It seems to fit right where you have the export excel command. If there is some logical error, then the first clue is usually a constraint-violation complaint from the RDBMS. You can always try it and read back (odbc load) the inserted data and compare them to what you expect in order to verify correctness.

    Comment

    Working...
    X