I am very new to STATA and would appreciate some help in modifying a file to insert data into SQL Server. We currently have a DO file that processes population data to develop estimates for each year to 2020. The DO file processes 2000 and 2010 data then works with an annual estimate file. Towards the end of the DO file it creates a DTA file for each year of created data and then exports that to an Excel file. I am looking to have the data inserted into a table in SQL Server using the same final column names created in the DTA file and carried into the Excel file.
I have created an ODBC connection to the database and have this, ( odbc insert LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER, as("LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER") dsn("POP_ESTIMATES") table("POP_COUNTS") block ), as the insert statement. First I would appreciate a confirmation that I have the insert statement written correctly. More importantly I need to know where I would insert this in the DO file. Below is the code, from the end of the DO file, that is used to create the final DTA files that are then exported in to Excel. Any help in how I should change it to use the ODBC insert, instead of the Excel export, or the DTA file creation, would be a great help.
Thanks,
Chris
//Imports a text crosswalk of log record numbers with associated mcds
import delimited logrecno_mcd.csv, clear
//Format crosswalk to enable merge
drop if logrecno == .
rename census_name name
drop if name == "County subdivisions not defined"
replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " town"
replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " city"
replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " Town"
replace name = "Manchester" if substr(name, 1, 10) == "Manchester"
expand 1638
sort logrecno
egen id = seq(), f(1) t(1638)
merge m:1 id using Demographic_Codes_mcd
keep logrecno name COLUMN_ID
save logrecno_mcd, replace
//Merge mcd data with mcd to logrecno crosswalk
merge m:m name COLUMN_ID using mcd_`c'_`y'
keep if _merge == 3
//Reformat data and variable names to OUR specifications
gen LOGRECNO = string(logrecno, "%07.0f")
drop logrecno
rename count_`y' CCR_ESTIMATE
gen ADJ_ESTIMATE = CCR_ESTIMATE + (CCR_ESTIMATE * mpe)
rename mpe MPE
rename stdev STD_DEV
rename cilower CI_LOWER
rename ciupper CI_UPPER
gen ACS_CENSUS_DATE = string(year) + "0401"
//Keep only the data requested by OUR
keep LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER count_2010
order LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER
sort ACS_CENSUS_DATE LOGRECNO COLUMN_ID
save mcd_`c'_`y', replace
clear
}
}
//Create 2010 data in the same format as the 2011-2020 data
foreach c in "Barnstable" "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
"Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
use mcd_`c'_2011, clear
replace CCR_ESTIMATE = count_2010
replace CI_LOWER = CCR_ESTIMATE
replace CI_UPPER = CCR_ESTIMATE
replace ADJ_ESTIMATE = CCR_ESTIMATE
drop count_2010
replace ACS_CENSUS_DATE = "20100401"
foreach e in MPE STD_DEV {
replace `e' = 0
}
save mcd_`c'_2010, replace
}
//Drop 2010 data in 2011-2020 datasets
foreach c in "Barnstable" "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
"Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
foreach y in "2011" "2012" "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" {
use mcd_`c'_`y', clear
drop count_2010
save mcd_`c'_`y', replace
}
}
foreach y in "2010" "2011" "2012" "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" {
use mcd_Barnstable_`y'
save 00_10_20_mcd_`y', replace
foreach c in "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
"Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
append using mcd_`c'_`y'
save 00_10_20_mcd_`y', replace
}
export excel using "D:\PopEstimates\2016_pop_est_files_4_25_run\2016_ Output_Files\Communities\00_10_20_mcd_`y'.xlsx", firstrow(variables) replace
}
I have created an ODBC connection to the database and have this, ( odbc insert LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER, as("LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER") dsn("POP_ESTIMATES") table("POP_COUNTS") block ), as the insert statement. First I would appreciate a confirmation that I have the insert statement written correctly. More importantly I need to know where I would insert this in the DO file. Below is the code, from the end of the DO file, that is used to create the final DTA files that are then exported in to Excel. Any help in how I should change it to use the ODBC insert, instead of the Excel export, or the DTA file creation, would be a great help.
Thanks,
Chris
//Imports a text crosswalk of log record numbers with associated mcds
import delimited logrecno_mcd.csv, clear
//Format crosswalk to enable merge
drop if logrecno == .
rename census_name name
drop if name == "County subdivisions not defined"
replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " town"
replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " city"
replace name = substr(name, 1, length(name)-5) if substr(name, -5, 5) == " Town"
replace name = "Manchester" if substr(name, 1, 10) == "Manchester"
expand 1638
sort logrecno
egen id = seq(), f(1) t(1638)
merge m:1 id using Demographic_Codes_mcd
keep logrecno name COLUMN_ID
save logrecno_mcd, replace
//Merge mcd data with mcd to logrecno crosswalk
merge m:m name COLUMN_ID using mcd_`c'_`y'
keep if _merge == 3
//Reformat data and variable names to OUR specifications
gen LOGRECNO = string(logrecno, "%07.0f")
drop logrecno
rename count_`y' CCR_ESTIMATE
gen ADJ_ESTIMATE = CCR_ESTIMATE + (CCR_ESTIMATE * mpe)
rename mpe MPE
rename stdev STD_DEV
rename cilower CI_LOWER
rename ciupper CI_UPPER
gen ACS_CENSUS_DATE = string(year) + "0401"
//Keep only the data requested by OUR
keep LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER count_2010
order LOGRECNO COLUMN_ID MPE ACS_CENSUS_DATE CCR_ESTIMATE ADJ_ESTIMATE STD_DEV CI_LOWER CI_UPPER
sort ACS_CENSUS_DATE LOGRECNO COLUMN_ID
save mcd_`c'_`y', replace
clear
}
}
//Create 2010 data in the same format as the 2011-2020 data
foreach c in "Barnstable" "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
"Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
use mcd_`c'_2011, clear
replace CCR_ESTIMATE = count_2010
replace CI_LOWER = CCR_ESTIMATE
replace CI_UPPER = CCR_ESTIMATE
replace ADJ_ESTIMATE = CCR_ESTIMATE
drop count_2010
replace ACS_CENSUS_DATE = "20100401"
foreach e in MPE STD_DEV {
replace `e' = 0
}
save mcd_`c'_2010, replace
}
//Drop 2010 data in 2011-2020 datasets
foreach c in "Barnstable" "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
"Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
foreach y in "2011" "2012" "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" {
use mcd_`c'_`y', clear
drop count_2010
save mcd_`c'_`y', replace
}
}
foreach y in "2010" "2011" "2012" "2013" "2014" "2015" "2016" "2017" "2018" "2019" "2020" {
use mcd_Barnstable_`y'
save 00_10_20_mcd_`y', replace
foreach c in "Berkshire" "Bristol" "Dukes" "Essex" "Franklin" "Hampden" ///
"Hampshire" "Middlesex" "Nantucket" "Norfolk" "Plymouth" "Suffolk" "Worcester" {
append using mcd_`c'_`y'
save 00_10_20_mcd_`y', replace
}
export excel using "D:\PopEstimates\2016_pop_est_files_4_25_run\2016_ Output_Files\Communities\00_10_20_mcd_`y'.xlsx", firstrow(variables) replace
}
Comment