I am running a data collection expected to last around 50 days and cover 4,800 respondents. My team of 60 enumerators collects data using SurveyCTO and Psych App.
While SurveyCTO records are accessed as one data file (.xlsx, .csv), the Psych App submits 3 separate .txt files.
In addition, to measure self-control accurately through this app, I need each respondent to play the Psych App three times.
In the end I need to combine 4800*3*3 = 43,200 individual .txt files to form one dataset.
As a further complication, data is submitted daily, so I create a folder for each day and, within it, a subfolder for each enumerator.
[Image: Daily Folders.JPG]
Within each daily folder, I have one subfolder per enumerator, numbered 1 to 60.
[Image: Enumerator Subfolders.JPG]
I store the .txt files for the day in these enumerator subfolders.
[Image: TXT files.JPG]
*Task: Import and combine files that end in "summary" to form one dataset
global DIREC_MAIN "C:\Users\Dropbox\SEL\Endline Survey\Data" // Enter file path to project folder for next user here
* Day 1
local day_number 1 // Change this number to reflect daily folder
clear // Start from empty memory so the daily file really is blank
save "${DIREC_MAIN}\CPX\CPX`day_number'.dta", replace emptyok // Create a blank data file for the day
forvalues i=1/60 { // Enumerator subfolders are numbered 1 to 60. Adjust to the number of subfolders you have
    cd "${DIREC_MAIN}\CPX\Day`day_number'\Unzipped\CP T-X (`i')" // Enumerator subfolder for the day
    fs *-summary.txt // List the .txt files in this directory with the -summary suffix (fs is a user-written command, available from SSC)
    foreach f in `r(files)' {
        import delimited "`f'", delimiters(tab) clear // Import the -summary .txt files one at a time
        gen s1_1="`f'" // Store the name of the .txt file so each record can be traced back. Each file name is unique
        tostring subjectid, replace // Unique identifier for each respondent. Kept as a string so as not to lose information
        replace subjectid=lower(subjectid)
        append using "${DIREC_MAIN}\CPX\CPX`day_number'.dta" // Combine with the data file for the day
        save "${DIREC_MAIN}\CPX\CPX`day_number'.dta", replace
    }
}
**************************************************************************************************************
* Append all days together
cd "${DIREC_MAIN}"
clear // Empty memory first so the last daily file processed is not included twice
append using "CPX\CPX1" "CPX\CPX2" "CPX\CPX3" "CPX\CPX4" "CPX\CPX5" "CPX\CPX6" "CPX\CPX7" "CPX\CPX8" "CPX\CPX9" // The combined dataset
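
The code above handles one day at a time and needs day_number edited by hand. A possible way to loop over all days is sketched below. It is untested on my actual folders and rests on some assumptions: days 1 to 9 exist with the same folder layout, every enumerator subfolder exists (it may be empty), and the local last_day is a name I made up for illustration. It swaps fs for Stata's built-in dir macro function, so it needs no cd. It also uses forward slashes when joining paths, because a backslash directly before a left quote stops the macro from expanding.

```stata
* Sketch only: loop over daily folders instead of editing day_number by hand
* Assumes DIREC_MAIN is defined as above and days 1 to `last_day' exist
local last_day 9                          // Hypothetical name; set to the latest day collected

forvalues d = 1/`last_day' {
    clear
    tempfile day_data                     // Temporary accumulator for this day
    save "`day_data'", emptyok

    forvalues i = 1/60 {
        local folder "${DIREC_MAIN}/CPX/Day`d'/Unzipped/CP T-X (`i')"
        * Built-in alternative to fs; respectcase keeps the original file-name case
        local files : dir "`folder'" files "*-summary.txt", respectcase
        foreach f of local files {
            import delimited "`folder'/`f'", delimiters(tab) clear
            gen s1_1 = "`f'"              // Source file name, for tracing back
            tostring subjectid, replace
            replace subjectid = lower(subjectid)
            append using "`day_data'"
            save "`day_data'", replace
        }
    }

    use "`day_data'", clear
    save "${DIREC_MAIN}/CPX/CPX`d'.dta", replace emptyok
}

* Append all daily files without typing each file name
clear
forvalues d = 1/`last_day' {
    append using "${DIREC_MAIN}/CPX/CPX`d'.dta"
}
```

With this layout, a new day only requires changing last_day. Re-running rebuilds every daily file from scratch, so files are not appended twice.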