
  • Append multiple txt files

    I have an active data collection expected to last around 50 days and cover 4,800 respondents. My team of 60 enumerators collects data using SurveyCTO and Psych App.
    While SurveyCTO records are accessed as one data file (.xlsx, .csv), the Psych App submits 3 separate .txt files for each play.
    In addition, to measure self-control accurately through this app, I need each respondent to play the Psych App three times.
    In the end, I need to combine 4800*3*3 = 43,200 individual .txt files to form one dataset.
    As a further complication, data are submitted daily, so I create a folder for each day and, within it, a subfolder for each enumerator.
    [Image: Daily Folders.JPG]
    Within each daily folder, I have these enumerator subfolders (1 to 60):
    [Image: Enumerator Subfolders.JPG]
    I store the .txt files for the day in these enumerator subfolders.
    [Image: TXT files.JPG]
    * Task: Import and combine the files whose names end in "summary" to form one dataset

    global DIREC_MAIN "C:\Users\Dropbox\SEL\Endline Survey\Data" // Enter file path to project folder for next user here

    * Day 1
    local day_number 1 // Change this number to reflect daily folder
    clear // Empty memory first; otherwise the "blank" file would contain whatever data are currently loaded
    save "${DIREC_MAIN}\CPX\CPX`day_number'.dta", replace emptyok // Create a blank data file for the day

    forvalues i=1/60 { // My subfolders are numbered 1 to 60. Adjust to suit the number of subfolders you have
        cd "${DIREC_MAIN}\CPX\Day`day_number'\Unzipped\CP T-X (`i')" // Enumerator subfolder for the day
        fs *-summary.txt // List names of .txt files in the directory with the -summary suffix (fs is a user-written command)
        foreach f in `r(files)' {
            import delimited "`f'", delimiters(tab) clear // Import the -summary .txt files one at a time
            gen s1_1="`f'" // Store the name of the .txt file so each record can be traced back. Each file name is unique
            tostring subjectid, replace // Unique identifier for each respondent. A string avoids losing information
            replace subjectid=lower(subjectid)
            append using "${DIREC_MAIN}\CPX\CPX`day_number'.dta" // Combine with the data file for the day
            save "${DIREC_MAIN}\CPX\CPX`day_number'.dta", replace
        }
    }

    ****************************************************************************************************
    * Append all days together
    cd "${DIREC_MAIN}"
    clear // Start from empty memory so the last day processed is not appended twice
    append using "CPX\CPX1" "CPX\CPX2" "CPX\CPX3" "CPX\CPX4" "CPX\CPX5" "CPX\CPX6" "CPX\CPX7" "CPX\CPX8" "CPX\CPX9" // The combined dataset
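
    One possible variation on the code above, sketched under assumptions rather than tested on the real folders: loop over the days as well as the enumerators, so that day_number does not have to be edited by hand and the final append does not have to list every daily file. It uses Stata's built-in "dir" extended macro function in place of fs. The local n_days, the variables day and enumerator, and the output name CPX_all.dta are hypothetical names introduced here for illustration, and the sketch assumes every DayN\Unzipped\CP T-X (i) folder exists.

```stata
* Sketch only: assumes DIREC_MAIN is defined as above and the folder layout is as described
local n_days 9                      // hypothetical: number of daily folders received so far

clear
tempfile combined
save "`combined'", emptyok          // empty dataset that accumulates all files

forvalues d = 1/`n_days' {
    forvalues i = 1/60 {
        local folder "${DIREC_MAIN}\CPX\Day`d'\Unzipped\CP T-X (`i')"
        * Built-in alternative to fs: list the -summary files in this subfolder
        local files : dir "`folder'" files "*-summary.txt", respectcase
        foreach f of local files {
            * Forward slash here: a backslash directly before `f' would block macro expansion
            import delimited "`folder'/`f'", delimiters(tab) clear
            gen s1_1 = "`f'"        // source file name, for tracing records back
            gen day = `d'           // hypothetical variable: daily folder number
            gen enumerator = `i'    // hypothetical variable: enumerator subfolder number
            tostring subjectid, replace
            replace subjectid = lower(subjectid)
            append using "`combined'"
            save "`combined'", replace
        }
    }
}

use "`combined'", clear
save "${DIREC_MAIN}\CPX\CPX_all.dta", replace   // hypothetical output file name
```

    Because the tempfile is rewritten after every import, this may become slow as the file count grows; saving one file per day, as in the code above, and appending the daily files at the end is a reasonable alternative.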
    Last edited by Elijah Kipchumba; 18 Feb 2022, 00:14.

  • #2
    Maybe this article helps: https://journals.sagepub.com/doi/pdf...867X0900900407
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
