Estimating prevalence from a multiple response variable

Kehinde Atoloye

Join Date: Nov 2021
Posts: 104

Estimating prevalence from a multiple response variable

11 Dec 2024, 21:35

Hello all,

I have a multiple response variable of different methods. I intend to group the responses into two groups of method 1 and method 2 and then estimate the prevalence of method two. Below is the data and code that summarizes my attempt. I intend to do these for overall and for each state. The code is not working satisfactorily at the moment. I'm having no observation error. I guess the Step 2 part is the problem, but I have been trying to maneuver but to no avail. Your assistance is welcome. Thanks.

Code:

clear 
input str10 instanceid str20 adoptedfpmeth str20 state 
"1" "1 10 11" "State1" 
"2" "2 3 4" "State1" 
"3" "10 11" "State2" 
"4" "5 6" "State2" 
"5" "11" "State3" 
"6" "1 2 10" "State3" 
"7" "4 6 9" "State4" 
"8" "2 7 8" "State4" 
"9" "9 10" "State5" 
"10" "1 2 11" "State5" 
end

set trace on
* Step 1: Prepare and Reshape Data
split adoptedfpmeth, parse(" ") gen(response)
reshape long response, i(instanceid) j(_j)
drop _j

* Create binary variables for group 1 (1 to 9) and group 2 (10 to 11)
gen method1 = 0
gen method2 = 0
bysort instanceid: replace method1 = 1 if inlist(response, "1", "2", "3", "4", "5", "6", "7", "8", "9")
bysort instanceid: replace method2 = 1 if inlist(response, "10", "11")

* Create final binary variables indicating whether any method in each group was selected
bysort instanceid: egen method1_flag = max(method1)
bysort instanceid: egen method2_flag = max(method2)
drop method1 method2 response

* Calculate the number of unique respondents in method1_flag by state
bysort state instanceid: keep if _n == 1  // Keep only the first occurrence of each instanceid
bysort state: egen method1_count = total(method1_flag)
bysort state: egen total_respondents = count(instanceid)
gen proportion = round((method1_count / total_respondents) * 100, 0.1)

// decode state, gen(state1)

* Step 2: Initialize a new dataset for the results outside the loop
tempfile results
clear
set obs 0
gen state_name = ""
gen Statistic = ""
gen Value = ""
save `results', replace

* Step 3: Loop through each state to save the results
levelsof state, local(state_levels)
foreach state of local state_levels {
    preserve
    * Filter data for the current state
    keep if state == "`state'"
    
    * Calculate the necessary values
    local method1_count_str = string(method1_count[1])
    local total_respondents_str = string(total_respondents[1])
    local proportion_str = string(proportion[1], "%9.1f")  // Ensures proportion is rounded and formatted to 1 decimal place

    * Prepare the results for each state
    set obs 3
    gen state_name = "`state'"
    gen Statistic = ""
    gen Value = ""
    
    replace Statistic = "Method 1 Count" in 1
    replace Value = "`method1_count_str'" in 1
    replace Statistic = "Total Respondents" in 2
    replace Value = "`total_respondents_str'" in 2
    replace Statistic = "Proportion (%)" in 3
    replace Value = "`proportion_str'" in 3
    
    * Append results to the main dataset
    append using `results'
    save `results', replace
    
    restore
}

* Step 4: Load the final results
use `results'

* Export only the necessary variables to an Excel file
export excel state_name Statistic Value using "mcprresults.xlsx", sheet("State Proportions") firstrow(variables)

Tags: None

Announcement

Estimating prevalence from a multiple response variable