Tipping point analysis with a single binary panel variable

Tyler Wray

Join Date: Apr 2015

Posts: 19
#1

06 May 2024, 18:01

I have a dataset with a binary variable (healthscreenanyq) reflecting whether or not patients got a health screening at each of 5 timepoints over the course of 12 months. Participants in this dataset are randomized to three conditions (cond). About 27% of observations of healthscreenanyq are missing. Using this variable, I calculated another variable (healthscreenanyy) that summarizes healthscreenanyq over the course of the year, assuming conservatively that missing data in healthscreenanyq reflected not having gotten screened. So, healthscreenanyy reflects whether participants ever reported being screened during the study.

I fit a logistic regression model for healthscreenanyy, and it showed that the adjusted probabilities of getting screened at least once were 57% in group 0, 91% in group 1, and 89% in group 2. Since there is quite a bit of missing data in healthscreenanyq, though, I was hoping to see what impact different scenarios for the missing data might have on these results. Specifically, I'm wondering if there's a way to determine what the screening rates would need to be in the missing data in order to close the gap between group 0 and groups 1 and 2. That is, what would the rate of screening need to be in the 27% of missing data (overall) in order for group 0 to increase by 20%? Or is there maybe a way to create new variables with different rates of screening in the missing data, so I can see manually what impact that would have? Basically, I'm looking for a version of tipping point analysis, and it seems like it should be a lot simpler than many of the tutorials I've seen, given that this is a single binary variable. Maybe there's an even simpler solution I'm missing?

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input long id byte(qmonth cond) float(healthscreenanyq healthscreenanyy)
100005  1 1 1 1
100005  4 1 1 1
100005  7 1 1 1
100005 10 1 1 1
100005 12 1 0 1
100006  1 2 1 1
100006  4 2 1 1
100006  7 2 1 1
100006 10 2 1 1
100006 12 2 0 1
100007  1 0 . 1
100007  4 0 1 1
100007  7 0 0 1
100007 10 0 0 1
100007 12 0 0 1
100008  1 1 1 1
100008  4 1 1 1
100008  7 1 . 1
100008 10 1 1 1
100008 12 1 . 1
end
label values cond cond
George Ford

Join Date: Aug 2014

Posts: 3118
#2

07 May 2024, 12:19

At the extremes, you can just assign 0 to every missing value or 1 to every missing value. That will give you the full range of possible results.

You could also loop through various levels (though I'd probably do it repeatedly, since the assignment is random):

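* Extremes: set every missing value to 0, then to 1
forv i = 0/1 {
    capture drop alt
    g alt = healthscreenanyq
    replace alt = `i' if mi(alt)
    * --do stuff--
}

* Intermediate levels: set each missing value to 1 with probability i/100, else 0
forv i = 0(5)100 {
    capture drop alt
    g alt = healthscreenanyq
    replace alt = (runiform() < `i'/100) if mi(alt)
    * --do stuff--
}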
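
To make the "do stuff" step concrete, here is a minimal sketch of one way the loop could feed into the participant-level summary described in #1. It assumes the long-format data from the -dataex- example are in memory; the names alt and alt_any, the seed, and the 10-point grid are placeholders I made up for illustration, and the unadjusted tabulation stands in for whatever adjusted model you actually fit.

```stata
* Sketch only: sweep the assumed screening rate among missing observations.
* Assumes id, cond, and healthscreenanyq exist in long format.
set seed 12345                  // arbitrary seed, only for reproducibility
forvalues p = 0(10)100 {
    preserve
    * Fill each missing observation with 1 with probability p/100, else 0
    generate byte alt = healthscreenanyq
    replace alt = (runiform() < `p'/100) if missing(alt)
    * Participant-level outcome: ever screened under this scenario
    collapse (max) alt_any = alt (first) cond, by(id)
    display as text _newline "Assumed screening rate among missing: `p'%"
    * Unadjusted proportions by condition (hypothetical stand-in for the model)
    tabulate cond alt_any, row
    restore
}
```

Each level here uses a single random draw, so the proportions will wobble from run to run; repeating each level several times and averaging would likely give a steadier picture. If you replace the tabulation with a logit, be aware that a group in which everyone ends up screened can make the model fail or drop observations.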