  • Power analysis for zero-inflated ordered probit (ZIOP) help

    Hi everyone,

    I am new to Stata, which I am using for my thesis, as I need to run a ZIOP model for a zero-inflated ordinal substance use variable in youth (sample size is about 250, and about 180 of the values are zeros). I am trying to figure out how to estimate power for one continuous predictor that will be used in both the binary probit and ordered probit components of the model (with age added as a covariate; unsure if this matters). The colleagues I have as resources for stats are fairly unfamiliar with this model itself, and especially with estimating power for it. I would thoroughly appreciate any help you all can provide, as I'm fairly lost with the resources I've found online (which only relate to ZINB or ZIP models, so that makes it even more challenging). Thanks a ton for any help!!

  • #2
    https://jds-online.org/journal/JDS/article/1044/file/pdf
    https://www.stata.com/statalist/archive/2013-05/msg00102.html
    https://statisticalhorizons.com/zero-inflated-models/



    • #3
      So, I think you may have asked this question on Reddit? We do ask here that you state if you've cross-posted to avoid duplication.

      There, I (bill-smith is me, it's a long story) did say that you should consider a standard ordered probit or logit. I meant to consider it as the analytic model, rather than just for power analysis.

      Continuing what I said there, I'll assume that your dependent variable is something like a rating of substance use intensity, e.g. 0 = I don't drink alcohol at all, 4 = I drink very frequently. If you use a zero-inflated model, you're assuming that one group of respondents is always going to respond 0, and the rest of them are going to follow an ordered probit model, i.e. people in that second group can still respond 0. This setup essentially means that one subgroup of your respondents is not vulnerable to alcohol use, or is never going to say that they use alcohol. You can decide for yourself if that's what you mean. I would have some trouble buying that this setup is applicable, and also that adding zero-inflation adds anything in technical terms to a regular ordered probit model (remember, in these, you are still assuming that 0s are possible).
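
      As a rough illustration of the mixture described above (a sketch under made-up assumptions, not Stata's implementation): the probability of a 0 is the always-zero share plus the share of everyone else who lands in the lowest ordered probit category. The function name `ziop_probs`, the cutpoints, and the 20% always-zero share are all hypothetical.

```python
from statistics import NormalDist


def ziop_probs(p_always_zero, xb, cutpoints):
    """Category probabilities for a zero-inflated ordered probit (illustrative).

    p_always_zero: assumed probability of belonging to the always-zero group
    xb:            linear predictor of the ordered probit part
    cutpoints:     increasing thresholds of the ordered probit part
    """
    phi = NormalDist().cdf
    # Ordered probit cumulative probabilities: P(y* <= cut_k) = Phi(cut_k - xb)
    cum = [phi(c - xb) for c in cutpoints] + [1.0]
    ordered = [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]
    # Everyone outside the always-zero group follows the ordered probit
    probs = [(1.0 - p_always_zero) * p for p in ordered]
    # Zeros come from both groups
    probs[0] += p_always_zero
    return probs


# Hypothetical values: 20% always-zero, three outcome categories (0, 1, 2)
example = ziop_probs(p_always_zero=0.2, xb=0.0, cutpoints=[0.0, 1.0])
```

      Setting `p_always_zero` to 0 gives back the plain ordered probit probabilities, which is one way to see how much the inflation part changes the expected share of zeros under a given set of assumptions.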

      If you are determined to fit a zero-inflated model or to (metaphorically!!) die trying, then simulating that adds some complications. You'd just simulate a proportion of respondents whose answer is 0, e.g. a fixed probability of 20%. Ideally, you'd think about how that probability varies with observed characteristics, and what those are, e.g. the probability of being a zero follows a logit model with some base odds plus some assumed slope for various characteristics. What are your assumptions here, and how do you justify those base assumptions? Again, if zero inflation is the hill you are determined to die on (again, I do not advise this), then I think you should consider this in your power calculation.
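
      The data-generating step described above could be sketched roughly as follows. This is only an illustration: the function name `simulate_ziop` and every parameter value (base log-odds, slopes, cutpoints) are hypothetical placeholders that would need to be justified from prior literature, and the always-zero group follows a logit model as in the description above. A full power simulation would repeat this many times, fit the zero-inflated model to each simulated dataset, and count how often the coefficient of interest is significant; that fitting step is not shown here.

```python
import math
import random


def simulate_ziop(n, base_logit, slope_zero, beta, cutpoints, seed=0):
    """Simulate (x, y) pairs from an assumed zero-inflated ordered probit setup."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # assumed standardized continuous predictor
        # Probability of being in the always-zero group (logit model)
        p_zero = 1.0 / (1.0 + math.exp(-(base_logit + slope_zero * x)))
        if rng.random() < p_zero:
            y = 0  # always-zero respondent
        else:
            # Ordered probit: latent score with standard normal error
            ystar = beta * x + rng.gauss(0.0, 1.0)
            # Outcome category = number of cutpoints the latent score exceeds
            y = sum(ystar > c for c in cutpoints)
        rows.append((x, y))
    return rows


# Hypothetical assumptions: about 20% always-zero at x = 0, n = 250
data = simulate_ziop(n=250, base_logit=-1.4, slope_zero=-0.5,
                     beta=0.4, cutpoints=[0.5, 1.2, 2.0], seed=1)
zero_share = sum(1 for _, y in data if y == 0) / len(data)
```

      Comparing `zero_share` with the proportion of zeros expected in the real sample is a quick sanity check on whether the assumed parameters are plausible.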

      Also, do you have to have a power analysis pre-specified? It sounds like you already have your data. If you do, it's too late to calculate your power a priori.

      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



      • #4
        Hi! Sorry for the delay in responding. I thought I had hit Post Reply, but something must have happened.

        George Ford, thank you for providing those resources to read, super helpful for me to get some perspective and ideas on what this looks like and why these models may or may not be necessary!

        Weiwen Ng/bill-smith, I appreciate the thoughtful response and description of what the model is actually doing. I did not realize that specific rule about cross-posting and will make sure to add that information in future posts!

        I see what you're saying about the assumption made for what these latent classes of zeroes represent, and the point is well-taken. I reached out to those I am working under to see what their thoughts are on it being used in this context. Your description of how I would go about simulation is very helpful, as I had struggled to understand exactly how you would put together fake data like this and then fit a model to it. Given the very low number of studies in this area that use a similar modeling framework, I realize these assumptions are difficult to find and justify. Since I already have my data, I was advised shortly after this post not to conduct the a priori power analysis, as it's not very meaningful with existing data (as you had also stated). In the future though, this information will be very helpful for me when proposing a data collection that may use zero-inflated modeling.

        Again, I really appreciate all the thoughts and assistance, and am happy to know there is a community here that I can turn to!

