  • Correct for multiple testing

    Dear statalisters,


    I learned about multiple testing only recently, and it seems like a serious problem.

    I am studying the effect of exposure to an event on different health outcomes.
    I have many dependent variables (different health outcomes), each regressed on one key variable plus other covariates, using different specifications (different fixed effects).


    I have a model such as:
    Code:
    Y1 = betaX + fixed_effects_11
    Y1 = betaX + fixed_effects_22
    Y1 = betaX + fixed_effects_33

    Y2 = betaX + fixed_effects_11
    Y2 = betaX + fixed_effects_22
    Y2 = betaX + fixed_effects_33

    If statistical significance is the criterion for deciding on conclusions:
    1. Can we ignore that we have many covariates in the same regression, if we only focus on one key independent variable, thus within each model we are testing only one hypothesis? Is this logic right?
    2. If we assume that each health outcome is of interest on its own, can we ignore correction for multiple testing?
    3. Are there any applied economics papers that corrected for having several different models?
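    As background to question 2, a quick back-of-the-envelope calculation shows why the number of tests matters. This is only a sketch under a strong assumption that is not in the original post: the six tests (2 outcomes x 3 specifications) are treated as independent and every null hypothesis as true. Under that assumption the chance of at least one false rejection at the 5% level is roughly 26%, not 5%. The values of `alpha` and `m` are illustrative.

    ```stata
    * Assumption: m independent tests, all null hypotheses true
    local alpha = 0.05
    local m = 6

    * Probability of at least one false rejection across the m tests
    local fwer = 1 - (1 - `alpha')^`m'
    display "P(at least one false rejection) = " %5.3f `fwer'

    * Bonferroni keeps that probability at or below alpha by testing each at alpha/m
    local bonf = `alpha' / `m'
    display "Bonferroni per-test threshold   = " %6.4f `bonf'
    ```

    Correlated outcomes, as health outcomes usually are, make the true inflation smaller than this figure, which is why resampling-based corrections tend to be less conservative than Bonferroni.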


    Other than the Stata command
    Code:
     
     wyoung
    Is there a Stata command that corrects for having different models with different dependent variables?

    Thank you

    References:
    Jones, D., D. Molitor, and J. Reif. "What Do Workplace Wellness Programs Do? Evidence from the Illinois Workplace Wellness Study." Quarterly Journal of Economics, November 2019, 134(4): 1747-1791.

  • #2
    Marry:
    welcome to this forum.
    Have you taken a look at -mvreg- and -mvreg postestimation- entries in Stata .pdf manual?
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Dear Marry, in addition to Carlo's helpful response, a multiple hypothesis testing adjustment might be of interest. I suggest taking a look at rwolf2, available through ssc install.



      • #4
        Carlo Lazzaro
        Thank you so much for your answer.
        You are saying that since dependent variables are health outcomes that may be correlated, they may be modeled in a multivariate model.
        If I understand correctly, since the health outcomes are correlated, I am almost testing the same hypothesis: that exposure to the event is associated with better or worse health.
        Multivariate analysis will maximize power while holding the type I error rate at alpha level.
        Am I getting this right?



        • #5
          Thank you Maxence Morlet for this very interesting command.

          So rwolf2 lets me correct for having different dependent variables in my analysis.
          I just need to include all regressions from the main analysis, the heterogeneity analysis, and probably the mechanisms analysis at the same time to obtain the adjusted p-values. Did I understand this right?

          Can you please explain why this command may be better than another one for the correction? I want to justify my choice of one command over the other.



          • #6
            The GitHub repository for `wyoung` provides several examples of how to use that command in a regression setting. David McKenzie provides a nice overview of different multiple hypothesis testing adjustments in his blog post. I have not used `rwolf2`, but am not aware of a reason to prefer `wyoung` or `rwolf2` over the other.
            Associate Professor of Finance and Economics
            University of Illinois
            www.julianreif.com
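            For readers who have not seen the command, a call along the following lines is roughly what the `wyoung` examples look like. This is a hedged sketch, not taken from this thread: it assumes `wyoung` is already installed, and the option names (`cmd()`, `familyp()`, `bootstraps()`, `seed()`) and the `OUTCOMEVAR` placeholder are reproduced from memory of the package's documentation, so check `help wyoung` before relying on them. The outcomes and regressors come from the auto dataset purely for illustration.

            ```stata
            * Assumption: wyoung is installed and uses the syntax below
            sysuse auto, clear

            * One model per outcome; OUTCOMEVAR is replaced by each outcome in turn.
            * familyp() names the regressor whose p-values are adjusted.
            wyoung mpg headroom turn, cmd(regress OUTCOMEVAR displacement length) ///
                familyp(displacement) bootstraps(100) seed(20)
            ```

            In real work the number of bootstrap replications would normally be much larger than 100.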



            • #7
              Marry:
              I would go -mvreg- as it allows you to estimate the between-equation covariances.
              Unfortunately, I'm not familiar with the community-contributed modules suggested by Maxence and Julian.
              Kind regards,
              Carlo
              (StataNow 18.5)



              • #8
                Julian Reif Thank you so much for your confirmation and for the link to the blog. It was really helpful.
                Best



                • #9
                  I'd read up on it thoroughly to make sure you need the correction in your specific case.

                  I believe that when you estimate the coefficients simultaneously, no adjustment is required. If you estimate the models separately, then you would apply it.

                  mvreg is a simultaneous estimation approach. But, all the X's (including FE) must be the same in mvreg. sureg allows different Xs and could estimate 6 models.
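                  As an illustration of the difference in syntax, here is a minimal sketch using the auto dataset. The variables are stand-ins chosen for the example, not the poster's model. `sureg` takes one parenthesised equation per outcome, so the regressor lists can differ, and `test` after it gives a joint Wald test across equations.

                  ```stata
                  sysuse auto, clear

                  * Two equations with different right-hand-side variables, estimated jointly
                  sureg (headroom price mpg weight) (trunk price length weight)

                  * Joint test that the coefficient on weight is zero in every equation
                  test weight
                  display "joint p-value = " %6.4f r(p)
                  ```

                  Note that the joint test answers a different question (is the effect zero in all equations at once?) from the one an adjusted per-outcome p-value answers.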

                  It is not clear why you have different fixed effects, unless you are looking for differences in the X coefficient across different levels of aggregation of the FE. If the FE are at different levels of aggregation, there are other tests to determine whether higher aggregation is legitimate (Wooldridge/Papke). I have written some code to run that test and posted it on Statalist before.



                  • #10
                    Hi George Ford
                    Thank you for your answer
                    So you are saying that if I use mvreg (with the same Xs for all models), there is no need for correction, right?
                    Last edited by Marry Lee; 08 Aug 2024, 04:17.



                    • #11
                      I believe that is correct.



                      • #12
                        I don't think this is correct. -mvreg- does not correct for multiple tests. For example the commands
                        Code:
                        sysuse auto, clear
                        mvreg headroom trunk turn = price mpg displ gear_ratio length weight
                        return exactly the same p values as if you ran three separate regressions:
                        Code:
                        reg headroom price mpg displ gear_ratio length weight
                        reg trunk price mpg displ gear_ratio length weight
                        reg turn price mpg displ gear_ratio length weight
                        If you would correct the p values of the 3 separate regressions, then you should also correct the p values returned by mvreg....
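                        The point can be checked directly, and the simplest correction applied by hand. The sketch below is illustrative only: it treats `weight` as the single key regressor (an assumption made for the example) and uses a plain Bonferroni adjustment rather than the resampling-based methods in `wyoung` or `rwolf2`.

                        ```stata
                        sysuse auto, clear
                        local outcomes headroom trunk turn
                        local rhs price mpg displacement gear_ratio length weight
                        local m : word count `outcomes'

                        * mvreg reproduces the equation-by-equation OLS coefficient and standard error
                        quietly regress headroom `rhs'
                        scalar b_ols  = _b[weight]
                        scalar se_ols = _se[weight]
                        quietly mvreg `outcomes' = `rhs'
                        assert reldif(_b[headroom:weight], b_ols) < 1e-6
                        assert reldif(_se[headroom:weight], se_ols) < 1e-6

                        * Unadjusted and Bonferroni-adjusted p-values for the key regressor
                        display "outcome" _col(12) "p" _col(22) "p (Bonferroni)"
                        foreach y of local outcomes {
                            quietly regress `y' `rhs'
                            * two-sided p-value from the t statistic on weight
                            local p = 2 * ttail(e(df_r), abs(_b[weight] / _se[weight]))
                            * multiply by the number of tests and cap at 1
                            local pbonf = min(1, `m' * `p')
                            display "`y'" _col(12) %6.4f `p' _col(22) %6.4f `pbonf'
                        }
                        ```

                        The two `assert` lines pass silently if mvreg and regress agree, which is the claim made above. Bonferroni ignores the correlation between outcomes, so it is the most conservative choice.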
