  • Performing a Mediation Analysis for Fixed Effects Model using bootstrap

    Dear all,

    I'd like to perform a mediation analysis using a fixed effects regression model. My understanding is that I 1) estimate the base model without the mediator, 2) estimate an FE regression of the mediator on the IVs, and 3) estimate the full model of the DV on the IVs and the mediator, and then check whether the indirect effect is significant using bootstrapping. I have calculated the following so far:

    Code:
    egen companyid = group(Company), label lname(companyid)
    xtset companyid PeriodQ
    
    
    * Total effect (dependent variable regressed on independent variables)
    xtreg DV $IVs $controls i.year, fe vce(robust)
    
    * Mediator regressed on independent variables (path a)
    xtreg M $IVs $controls i.year, fe vce(robust)
    
    * Dependent variable regressed on mediator and independent variables
    xtreg DV $IVs $controls M i.year, fe vce(robust)
    
    * Check significance via Bootstrap 
    
    xtset, clear
    
    bootstrap ([M]_b[IV]*[DV]_b[M]), bca reps(1000) nodots cluster(companyid) : xtreg (M $IVs $controls)(DV M $IVs $controls) `if' `in'
    estat boot, percentile bc bca
    But the bootstrapping does not work and I think I haven't completely understood how the command works.
    So I have 2 questions:
    1) Is the overall approach correct?
    2) How do I have to adapt the bootstrapping command? As mentioned, I am trying to determine whether the indirect/direct effect is significant and what extent of the total effect is explained by the indirect/direct effect.

    Thanks a lot in advance!

    P.S. I'm using Stata 13 on Windows

  • #2
    FYI: I also posted the question here: https://stats.stackexchange.com/q/474041/277429



    • #3
      Hi Tim,
      xtreg does not operate that way; you need to collect all the coefficients yourself. Below is a simplified version that you could adapt for your case.
      This code estimates what could be considered simultaneous regressions using the bootstrap.
      Code:
      clear all
      sysuse auto
      program bs_sureg, eclass
          reg price headroom trunk weight
          matrix b1=e(b)
          reg mpg headroom trunk weight
          matrix b2=e(b)
          matrix coleq b1=price
          matrix coleq b2=mpg
          matrix b=b1,b2
          ereturn post b
      end
      
      bootstrap:bs_sureg



      • #4
        Thank you FernandoRios! I think I'm 90% there. What I didn't figure out is how to save the results of each bootstrap for specific variables. Currently I just have an output of the overall result including all variables, which is quite nasty as I had to include all companies and periods as dummies. Here is my current code:
        Code:
        program bs_sureg, eclass
            xi: reg DV $IVs $controls i.year i.Company, vce(robust)
            
            matrix b1=e(b)
            xi: reg M $IVs $controls i.year i.Company, vce(robust)
            
            matrix b2=e(b)
            xi: reg DV $IVs M $controls i.year i.Company, vce(robust)
            
            matrix b3=e(b)
            matrix coleq b1=total
            matrix coleq b2=a_coeff
            matrix coleq b3=b_coeff
            matrix b_boot = b1,b2,b3
            ereturn post b_boot
        end
        
        bootstrap, reps(1000) nodots:bs_sureg
        How can I save the results of each bootstrap replication, and from there how would I test whether the direct and indirect effects are significant and how much of the total effect each explains? I tried
        Code:
        putexcel set bootstrap.xlsx, sheet(bootstrap) replace
        putexcel A1 = matrix(b1,b2,b3)
        But that resulted in a really messed-up Excel sheet.



        • #5
          Dear Tim,

          To my knowledge, there are two approaches.

          In the first approach, you would estimate both models simultaneously, after which you test the difference in coefficients. You could do that by running -test- after your bootstrap program.

          An alternative way would be to "stack" the models, which is faster because you don't need the bootstrap. There is a nice Stata FAQ about this as well as an older comment on the list, do check them out. Here's an example:
          Code:
          expand 2, gen(copied)
          foreach x in x1 x2 x3 mediator {
              gen `x'_base = `x'*(copied==0)
              gen `x'_full = `x'*(copied==1)
          }
          egen newid = group(oldid copied)
          xtset newid
          xtreg y x1_base x2_base x3_base x1_full x2_full x3_full mediator_full, fe vce(cluster oldid)
          test (x1_base-x1_full=0)

          In the second approach, you would estimate the difference between the coefficients itself. That would look something like this:

          Code:
          program define bs_difference, rclass
              xtset newid
              xtreg y x1 x2 x3, fe
              scalar coef_base = _b[x1]
              xtreg y x1 x2 x3 mediator, fe
              scalar coef_full = _b[x1]
              return scalar coef_diff = (coef_base - coef_full)
          end
          bootstrap r(coef_diff), reps(1000) cluster(oldid) idcluster(newid): bs_difference

          I don't have enough background in statistics to say which approach is better. My guess is that, with sufficient case numbers, the difference is minor.
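
          The same rclass pattern might also cover the question in #4 about keeping each replication and testing the indirect effect directly. What follows is only a minimal sketch on the auto data used in #3, not a tested solution for the panel model: price, weight, and mpg are hypothetical stand-ins for the DV, IV, and mediator, and bs_indirect and bs_draws are made-up names. For the fixed effects case one would presumably swap regress for xtreg, fe and add cluster() and idcluster() as in the program above.

          ```stata
          clear all
          sysuse auto

          * Hypothetical stand-ins: price = DV, weight = IV, mpg = mediator
          program define bs_indirect, rclass
              tempname a
              regress mpg weight headroom           // path a: mediator on IV
              scalar `a' = _b[weight]
              regress price mpg weight headroom     // path b (mpg) and direct effect (weight)
              return scalar indirect = `a' * _b[mpg]
              return scalar direct   = _b[weight]
              return scalar total    = `a' * _b[mpg] + _b[weight]
          end

          * saving() writes one row per replication to bs_draws.dta
          bootstrap indirect=r(indirect) direct=r(direct) total=r(total), ///
              reps(1000) seed(12345) nodots saving(bs_draws, replace): bs_indirect

          * Percentile and bias-corrected confidence intervals for each effect
          estat bootstrap, percentile bc
          ```

          Opening the saved file afterwards (use bs_draws, clear) should give one observation per replication with the variables indirect, direct, and total, which avoids exporting the full coefficient matrix with all the dummies. A confidence interval for the indirect effect that excludes zero is the usual bootstrap criterion for mediation.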
          Last edited by Bram Hogendoorn; 07 Jan 2021, 04:21.
