  • Multicollinearity and Exploratory Structural Equation Modelling (ESEM) in Stata

    Hi Stata community,

    I'm using SEM to investigate young people’s wellbeing. The data are self-reported and wellbeing is measured multidimensionally via a pre-validated instrument. Wellbeing is operationalised using 20 items measuring four subconstructs (Interpersonal, Life Satisfaction, Negative and Eudaimonic). My research seeks to quantify each subconstruct's relative association with my outcome variable. At present I'm running EFA to build a model that best represents the four wellbeing subconstructs as distinct but related.

    Potential multicollinearity between the four subconstructs led me to consider Exploratory Structural Equation Modelling (ESEM), which is now widely used in my field. However, Prof. Ender's materials here are the only resource I've been able to find on how to conduct ESEM in Stata. To gauge the extent of multicollinearity in my data, I examined VIFs, bivariate correlations between variables, and AVEs, and found some evidence that it may be problematic (output below). However, when I build the model up step by step into the full SEM (i.e., Model 1 with only one wellbeing subconstruct, Model 2 with two wellbeing subconstructs, etc.), follow-up comparisons of the path coefficients and SEs across these models don't suggest that multicollinearity is changing the estimates too drastically (output is again below for the Stata community to scrutinise).
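
    For anyone checking the diagnostics below, this is the relation I am assuming between the VIF, the tolerance (reported as 1/VIF) and the R-squared from regressing each predictor on the others. The worked values use the R-squared column of the -collin- output for the factor scores. The commonly cited cut-off of VIF > 10 (tolerance < 0.10) is only a rule of thumb.

    ```latex
    \[
    \mathrm{VIF}_j = \frac{1}{1 - R_j^2},
    \qquad
    \mathrm{tolerance}_j = 1 - R_j^2 = \frac{1}{\mathrm{VIF}_j}
    \]
    % Worked check with the factor-score diagnostics reported in the post
    \[
    \mathrm{VIF}_{\text{Lifesat}} = \frac{1}{1 - 0.9249} = \frac{1}{0.0751} \approx 13.3,
    \qquad
    \mathrm{VIF}_{\text{Negative}} = \frac{1}{1 - 0.0293} = \frac{1}{0.9707} \approx 1.03
    \]
    ```

    Read this way, a VIF of about 13 for Lifesat means roughly 92% of its variance is shared with the other three factor scores.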

    My questions are:
    1. How widely is ESEM used in Stata in the way proposed by Prof. Ender? I've managed to replicate his approach with my data, but I'm puzzled that so few resources on ESEM in Stata are available. I also wonder whether the added complexity will make it hard to combine ESEM with the structural part of my models.
    2. When exploring multicollinearity in the context of SEM, should sum scores, factor scores or individual items be scrutinised when it comes to looking at VIFs, correlations and AVEs?
    3. Should the evidence of multicollinearity between my variables (output below) give me cause for concern when it comes to entering all four of these latent exogenous variables into a SEM together?
    Please note: my RQ cannot be answered using a second-order model (i.e., with one general wellbeing factor predicting my outcome), because I'm trying to estimate the effects of the four subconstructs simultaneously.
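
    For context on question 3, this is the AVE comparison I was trying to obtain (the Fornell-Larcker criterion, as I understand it). It is a sketch that assumes standardized loadings \lambda_{ik} for the p_k items of factor k and uncorrelated item errors. I do not have the AVE values themselves, because -condisc- stopped with r(109), so only the squared-correlation side of the comparison is filled in from the output below.

    ```latex
    \[
    \mathrm{AVE}_k = \frac{1}{p_k} \sum_{i=1}^{p_k} \lambda_{ik}^2,
    \qquad
    \text{discriminant validity for factor } k:\quad
    \mathrm{AVE}_k > \max_{l \neq k} r_{kl}^2
    \]
    % Largest squared latent correlation in the output: Eudaimonic with Lifesat
    \[
    r_{\text{Eud},\text{Life}}^2 = 0.8613^2 \approx 0.742
    \]
    ```

    Under this criterion, Eudaimonic and Lifesat would each need an AVE above about 0.74 to count as empirically distinct from one another.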

    Example of my data
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(wbs1 wbs4 wbs8 wbs10 wbs18 wbs5 wbs13 wbs17 wbs19 wbs6 wbs7 wbs14 wbs20 wbs21 wbs2 wbs3 wbs9 wbs15)
    2 4 4 4 3 3 3 3 3 3 2 3 2 2 2 2 2 5
    1 5 5 5 5 1 1 1 1 2 1 5 1 1 1 1 2 1
    3 2 4 4 3 2 4 3 2 3 3 4 3 3 2 3 2 2
    2 5 4 4 3 3 1 2 2 3 3 3 2 4 3 2 3 2
    1 5 5 5 4 1 1 2 1 2 1 1 1 1 1 2 2 1
    4 3 5 5 3 3 3 2 4 3 4 3 5 2 2 4 4 4
    5 2 2 3 1 5 3 4 4 3 2 4 4 3 5 4 4 4
    2 4 5 4 5 3 2 2 3 2 2 2 2 2 3 2 2 2
    3 3 5 5 3 4 1 2 3 2 2 4 3 2 3 2 2 3
    1 5 5 5 5 1 1 1 1 1 1 1 1 1 1 1 1 1
    2 4 5 5 1 2 2 2 2 3 2 2 2 3 4 2 4 2
    3 3 4 4 3 3 4 4 4 3 3 4 4 4 3 4 4 4
    3 3 4 2 4 2 3 3 3 2 1 4 3 2 3 4 3 3
    3 4 5 3 3 3 3 2 3 2 3 4 2 4 5 3 3 3
    4 2 3 3 3 3 4 3 4 4 4 4 4 4 4 4 4 4
    1 4 4 3 5 3 1 2 2 3 1 2 2 2 2 1 1 2
    4 2 4 3 3 3 3 3 3 4 4 3 3 3 4 4 3 3
    3 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3
    1 4 5 5 4 4 1 2 1 2 1 2 3 1 2 2 2 1
    . . . . . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . . . .
    end
    label def WB 1 "Never", modify
    label def WB 2 "Not Often", modify
    label def WB 3 "Sometimes", modify
    label def WB 4 "Often", modify
    label def WB 5 "Always", modify
    Evidence of multicollinearity
    Code:
     
    **(1) Check VIFs for all factors in dataset. VIFs > 10 (i.e., tolerance 1/VIF < .10) suggest a problem **
    ** Using sumscores of the wellbeing subconstructs**
    
    . regress outcomevariable wbsint_sum2 wbseud_sum2 wbslife_sum2 wbsneg_sum2
    
          Source |       SS           df       MS      Number of obs   =       875
    -------------+----------------------------------   F(4, 870)       =     12.97
           Model |  162.409346         4  40.6023365   Prob > F        =    0.0000
        Residual |  2722.69465       870  3.12953409   R-squared       =    0.0563
    -------------+----------------------------------   Adj R-squared   =    0.0520
           Total |    2885.104       874  3.30103432   Root MSE        =     1.769
    
    ------------------------------------------------------------------------------
    outcomevariable~h | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
     wbsint_sum2 |  -.0289679   .0233199    -1.24   0.214    -.0747377    .0168019
     wbseud_sum2 |   .1358681   .0242438     5.60   0.000     .0882849    .1834513
    wbslife_sum2 |  -.0358798   .0248797    -1.44   0.150    -.0847111    .0129515
     wbsneg_sum2 |   -.002507   .0261081    -0.10   0.924    -.0537493    .0487352
           _cons |   3.981834   .6141696     6.48   0.000     2.776406    5.187261
    ------------------------------------------------------------------------------
    
    . vif
    
        Variable |       VIF       1/VIF  
    -------------+----------------------
     wbseud_sum2 |      3.20    0.312717
    wbslife_sum2 |      2.94    0.340589
     wbsint_sum2 |      2.53    0.395344
     wbsneg_sum2 |      1.92    0.520799
    -------------+----------------------
        Mean VIF |      2.65
    
    . 
    ** Using factor scores of wellbeing subconstructs 
    . regress outcomevariable Eudaimonic Interpersonal Lifesat Negative
    
          Source |       SS           df       MS      Number of obs   =       917
    -------------+----------------------------------   F(4, 912)       =     12.90
           Model |  163.757132         4  40.9392829   Prob > F        =    0.0000
        Residual |  2893.38791       912  3.17257446   R-squared       =    0.0536
    -------------+----------------------------------   Adj R-squared   =    0.0494
           Total |  3057.14504       916  3.33749458   Root MSE        =    1.7812
    
    -------------------------------------------------------------------------------
    outcomevariable | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    --------------+----------------------------------------------------------------
       Eudaimonic |   .9078153    .190283     4.77   0.000     .5343718    1.281259
    Interpersonal |  -.4550978   .1775992    -2.56   0.011    -.8036483   -.1065472
          Lifesat |    -.29361    .266601    -1.10   0.271    -.8168328    .2296127
         Negative |  -.1445177   .0541875    -2.67   0.008    -.2508643   -.0381711
            _cons |   4.854106   .2617944    18.54   0.000     4.340317    5.367896
    -------------------------------------------------------------------------------
    
    . vif
    
        Variable |       VIF       1/VIF  
    -------------+----------------------
         Lifesat |     13.09    0.076418
      Eudaimonic |     10.95    0.091284
    Interperso~l |      9.01    0.110954
        Negative |      1.03    0.967547
    -------------+----------------------
        Mean VIF |      8.52
    
    . 
    *Using individual items
    . regress outcomevariable wbs1 wbs2 wbs3 wbs9 wbs15 wbs6 wbs7 wbs11 wbs14 wbs21 wbs5 wbs13 wbs17 wbs19 wbs20 
    
          Source |       SS           df       MS      Number of obs   =       875
    -------------+----------------------------------   F(15, 859)      =      7.11
           Model |  318.516808        15  21.2344539   Prob > F        =    0.0000
        Residual |  2566.58719       859  2.98787799   R-squared       =    0.1104
    -------------+----------------------------------   Adj R-squared   =    0.0949
           Total |    2885.104       874  3.30103432   Root MSE        =    1.7285
    
    ------------------------------------------------------------------------------
    outcomevariable~h | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            wbs1 |   .0059105   .1001725     0.06   0.953    -.1907011    .2025221
            wbs2 |    .093906     .08118     1.16   0.248    -.0654283    .2532404
            wbs3 |   .3839984   .0871796     4.40   0.000     .2128884    .5551084
            wbs9 |   .3026809   .0830548     3.64   0.000     .1396668    .4656949
           wbs15 |  -.0773907   .0832107    -0.93   0.353    -.2407107    .0859294
            wbs6 |  -.2069086   .0983767    -2.10   0.036    -.3999954   -.0138218
            wbs7 |  -.0988398    .098042    -1.01   0.314    -.2912697      .09359
           wbs11 |  -.1612017   .0884901    -1.82   0.069    -.3348838    .0124805
           wbs14 |   .0466253   .0756693     0.62   0.538     -.101893    .1951437
           wbs21 |   .1308543   .0767839     1.70   0.089    -.0198518    .2815604
            wbs5 |  -.2193482    .079129    -2.77   0.006    -.3746569   -.0640395
           wbs13 |  -.0558144    .078205    -0.71   0.476    -.2093097    .0976808
           wbs17 |   .0219586   .0921006     0.24   0.812      -.15881    .2027272
           wbs19 |  -.0667771   .0954528    -0.70   0.484    -.2541252     .120571
           wbs20 |   .2736249   .0985175     2.78   0.006     .0802617     .466988
           _cons |   3.773488   .2565241    14.71   0.000         3.27    4.276975
    ------------------------------------------------------------------------------
    
    . vif
    
        Variable |       VIF       1/VIF  
    -------------+----------------------
            wbs1 |      3.16    0.316891
            wbs7 |      3.04    0.328607
            wbs6 |      2.87    0.348388
           wbs20 |      2.79    0.358511
            wbs3 |      2.55    0.392300
           wbs19 |      2.49    0.401982
           wbs15 |      2.49    0.402062
           wbs17 |      2.29    0.437160
           wbs13 |      2.21    0.452178
            wbs2 |      2.16    0.462730
            wbs9 |      2.03    0.492234
           wbs11 |      1.98    0.504404
           wbs21 |      1.88    0.532478
            wbs5 |      1.84    0.544028
           wbs14 |      1.82    0.549123
    -------------+----------------------
        Mean VIF |      2.37
    
    . 
    collin wbsint_sum2 wbseud_sum2 wbslife_sum2 wbsneg_sum2
    (obs=897)
    
      Collinearity Diagnostics
    
                            SQRT                   R-
      Variable      VIF     VIF    Tolerance    Squared
    ----------------------------------------------------
    wbsint_sum2      2.52    1.59    0.3964      0.6036
    wbseud_sum2      3.15    1.77    0.3175      0.6825
    wbslife_sum2      2.96    1.72    0.3377      0.6623
    wbsneg_sum2      1.91    1.38    0.5247      0.4753
    ----------------------------------------------------
      Mean VIF      2.63
    
                               Cond
            Eigenval          Index
    ---------------------------------
        1     4.7957          1.0000
        2     0.1590          5.4926
        3     0.0199         15.5422
        4     0.0186         16.0768
        5     0.0070         26.2302
    ---------------------------------
     Condition Number        26.2302 
     Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
     Det(correlation matrix)    0.0867
    
    . 
    collin Eudaimonic Interpersonal Lifesat Negative 
    (obs=942)
    
      Collinearity Diagnostics
    
                            SQRT                   R-
      Variable      VIF     VIF    Tolerance    Squared
    ----------------------------------------------------
    Eudaimonic     10.89    3.30    0.0918      0.9082
    Interpersonal      9.11    3.02    0.1097      0.8903
       Lifesat     13.31    3.65    0.0751      0.9249
      Negative      1.03    1.01    0.9707      0.0293
    ----------------------------------------------------
      Mean VIF      8.59
    
                               Cond
            Eigenval          Index
    ---------------------------------
        1     4.7737          1.0000
        2     0.1738          5.2402
        3     0.0384         11.1434
        4     0.0086         23.5783
        5     0.0055         29.5547
    ---------------------------------
     Condition Number        29.5547 
     Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
     Det(correlation matrix)    0.0111
    
    . 
    ** (2) How are the factors correlated? [estat common command, run after EFA]
    . 
    ** Eudaimonic*Interpersonal = .42, Eudaimonic*Lifesat = .45, Eudaimonic*Negative = -.35
    . ** Interpersonal*Lifesat = .49, Interpersonal*Negative = -.26
    . ** Lifesat*Negative = -.33
    . 
    ** Correlation between sum scores 
    . pwcorr wbsint_sum2 wbseud_sum2 wbslife_sum2 wbsneg_sum2, sig star(0.05)
    
                 | wbsint~2 wbseud~2 wbslif~2 wbsneg~2
    -------------+------------------------------------
     wbsint_sum2 |   1.0000 
                 |
                 |
     wbseud_sum2 |   0.7248*  1.0000 
                 |   0.0000
                 |
    wbslife_sum2 |   0.7304*  0.7664*  1.0000 
                 |   0.0000   0.0000
                 |
     wbsneg_sum2 |  -0.5741* -0.6641* -0.6183*  1.0000 
                 |   0.0000   0.0000   0.0000
                 |
    
    ** Correlation between factor scores 
    . pwcorr Eudaimonic Interpersonal Lifesat, sig star(0.05)
    
                 | Eudaim~c Interp~l  Lifesat
    -------------+---------------------------
      Eudaimonic |   1.0000 
                 |
                 |
    Interperso~l |   0.9206*  1.0000 
                 |   0.0000
                 |
         Lifesat |   0.9474*  0.9368*  1.0000 
                 |   0.0000   0.0000
                 |
    
    ** Correlation between individual items
    . pwcorr wbs1 wbs2 wbs3 wbs9 wbs15 wbs6 wbs7 wbs11 wbs14 wbs21 wbs5 wbs13 wbs17 wbs19 wbs20, sig star(0.05)
    
                 |     wbs1     wbs2     wbs3     wbs9    wbs15     wbs6     wbs7
    -------------+---------------------------------------------------------------
            wbs1 |   1.0000 
                 |
                 |
            wbs2 |   0.6764*  1.0000 
                 |   0.0000
                 |
            wbs3 |   0.7035*  0.6051*  1.0000 
                 |   0.0000   0.0000
                 |
            wbs9 |   0.5713*  0.5334*  0.6235*  1.0000 
                 |   0.0000   0.0000   0.0000
                 |
           wbs15 |   0.6973*  0.5813*  0.6441*  0.5725*  1.0000 
                 |   0.0000   0.0000   0.0000   0.0000
                 |
            wbs6 |   0.5458*  0.5208*  0.5091*  0.4724*  0.4863*  1.0000 
                 |   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
            wbs7 |   0.6242*  0.5624*  0.5780*  0.4985*  0.5582*  0.7516*  1.0000 
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs11 |   0.5396*  0.4928*  0.4923*  0.5210*  0.5197*  0.5714*  0.5925*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs14 |   0.4532*  0.4397*  0.4382*  0.4426*  0.4388*  0.5454*  0.4681*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs21 |   0.4118*  0.4084*  0.4125*  0.3987*  0.3871*  0.5836*  0.5578*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
            wbs5 |   0.5286*  0.4962*  0.4735*  0.4370*  0.4864*  0.4935*  0.5113*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs13 |   0.5548*  0.4799*  0.5111*  0.4934*  0.5427*  0.5100*  0.5396*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs17 |   0.5248*  0.4694*  0.5115*  0.4646*  0.4958*  0.4940*  0.5123*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs19 |   0.5476*  0.5085*  0.5384*  0.5186*  0.5574*  0.5188*  0.5378*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs20 |   0.6407*  0.5552*  0.5755*  0.5575*  0.6139*  0.6026*  0.5939*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
    
                 |    wbs11    wbs14    wbs21     wbs5    wbs13    wbs17    wbs19
    -------------+---------------------------------------------------------------
           wbs11 |   1.0000 
                 |
                 |
           wbs14 |   0.4462*  1.0000 
                 |   0.0000
                 |
           wbs21 |   0.4601*  0.5494*  1.0000 
                 |   0.0000   0.0000
                 |
            wbs5 |   0.4231*  0.3201*  0.3563*  1.0000 
                 |   0.0000   0.0000   0.0000
                 |
           wbs13 |   0.5202*  0.4238*  0.4253*  0.4630*  1.0000 
                 |   0.0000   0.0000   0.0000   0.0000
                 |
           wbs17 |   0.4878*  0.3967*  0.4295*  0.5101*  0.6532*  1.0000 
                 |   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs19 |   0.4864*  0.3686*  0.4463*  0.5971*  0.5770*  0.6331*  1.0000 
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
           wbs20 |   0.5606*  0.5077*  0.5242*  0.5419*  0.6112*  0.6289*  0.6593*
                 |   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
                 |
    
    ** (3) (Check the Average Variance Extracted in the measurement models )
    . 
    ** type 'condisc' after running SEM to assess the average variance extracted (AVE) **
    . sem (Eudaimonic -> wbs1@1 wbs2 wbs3 wbs9 wbs15)(Interpersonal -> wbs6@1 wbs7 wbs11 wbs14 wbs21)(Lifesat -> wbs5@1 wbs13 wbs17 wbs19 wbs20)(Negative -> wbs4 wbs8 wbs10), latent(Eudaimonic Interpersonal Lifesat Negative) stand
    
    Endogenous variables
      Measurement: wbs1 wbs2 wbs3 wbs9 wbs15 wbs6 wbs7 wbs11 wbs14 wbs21 wbs5 wbs13 wbs17 wbs19 wbs20 wbs4 wbs8 wbs10
    
    Exogenous variables
      Latent: Eudaimonic Interpersonal Lifesat Negative
    
    [Full output omitted] 
    
    -----------------------------+----------------------------------------------------------------
    cov(Eudaimonic,Interpersonal)|   .8145113    .016024    50.83   0.000     .7831049    .8459178
          cov(Eudaimonic,Lifesat)|   .8613038    .013381    64.37   0.000     .8350775      .88753
         cov(Eudaimonic,Negative)|  -.7648549   .0235044   -32.54   0.000    -.8109226   -.7187872
       cov(Interpersonal,Lifesat)|   .8369258    .015206    55.04   0.000     .8071225     .866729
      cov(Interpersonal,Negative)|  -.6530577    .028078   -23.26   0.000    -.7080895   -.5980259
            cov(Lifesat,Negative)|  -.7470133   .0242314   -30.83   0.000     -.794506   -.6995207
    ----------------------------------------------------------------------------------------------
    LR test of model vs. saturated: chi2(129) = 562.28                        Prob > chi2 = 0.0000
    
    . condisc 
    
                      Convergent and Discriminant Validity Assessment
    ------------------------------------------------------------------------------------------
    Squared  correlations (SC) among  latent  variables              
    ------------------------------------------------------------------------------------------
    
                    Eudaimonic  Interperso~l       Lifesat      Negative
      Eudaimonic         1.000
    Interperso~l         0.663         1.000
         Lifesat         0.742         0.700         1.000
        Negative         0.585         0.426         0.558         1.000
    
    ------------------------------------------------------------------------------------------
    Average variance extracted (AVE) by latent variables               
    ------------------------------------------------------------------------------------------
    type mismatch
    r(109);
    
    end of do-file
    Model comparisons (path coefficients and SEs in the full SEM models). NB: each column is a model with a different set of wellbeing subconstructs; the full four-factor SEM is last (right-hand side).
    Code:
    estimates table eud eudint interp lifesat neg eudintlife eudintlifeneg, se
    
    ---------------------------------------------------------------------------------------------------------
        Variable |    eud         eudint       interp      lifesat        neg       eudintlife   eudintli~g  
    -------------+-------------------------------------------------------------------------------------------
    outcomevariable~h 
      Eudaimonic |  .36133661    .75222035                                           .85003712    .86322774  
                 |  .06790982     .1384507                                           .18293653    .19670361  
    Interperso~l |              -.48209672    .11152458                              -.4168596   -.42507065  
                 |                .1372821     .0661019                              .15463278    .15633722  
         Lifesat |                                         .25266663                -.21519739   -.16072192  
                 |                                         .08652639                 .23852414    .25306004  
        Negative |                                                     -.22298494                 .07071995  
                 |                                                      .07563153                  .1527385
    Many thanks in advance for your time and expertise.

    Kind regards,
    Tania
