Structural equation model (SEM): degrees of freedom and bootstrapping

Kai-Yuan Cheng

Join Date: Apr 2017

Posts: 10
#1


22 Dec 2019, 12:20

Dear Statalist,

First of all, Happy holidays to the people on this forum!
I have 2 questions concerning structural equation modelling (SEM) using Stata SE 14 on Mac OS 10.13.

The first question is basically "how do I calculate the degrees of freedom (DoF) for an SEM model?". I am aware that the definition of DoF is:
number of pieces of information ( k(k+1)/2, where k is the number of variables ) minus the number of parameters one wishes to estimate.

However, this formula does not seem to work for my case, where I aim to run a very simple mediation model to test if loneliness mediates the association between stigma and depression in my sample (n=350) using the command:

Code:

sem (lonely -> depress, ) (stigma -> depress, ) (stigma -> lonely, ), nocapslatent

There are three variables here (k=3), so the number of pieces of information should be 6 (3(3+1)/2). I am only estimating 5 parameters (the 3 paths shown in the sem command and 2 error variances), which should give 1 DoF. Why do the results show 0 degrees of freedom?

The second question concerns bootstrap failures. Here I used another dataset (n=120, no missing values) to test the same mediation effect mentioned above. As can be seen in my code below, a measurement component is now included: stigma is represented by a latent variable with 3 indicators. I have also adjusted for employment and education level.

Code:

. sem (latentstigma -> gih_m, ) (latentstigma -> lih_r, ) (latentstigma -> lih_atol, ) (latentstigma loneliness employment education -> depression, ) (latentstigma employment education -> loneliness, ), latent(latentstigma ) nocapslatent vce(bootstrap, reps(10) seed(1234))
(running sem on estimation sample)

Bootstrap replications (10)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.x.x..xx..

Structural equation model                       Number of obs     =        120
Log likelihood     = -1926.6482                 Replications      =          6

The results showed a high number of bootstrap failures (4 out of 10). The command also took unusually long to run compared to my other, more complex, sem models (approx. 30 minutes for just 10 reps).
There have been previous reports of bootstrap failures on this forum, where the -noisily- option was suggested to diagnose the bootstrap execution. I have done so with the following code:

Code:

program bootsem1, rclass
    sem (latentstigma -> gih_m, ) (latentstigma -> lih_r, ) (latentstigma -> lih_atol, ) ///
        (latentstigma loneliness employment education -> depression, ) ///
        (latentstigma employment education -> loneliness, ), ///
        latent(latentstigma ) nocapslatent
    estat teffects, compact
    mat ind = r(indirect)
    mat dir = r(direct)
    mat tot = r(total)
    return scalar ind = ind[1,2]
    return scalar dirih = dir[1,2]
    return scalar dirlonely = dir[1,1]
    return scalar tot = tot[1,2]
end

set seed 1234
bootstrap r(ind) r(dirih) r(dirlonely) r(tot), noisily reps(10) : bootsem1

The results suggested the failed bootstraps ran more than 15,000 iterations and yielded the error message:

Code:

Convergence not achieved
an error occurred when bootstrap executed bootsem1, posting missing values

These bootstraps are also the main reason it took so long.
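Because the failed replicates are the ones that run to the iteration limit, one way to shorten the run is to cap the number of iterations so that they fail quickly. The following is only a sketch under assumptions: the program name bootsem2 and the cap of 200 iterations are arbitrary choices for illustration, only the indirect effect is returned to keep it short, and the cap does not explain or fix why those resamples fail to converge.

```stata
* Sketch: same model as bootsem1, plus an iteration cap.
* bootsem2 and the value 200 are illustrative assumptions, not recommendations.
capture program drop bootsem2
program define bootsem2, rclass
    sem (latentstigma -> gih_m, ) (latentstigma -> lih_r, ) (latentstigma -> lih_atol, ) ///
        (latentstigma loneliness employment education -> depression, ) ///
        (latentstigma employment education -> loneliness, ), ///
        latent(latentstigma ) nocapslatent iterate(200)
    * Treat a replicate that stopped without converging as a failure,
    * so that bootstrap posts missing values for it.
    if e(converged) != 1 {
        exit 430
    }
    estat teffects, compact
    matrix ind = r(indirect)
    return scalar ind = ind[1,2]
end

set seed 1234
bootstrap r(ind), noisily reps(10) : bootsem2
```

If the same replicates still fail under the cap, the time per failure should at least drop, which makes it more practical to raise reps() and inspect the failing resamples.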
As mentioned, this dataset has no missing values, so I am not sure how the failures came about and would very much like to know what you think may have gone wrong.
I hope the above questions have been presented clearly and in the correct format.
Please kindly let me know if I can provide any additional information.
Any and all help is very deeply appreciated. Thank you in advance.

Kai-Yuan
Brian Flaherty

Join Date: Jun 2017

Posts: 11
#2

22 Dec 2019, 14:35

Happy holidays and new year to you too. I have not run SEM in Stata yet, but I can address your first question. You have zero degrees of freedom because you are estimating 2 error variances (the mediator and outcome), as well as one unconditional variance (the exogenous predictor). Thus your model is saturated, unless you remove a path or have information to somehow impose a constraint on one or more variances (conditional or unconditional).
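The count behind this can be written out explicitly. This is a sketch of the bookkeeping for the three-variable model in the first post; the symbol k follows the original question, and the grouping of parameters is just a labelling used here.

```latex
\begin{align*}
\text{information} &= \frac{k(k+1)}{2} = \frac{3(3+1)}{2} = 6 \\
\text{parameters} &= \underbrace{3}_{\text{paths}}
                   + \underbrace{2}_{\text{error variances}}
                   + \underbrace{1}_{\text{variance of stigma}} = 6 \\
\text{df} &= 6 - 6 = 0
\end{align*}
```

If the means are modelled as well, the 3 observed means are added to the information and 3 parameters (2 intercepts and the mean of stigma) are added to the model, so the count becomes 9 - 9 = 0 and the conclusion does not change.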
Good luck,
Brian

Kai-Yuan Cheng

Join Date: Apr 2017
Posts: 10
#3

23 Dec 2019, 05:09

Hi Brian,

Thank you very much for your reply. However, I am not sure I fully understood, and it would be very kind of you to elaborate.
Isn't the unconditional variance of the exogenous variable simply the variable's own variance, which is one piece of known information from the data (as far as I understand, the information from the data consists of the variances of the variables and the covariances between them), rather than something the model needs to estimate?
Furthermore, reading the outputs from Stata, I have trouble seeing how and where the unconditional variance is being estimated:

Code:

Endogenous variables

Observed:  loneliness depression

Exogenous variables

Observed:  stigma

Fitting target model:

Iteration 0:   log likelihood =  -1313.622  
Iteration 1:   log likelihood =  -1313.622  

Structural equation model                       Number of obs     =        120
Estimation method  = ml
Log likelihood     =  -1313.622

----------------------------------------------------------------------------------
                 |                 OIM
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
Structural       |
  lonely     <-  |
          stigma |   .1884954   .0862113     2.19   0.029     .0195244    .3574664
           _cons |   33.01353   3.380739     9.77   0.000     26.38741    39.63966
  ---------------+----------------------------------------------------------------
  depression <-  |
      lonely     |   .5583737   .0644325     8.67   0.000     .4320883    .6846591
          stigma |   .1156213   .0620501     1.86   0.062    -.0059947    .2372373
           _cons |  -17.04876   3.196672    -5.33   0.000    -23.31412   -10.78339
-----------------+----------------------------------------------------------------
    var(e.lonely)|   105.0193   13.55794                      81.54168    135.2568
   var(e.depress)|   52.31918   6.754377                      40.62293    67.38303
----------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(0)   =      0.00, Prob > chi2 =      .

Again, thank you for your help!

Kai-Yuan

Last edited by Kai-Yuan Cheng; 23 Dec 2019, 05:22.


Brian Flaherty

Join Date: Jun 2017

Posts: 11
#4

07 Jan 2020, 17:38

Hello Kai-Yuan,
Sorry for the delay in replying. Yes, the variance of the exogenous variable is often simply its unconditional variance. But it is still estimated, and it is still an element of the observed covariance matrix that the model reproduces. Let's say your exogenous variable was a treatment/control dummy variable with equal n's per group. In that case, the exogenous variance could be fixed to the known quantity, thereby saving a df. Hope this helps.
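On where this variance shows up in Stata: if I read the sem documentation correctly, sem by default conditions on observed exogenous variables and does not list their means and variances in the output, and the noxconditional option asks for them to be included among the estimated parameters. The sketch below assumes the variable names from the first post (stigma, lonely, depress) and a hypothetical balanced 0/1 dummy called treat that is not part of the original data.

```stata
* Sketch 1: list the mean and variance of the observed exogenous variable
* (stigma) among the estimated parameters.
sem (stigma -> lonely) (lonely stigma -> depress), noxconditional

* Sketch 2: hypothetical balanced 0/1 dummy treat (an assumption for illustration).
* With p = 0.5 its variance is p*(1-p) = 0.25, so it can be fixed
* rather than estimated, which frees one degree of freedom.
sem (treat -> lonely) (lonely treat -> depress), noxconditional variance(treat@0.25)
```

Comparing the chi2 degrees of freedom reported at the bottom of the two outputs should show whether the constraint was applied as intended.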
Brian
Kai-Yuan Cheng

Join Date: Apr 2017

Posts: 10
#5

30 Apr 2020, 09:18

Hi Brian,

I hope you have been well during this difficult time.
I am sorry for leaving this post unattended for some time. After a few days of inactivity, I assumed that it would not receive a reply and stopped checking it.
Thank you very much for helping with my follow-up question; revisiting this topic now, I am better able to understand your explanations!

All the best,
Kai-Yuan