  • sem: "too many latent variables" and r(503) error when fitting random intercept cross-lagged panel model

    Hi Everybody,

    We are having difficulty using sem to replicate a random intercept cross-lagged panel model as shown in the diagram.

    RI-CLPM.png


    Fitting this model in Mplus is documented at https://www.statmodel.com/download/R...er%20input.pdf and I believe we've translated it accurately as follows:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    * Only 20 of 400 lines of data in original
    clear
    input float(ID X1 Y1 X2 Y2 X3 Y3 X4 Y4)
     1  7.556  8.836  7.237  8.976  7.813 10.573  8.654  9.791
     2  9.832  6.074 12.033  6.974 13.843  6.675 13.552   6.78
     3 11.574  5.543  12.21  6.066 13.818  5.607 12.461  3.904
     4  9.572  5.765  9.913  6.527  9.819  5.389 12.102   7.05
     5  6.699 11.792  8.008 12.095  8.858 10.958  8.385 11.479
     6 10.202  7.068 11.745  7.615  11.55  7.764  10.39   6.65
     7 12.905  6.635 11.714  5.858 12.443   7.33  13.34  4.719
     8  9.808  7.467   8.74  6.855  7.468  6.406 10.162   7.39
     9 10.084  5.412  9.812  5.964  8.625  5.267  8.519  6.744
    10 14.275  7.618 14.877  7.326 16.421  6.721 14.217  7.913
    11 11.509  7.906 11.375  6.923  9.139  5.764  8.821  7.508
    12  10.13  5.101 10.383  5.624  8.128  3.421  8.869  4.957
    13 11.571  6.774 12.731  5.784 11.737  3.953   9.78  4.553
    14 11.362   7.32  11.18  7.379  7.747  6.044 10.846  6.971
    15 10.271  9.462  9.797  8.513  9.025 10.731  9.755  9.987
    16   10.9  7.941 10.187   7.31  10.43  7.235  9.025  6.642
    17  9.928  8.421   9.55  9.777  9.393  9.998  9.988  9.161
    18 10.519 11.779 11.605 10.345 10.696 10.892  9.438  9.804
    19  7.707  8.072   8.15  6.945  9.852  8.658 10.089  8.924
    20  9.638  7.312 11.258   8.58 10.531  7.583 10.719  8.345
    end
    
    sem /// 
    (cX1@1 -> X1) (cX2@1 -> X2) (cX3@1 -> X3) (cX4@1 -> X4) /// Person-centred X
    (cY1@1 -> Y1) (cY2@1 -> Y2) (cY3@1 -> Y3) (cY4@1 -> Y4) /// Person-centred Y
    (RIx@1 -> X1) (RIx@1 -> X2) (RIx@1 -> X3) (RIx@1 -> X4) /// Random intercept X
    (RIy@1 -> Y1) (RIy@1 -> Y2) (RIy@1 -> Y3) (RIy@1 -> Y4) /// Random intercept Y
     ///
    (cX1 -> cX2) (cX2 -> cX3) (cX3 -> cX4) /// X autoregressive
    (cY1 -> cY2) (cY2 -> cY3) (cY3 -> cY4) /// Y autoregressive
    (cX1 -> cY2) (cX2 -> cY3) (cX3 -> cY4) /// X->Y cross-lagged paths
    (cY1 -> cX2) (cY2 -> cX3) (cY3 -> cX4) /// Y->X cross-lagged paths
    , covstruct(_lexogenous, diagonal) nocapslatent  ///
      latent(cX1 cX2 cX3 cX4 cY1 cY2 cY3 cY4 RIx RIy ) ///
      cov(RIx*cX1@0 RIx*cY1@0 RIy*cX1@0 RIy*cY1@0) /// 
      cov(e.Y1@0 e.Y2@0 e.Y3@0 e.Y4@0 e.X1@0 e.X2@0 e.X3@0 e.X4@0) ///
      cov(RIx*RIy cX1*cY1 e.cX2*e.cY2 e.cX3*e.cY3 e.cX4*e.cY4 )
    The only output generated is:

    Code:
    Endogenous variables
    Measurement:  X1 X2 X3 X4 Y1 Y2 Y3 Y4
    Latent:       cX2 cX3 cX4 cY2 cY3 cY4
    
    Exogenous variables
    Latent:       cX1 cY1 RIx RIy
    model not identified;
    too many latent variables
    r(503);
    (r(503) is a matrix conformability error)

    Although there are indeed many latent variables, many loadings and (co)variances are constrained, so the model should be identified if it is properly specified.
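
    As a rough check on that claim, the counting rule (necessary for identification, though not sufficient) can be worked through for the specification above. The tallies below are a sketch that assumes sem's defaults of free intercepts for the observed variables and zero means for the exogenous latents; the local macro names are only for illustration.

```stata
* Known moments from 8 observed variables:
* variances and covariances plus means
local p = 8
local moments = `p'*(`p' + 1)/2 + `p'

* Free parameters in the model as specified above
local ri     = 3    // var(RIx), var(RIy), cov(RIx,RIy)
local wave1  = 3    // var(cX1), var(cY1), cov(cX1,cY1)
local paths  = 12   // 6 autoregressive + 6 cross-lagged paths
local resid  = 9    // 6 residual variances + 3 residual covariances (waves 2-4)
local icepts = 8    // intercepts of X1-X4 and Y1-Y4

local free = `ri' + `wave1' + `paths' + `resid' + `icepts'
local df = `moments' - `free'
display "moments = `moments', free parameters = `free', df = `df'"
```

    This gives 44 moments against 35 free parameters, leaving 9 degrees of freedom.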

    A reduced model without the random intercepts (RIx, RIy) runs and replicates the Mplus results with similar constraints.

    If we've made a simple error, it would be great to know what our eyes have missed. If the problem is more obscure I'd be grateful for any pointers in tracking it down. I would like to be able to run estat framework but as estimation doesn't commence this isn't possible.

    Thanks for any suggestions,

    Andrew

  • #2
    Hello,

    Did you find a solution for this specific error message?

    Giovanni



    • #3
      This is a bug in Stata. The program apparently assumes that a model with more latent variables than observed variables is not identified. Consider the following code.

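      Code:
      clear
      set obs 1000

      * Two independent standard-normal observed variables
      gen x = rnormal()
      gen z = rnormal()

      * Model 1: one latent, one observed variable (estimates)
      sem (X -> x@1), ///
      var(e.x@0)

      * Model 2: two latents, one observed variable (fails)
      sem (X -> x@1) ///
      (X2 -> X@1), ///
      var(e.x@0 e.X@0)

      * Model 3: model 2 plus an intercept-only equation for z (estimates)
      sem (X -> x@1) ///
      (X2 -> X@1) ///
      (z <- _cons), ///
      var(e.x@0 e.X@0)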

      All three models are identified, but the second fails to estimate. The second and third models are essentially the same; the third just adds a variable z that plays no substantive part in the model, to trick Stata into estimating it. So the workaround is to generate random variables and include them in the model as separate equations with only an intercept to be estimated. This way you can get the number of observed variables to exceed the number of latent variables, and Stata will estimate the model.
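
      The padding step can be automated. The following is a minimal sketch, not a tested recipe: it counts the latent and observed variables of the second model above, generates just enough random variables for the observed count to exceed the latent count, and appends one intercept-only equation per padding variable. The names pad1, pad2, padeqs and the seed are arbitrary choices for illustration.

```stata
clear
set seed 12345
set obs 1000
gen x = rnormal()

* Latent and observed variables in the model that fails to estimate
local latents  X X2
local observed x
local n_latent : word count `latents'
local n_obs    : word count `observed'

* Number of padding variables needed for observed to exceed latent
local n_pad = max(0, `n_latent' - `n_obs' + 1)

* Generate the padding variables and build their intercept-only equations
local padeqs
forvalues i = 1/`n_pad' {
    gen pad`i' = rnormal()
    local padeqs `padeqs' (pad`i' <- _cons)
}

* Same model as before, with the padding equations appended
sem (X -> x@1) (X2 -> X@1) `padeqs', var(e.x@0 e.X@0)
```

      For the model in the original post, with 10 latent and 8 observed variables, the same arithmetic gives 3 padding variables. Whether that model then converges on the data is a separate question.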

      Another option is to disable Stata's starting-value algorithm with the noivstart option. Note that if you do this, you will probably need to set at least some of the starting values manually.
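
      As a sketch of what that could look like for the second model above: the init() value below is an illustrative guess rather than a recommendation, and I am assuming here that a single starting value for the variance of X2 is enough for this toy model.

```stata
clear
set seed 12345
set obs 1000
gen x = rnormal()

* Two-latent model from above, skipping sem's instrumental-variable
* starting values and supplying one starting value by hand
sem (X -> x@1) (X2 -> X@1), ///
    var(e.x@0 e.X@0)        ///
    var(X2, init(1))        ///
    noivstart
```

      With a larger model such as the RI-CLPM, more starting values would probably have to be supplied this way.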

      Mikko
