sem: "too many latent variables" and r(503) error when fitting random intercept cross-lagged panel model

Andrew Mackinnon

17 Jan 2022, 15:27

Hi Everybody,

We are having difficulty using sem to replicate a random intercept cross-lagged panel model (RI-CLPM) as shown in the diagram.

Fitting this model in Mplus is documented at https://www.statmodel.com/download/R...er%20input.pdf, and I believe we've accurately translated it as follows:

Code:

* Example generated by -dataex-. To install: ssc install dataex
* Only 20 of 400 lines of data in original
clear
input float(ID X1 Y1 X2 Y2 X3 Y3 X4 Y4)
 1  7.556  8.836  7.237  8.976  7.813 10.573  8.654  9.791
 2  9.832  6.074 12.033  6.974 13.843  6.675 13.552   6.78
 3 11.574  5.543  12.21  6.066 13.818  5.607 12.461  3.904
 4  9.572  5.765  9.913  6.527  9.819  5.389 12.102   7.05
 5  6.699 11.792  8.008 12.095  8.858 10.958  8.385 11.479
 6 10.202  7.068 11.745  7.615  11.55  7.764  10.39   6.65
 7 12.905  6.635 11.714  5.858 12.443   7.33  13.34  4.719
 8  9.808  7.467   8.74  6.855  7.468  6.406 10.162   7.39
 9 10.084  5.412  9.812  5.964  8.625  5.267  8.519  6.744
10 14.275  7.618 14.877  7.326 16.421  6.721 14.217  7.913
11 11.509  7.906 11.375  6.923  9.139  5.764  8.821  7.508
12  10.13  5.101 10.383  5.624  8.128  3.421  8.869  4.957
13 11.571  6.774 12.731  5.784 11.737  3.953   9.78  4.553
14 11.362   7.32  11.18  7.379  7.747  6.044 10.846  6.971
15 10.271  9.462  9.797  8.513  9.025 10.731  9.755  9.987
16   10.9  7.941 10.187   7.31  10.43  7.235  9.025  6.642
17  9.928  8.421   9.55  9.777  9.393  9.998  9.988  9.161
18 10.519 11.779 11.605 10.345 10.696 10.892  9.438  9.804
19  7.707  8.072   8.15  6.945  9.852  8.658 10.089  8.924
20  9.638  7.312 11.258   8.58 10.531  7.583 10.719  8.345
end

sem /// 
(cX1@1 -> X1) (cX2@1 -> X2) (cX3@1 -> X3) (cX4@1 -> X4) /// Person-centred X
(cY1@1 -> Y1) (cY2@1 -> Y2) (cY3@1 -> Y3) (cY4@1 -> Y4) /// Person-centred Y
(RIx@1 -> X1) (RIx@1 -> X2) (RIx@1 -> X3) (RIx@1 -> X4) /// Random intercept X
(RIy@1 -> Y1) (RIy@1 -> Y2) (RIy@1 -> Y3) (RIy@1 -> Y4) /// Random intercept Y
 ///
(cX1 -> cX2) (cX2 -> cX3) (cX3 -> cX4) /// X autoregressive paths
(cY1 -> cY2) (cY2 -> cY3) (cY3 -> cY4) /// Y autoregressive paths
(cX1 -> cY2) (cX2 -> cY3) (cX3 -> cY4) /// X -> Y cross-lagged paths
(cY1 -> cX2) (cY2 -> cX3) (cY3 -> cX4) /// Y -> X cross-lagged paths
, covstruct(_lexogenous, diagonal) nocapslatent  ///
  latent(cX1 cX2 cX3 cX4 cY1 cY2 cY3 cY4 RIx RIy ) ///
  cov(RIx*cX1@0 RIx*cY1@0 RIy*cX1@0 RIy*cY1@0) /// 
  cov(e.Y1@0 e.Y2@0 e.Y3@0 e.Y4@0 e.X1@0 e.X2@0 e.X3@0 e.X4@0) ///
  cov(RIx*RIy cX1*cY1 e.cX2*e.cY2 e.cX3*e.cY3 e.cX4*e.cY4 )

The only output generated is:

Code:

Endogenous variables
Measurement:  X1 X2 X3 X4 Y1 Y2 Y3 Y4
Latent:       cX2 cX3 cX4 cY2 cY3 cY4

Exogenous variables
Latent:       cX1 cY1 RIx RIy
model not identified;
too many latent variables
r(503);

(r(503) is a matrix conformability error)

Although there are indeed many latent variables (10, against 8 observed variables), many loadings and (co)variances are constrained, so the model should be identified if it is properly specified.
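As a rough check, the counting rule (necessary but not sufficient for identification) can be worked through in a few lines. The parameter tally is an assumption read off the sem command above: 8 intercepts, 3 random-intercept (co)variances, 3 wave-1 (co)variances, 12 lagged paths, and 9 residual (co)variances.

```stata
* Counting rule for the four-wave RI-CLPM: sample moments versus free parameters
local p = 8
* variances and covariances of the observed variables, plus their means
local moments = `p'*(`p' + 1)/2 + `p'
* intercepts + RI (co)variances + wave-1 (co)variances + lagged paths + residual (co)variances
local free = 8 + 3 + 3 + 12 + 9
local df = `moments' - `free'
display "moments = `moments', free parameters = `free', df = `df'"
```

If that tally is right, the degrees of freedom are positive, so the counting rule itself is not what is being violated.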

A reduced model without the random intercepts (RIx, RIy) runs and replicates the Mplus results with similar constraints.

If we've made a simple error, it would be great to know what our eyes have missed. If the problem is more obscure I'd be grateful for any pointers in tracking it down. I would like to be able to run estat framework but as estimation doesn't commence this isn't possible.

Thanks for any suggestions,

Andrew


Giovanni Piumatti


02 Mar 2022, 06:03

Hello,

Did you find a solution for this specific error message?

Giovanni

Mikko Rönkkö

30 Mar 2022, 05:16

This is a bug in Stata. The program apparently assumes that any model with more latent variables than observed variables is not identified. Consider the following code.

Code:

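clear
set obs 1000

gen x = rnormal()
gen z = rnormal()

* Model 1: one latent variable, one observed variable
sem (X -> x@1), ///
var(e.x@0)

* Model 2: two latent variables, one observed variable (fails to estimate)
sem (X -> x@1) ///
(X2 -> X@1), ///
var(e.x@0 e.X@0)

* Model 3: model 2 plus an unrelated observed variable z, intercept only
sem (X -> x@1) ///
(X2 -> X@1) ///
(z <- _cons), ///
var(e.x@0 e.X@0)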

All three models are identified, but the second fails to estimate. The second and third models are essentially the same; the third only adds a variable z, which plays no part in the model proper, to trick Stata into estimating it. So the workaround is to generate random variables and include them in the model as separate equations in which only an intercept is estimated. This way you can get the number of observed variables to exceed the number of latent variables, and Stata will estimate the model.
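Applied to the random intercept cross-lagged panel model from the first post, the workaround might look like the sketch below. It is untested and assumes the data from the first post are in memory. The names z1-z3 and the seed are made up for illustration, and using three placeholders is a guess (the original model has 10 latent and 8 observed variables), so add more if Stata still refuses.

```stata
* Untested sketch: pad the model with placeholder observed variables.
* z1-z3 and the seed are hypothetical; the count of three is a guess.
set seed 12345
forvalues i = 1/3 {
    generate double z`i' = rnormal()
}

sem ///
(cX1@1 -> X1) (cX2@1 -> X2) (cX3@1 -> X3) (cX4@1 -> X4) /// Person-centred X
(cY1@1 -> Y1) (cY2@1 -> Y2) (cY3@1 -> Y3) (cY4@1 -> Y4) /// Person-centred Y
(RIx@1 -> X1) (RIx@1 -> X2) (RIx@1 -> X3) (RIx@1 -> X4) /// Random intercept X
(RIy@1 -> Y1) (RIy@1 -> Y2) (RIy@1 -> Y3) (RIy@1 -> Y4) /// Random intercept Y
(cX1 -> cX2) (cX2 -> cX3) (cX3 -> cX4) /// X autoregressive paths
(cY1 -> cY2) (cY2 -> cY3) (cY3 -> cY4) /// Y autoregressive paths
(cX1 -> cY2) (cX2 -> cY3) (cX3 -> cY4) /// X -> Y cross-lagged paths
(cY1 -> cX2) (cY2 -> cX3) (cY3 -> cX4) /// Y -> X cross-lagged paths
(z1 <- _cons) (z2 <- _cons) (z3 <- _cons) /// Placeholders, intercept only
, covstruct(_lexogenous, diagonal) nocapslatent ///
  latent(cX1 cX2 cX3 cX4 cY1 cY2 cY3 cY4 RIx RIy) ///
  cov(RIx*cX1@0 RIx*cY1@0 RIy*cX1@0 RIy*cY1@0) ///
  cov(e.Y1@0 e.Y2@0 e.Y3@0 e.Y4@0 e.X1@0 e.X2@0 e.X3@0 e.X4@0) ///
  cov(RIx*RIy cX1*cY1 e.cX2*e.cY2 e.cX3*e.cY3 e.cX4*e.cY4)
```

Because the placeholders are modelled as uncorrelated with everything else, the estimates for the substantive part of the model should not change. The chi-squared statistic and its degrees of freedom will, however, also reflect the zero covariances imposed on the placeholders, so they will not match the Mplus output directly.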

Another option is to disable Stata's starting value algorithm using the noivstart option. Note that by doing this, you will probably need to set at least some of the starting values manually.
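As a minimal illustration, here is the second model from above with noivstart added. This assumes the simulated x is still in memory, and whether the model converges without manually supplied starting values has not been checked.

```stata
* Second model from above, skipping sem's default starting-value calculation
sem (X -> x@1) ///
    (X2 -> X@1), ///
    var(e.x@0 e.X@0) noivstart
```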

Mikko