SEM Estimation Failed (convergence not achieved)

Hadji Jalotjot

Join Date: Mar 2023

Posts: 37
#1

30 Nov 2023, 21:23

I'm trying to estimate an SEM model (first time in my life).

I used 5-point Likert scale questions as indicators for the latent variables Socialization, Externalization, Combination, and Internalization (SECI).
These SECI latent variables are then used as indicators for another latent variable, Ability.

I've been trying to estimate this seemingly simple model, with no success. It seems I have an identification problem, but the number of manifest variables is quite large relative to the number of free parameters.

Code:

sem (Socialization -> Social_3, ) (Socialization -> Social_4, ) (Socialization -> Social_6, ) ///
    (Extnalization -> External_1, ) (Extnalization -> External_2, ) (Extnalization -> External_3, ) ///
    (Combination -> Combination_1, ) (Combination -> Combination_2, ) (Combination -> Combination_3, ) (Combination -> Combination_6, ) ///
    (Internalization -> Internal_1, ) (Internalization -> Internal_2, ) (Internalization -> Internal_4, ) (Internalization -> Internal_6, ) ///
    (Ability -> Socialization, ) (Ability -> Extnalization, ) (Ability -> Combination, ) (Ability -> Internalization, ), ///
    latent(Socialization Extnalization Combination Internalization Ability ) nocapslatent

The final message is

Warning: The LR test of model vs. saturated is not reported because the
fitted model is not full rank. There appears to be 7 more fitted
parameters than the data can support.
convergence not achieved

Any help is greatly appreciated.

Shen YANG

Join Date: Apr 2023

Posts: 41
#2

30 Nov 2023, 22:37

Perhaps:
LR test not reported: it indicates that the model has too many parameters relative to the amount of data available. In SEM, this situation can arise when the model is overly complex (i.e., it has too many parameters to estimate). The model might be overfitting the data.

Convergence not achieved: it means that the process of estimating the model did not successfully converge on a solution.

Hadji Jalotjot

Join Date: Mar 2023

Posts: 37
#3

30 Nov 2023, 23:22

Originally posted by Shen YANG

Perhaps:

LR test not reported: it indicates that the model has too many parameters relative to the amount of data available. In SEM, this situation can arise when the model is overly complex (i.e., it has too many parameters to estimate). The model might be overfitting the data.

Convergence not achieved: it means that the process of estimating the model did not successfully converge on a solution.

About your first point, I assume this pertains to identification. I still don't get how the model has "too many parameters" relative to the data available, since the number of indicator variables is large and, consequently, so is the number of knowns, p = k(k+1)/2, relative to the free parameters.
Am I missing one or two important details in this analysis?

Please help. Thanks!
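
For reference, the counting rule mentioned above can be worked through for the model in #1. This is a sketch under two assumptions that the thread does not confirm: that sem applies its default anchoring (one loading per latent variable fixed at 1) and that means and intercepts are part of the fit. With k = 14 indicators:

```latex
\begin{align*}
\text{knowns} &= \frac{k(k+1)}{2} + k = \frac{14 \cdot 15}{2} + 14 = 119 \\
\text{free parameters} &= \underbrace{14}_{\text{intercepts}}
  + \underbrace{10}_{\text{first-order loadings}}
  + \underbrace{3}_{\text{second-order loadings}}
  + \underbrace{14}_{\text{error variances}}
  + \underbrace{4}_{\text{disturbance variances}}
  + \underbrace{1}_{\operatorname{Var}(\text{Ability})} = 46 \\
\text{df} &= 119 - 46 = 73
\end{align*}
```

Under these assumptions the count is comfortably positive, so the counting rule is satisfied. That rule is only a necessary condition for identification, not a sufficient one, so a positive count does not by itself rule out the rank warning shown in #1.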

Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#4

01 Dec 2023, 02:17

Originally posted by Hadji Jalotjot

Am I missing one or two important details in this analysis?

Possibly. Try fitting the model stepwise, successively looking for problematic signs.

First, try fitting each of the component CFA models alone. Perhaps there is one that is such a poorly fitting model that it won't converge on its own and is bringing the joint model down.

If there's no problem at that step, then try fitting them all at once but not jointly, that is, remove the second-order latent factor and impose an independent covariance structure on the four first-order latent factors, using the covstructure(_LEx, diagonal) option.

If you get convergence there (you ought to if they all converged individually), then take the e(b) vector of coefficients as the set of starting values (assign the vector to a Stata matrix and feed it forward via the from() option) for the final model with the second-order latent factor added back in.
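
The three steps above can be sketched as follows. This is a minimal sketch, not a tested solution: it assumes the variable and latent names from the command in #1 (including the spelling Extnalization), and the matrix name b0 is hypothetical. The skip suboption of from() is included on the assumption that some step-2 parameter names (the variances of the first-order factors) will not appear under the same names in the final model.

```stata
* Step 1: fit each first-order CFA on its own
* (one shown; repeat for the other three factors)
sem (Socialization -> Social_3 Social_4 Social_6), ///
    latent(Socialization) nocapslatent

* Step 2: all four first-order factors at once, forced to be uncorrelated
sem (Socialization -> Social_3 Social_4 Social_6) ///
    (Extnalization -> External_1 External_2 External_3) ///
    (Combination -> Combination_1 Combination_2 Combination_3 Combination_6) ///
    (Internalization -> Internal_1 Internal_2 Internal_4 Internal_6), ///
    latent(Socialization Extnalization Combination Internalization) nocapslatent ///
    covstructure(_LEx, diagonal)

* Keep the converged coefficients as starting values (b0 is a hypothetical name)
matrix b0 = e(b)

* Step 3: add the second-order factor back in, starting from b0
sem (Socialization -> Social_3 Social_4 Social_6) ///
    (Extnalization -> External_1 External_2 External_3) ///
    (Combination -> Combination_1 Combination_2 Combination_3 Combination_6) ///
    (Internalization -> Internal_1 Internal_2 Internal_4 Internal_6) ///
    (Ability -> Socialization Extnalization Combination Internalization), ///
    latent(Socialization Extnalization Combination Internalization Ability) nocapslatent ///
    from(b0, skip)
```

If step 2 converges but step 3 does not, that points at the second-order part of the model rather than at any single measurement model.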

Hadji Jalotjot

Join Date: Mar 2023

Posts: 37
#5

01 Dec 2023, 05:55

Originally posted by Joseph Coveney

Possibly. Try fitting the model stepwise, successively looking for problematic signs.

First, try fitting each of the component CFA models alone. Perhaps there is one that is such a poorly fitting model that it won't converge on its own and is bringing the joint model down.

If there's no problem at that step, then try fitting them all at once but not jointly, that is, remove the second-order latent factor and impose an independent covariance structure on the four first-order latent factors, using the covstructure(_LEx, diagonal) option.

If you get convergence there (you ought to if they all converged individually), then take the e(b) vector of coefficients as the set of starting values (assign the vector to a Stata matrix and feed it forward via the from() option) for the final model with the second-order latent factor added back in.

Thanks Joseph. I'll try this.

I have actually already fitted each of the CFA models separately, and I got some good results. It is when I try to fit the model with the second-order latent variable that non-convergence occurs.

When you say "try fitting them all at once but not jointly", do you mean all the CFA models are there (I use the SEM builder) but there will be no straight arrow or curved arrows (factor loadings and covariance?) connecting each model?

Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#6

01 Dec 2023, 06:26

Originally posted by Hadji Jalotjot

When you say "try fitting them all at once but not jointly", do you mean all the CFA models are there (I use the SEM builder) but there will be no straight arrow or curved arrows (factor loadings and covariance?) connecting each model?

Yes.

Erik Ruzek

Join Date: Oct 2017

Posts: 429
#7

01 Dec 2023, 07:55

Joseph has got you covered on how to diagnose problems. An additional issue to consider is the dimensionality of your items. Have you put all of them into an exploratory factor analysis? How many latent factors are retained?

You say you have Likert-scaled items but are using sem, which treats the items as continuous via an identity link. Have you tried running the model in gsem, using an ordinal link? There are lots of ways things can go wrong in measurement models.
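
A gsem version of the model with ordinal indicators might look like the sketch below. It assumes the variable and latent names from #1 and uses an ordered-logit family for every item; oprobit would be the analogous alternative. Whether it converges on the actual data is not something this sketch can show.

```stata
* Second-order model with each Likert item treated as ordinal (ordered logit)
gsem (Socialization -> Social_3 Social_4 Social_6, ologit) ///
    (Extnalization -> External_1 External_2 External_3, ologit) ///
    (Combination -> Combination_1 Combination_2 Combination_3 Combination_6, ologit) ///
    (Internalization -> Internal_1 Internal_2 Internal_4 Internal_6, ologit) ///
    (Ability -> Socialization Extnalization Combination Internalization), ///
    latent(Socialization Extnalization Combination Internalization Ability) nocapslatent
```

Because gsem estimates by numerical integration over the latent variables, a model with five of them can be slow, so fitting the first-order part alone first is a reasonable check.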

Hadji Jalotjot

Join Date: Mar 2023

Posts: 37
#8

01 Dec 2023, 16:14

Originally posted by Erik Ruzek

Joseph has got you covered on how to diagnose problems. An additional issue to consider is the dimensionality of your items. Have you put all of them into an exploratory factor analysis? How many latent factors are retained?

You say you have Likert-scaled items but are using sem, which treats the items as continuous via an identity link. Have you tried running the model in gsem, using an ordinal link? There are lots of ways things can go wrong in measurement models.

I adopted the Likert questions from a study, so I figured an EFA wouldn't be necessary. I did run a CFA: three of the four models fit quite well, and the other doesn't fit that well.

I tried gsem, treating the Likert-scale data as ordinal (logit). The outcome is the same: it converges when fitting the first-order latent variables, but once I add the second-order latent variable Ability, it won't converge.

BTW, what does "There appears to be 7 more fitted parameters than the data can support." mean? My understanding is that I don't have enough data to estimate all the parameters, i.e., my model is unidentified. This is where I'm really confused.

Richard Williams

Join Date: Apr 2014

Posts: 4994
#9

01 Dec 2023, 17:37

How is ability itself being identified? I would think one of the paths emanating out from Ability would need to be fixed at 1, unless maybe Stata automatically fixes its variance at 1. The model as shown is not identified.

Last edited by Richard Williams; 01 Dec 2023, 17:54.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam

Joseph Coveney

Join Date: Apr 2014
Posts: 4410

#10

02 Dec 2023, 00:08

Originally posted by Richard Williams

I would think one of the paths emanating out from Ability would need to be fixed at 1, unless maybe Stata automatically fixes its variance at 1.

It does, at least it does when the model is fitted from command-line code, as below.

Code:

version 18.0

log close _all
log using "Second-order CFA.smcl", nomsg name(lo)

clear *

// seedem
set seed 251035945

quietly set obs 250
generate double abl = rnormal()

tempname Corr
foreach set in "soc3 soc4 soc6" "ext1 ext2 ext3" "com1 com2 com3 com6" ///
    "int1 int2 int4 int6" {
        local dim : word count `set'
        matrix define `Corr' = J(`dim', `dim', 0.5) + I(`dim') * 0.5
        quietly drawnorm `set', double corr(`Corr')
}

tempvar cat
foreach var of varlist soc? ext? com? int? {
    generate byte `cat' = 1
    forvalues cut = 1/4 {
        quietly replace `cat' = `cat' + 1 if (abl + `var') > invnormal(`cut' / 5)
    }
    drop `var'
    rename `cat' `var'
}

*
* Begin here
*
sem ///
    (soc? <- Socialization) ///
    (ext? <- Externalization) ///
    (com? <- Combination) ///
    (int? <- Internalization) ///
    (Socialization Externalization Combination Internalization <- Ability), ///
        nocnsreport nodescribe nofootnote nolog

log close lo

exit

As shown in the attached log-file, Stata automatically fixes one of the first-order latent factor's factor loading onto Ability to one.

The model as shown is not identified.

Yeah, what is that shown in the OP's diagram? Are those numbers the coefficients of the partially convergent model?

If those 4.3s and 4.4s in the boxes of the indicator variables are the intercepts (means), then there might be substantial ceiling effects to contend with.
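
One quick way to gauge ceiling effects is to look at how many responses fall in the top category of each item. A minimal sketch, assuming the item names from #1 and that 5 is the highest response code:

```stata
* Percentage of nonmissing responses at the top category (assumed to be 5)
foreach v of varlist Social_3 Social_4 Social_6 ///
    External_1 External_2 External_3 ///
    Combination_1 Combination_2 Combination_3 Combination_6 ///
    Internal_1 Internal_2 Internal_4 Internal_6 {
    quietly count if !missing(`v')
    local n = r(N)
    quietly count if `v' == 5
    display "`v'" _col(20) %5.1f (100 * r(N) / `n') " percent at ceiling"
}
```

Items with a large share of responses at 5 carry little variance, which can make the continuous-indicator model harder to fit.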

Attached Files

Second-order CFA.smcl (12.4 KB, 1 view)


Richard Williams

Join Date: Apr 2014

Posts: 4994
#11

02 Dec 2023, 13:48

Joe wrote code, whereas it appears the OP used the SEM Builder to create the model. I'm not sure if that makes a difference, but in the original diagram NONE of the paths emanating from Ability equal 1; indeed, you don't see values for any of the paths. So, I would explicitly force the path from Ability to Socialization to equal one or fix the variance of Ability to 1.
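
The two ways of setting the scale of Ability mentioned above can be written in command syntax roughly as follows. This is a sketch using the names from #1; the local macros firstorder and lats are hypothetical conveniences to avoid repeating the measurement part. With var(Ability@1), sem is expected to leave all four second-order loadings free, but that is worth confirming in the output.

```stata
* First-order measurement part and list of latent variables, reused below
local firstorder (Socialization -> Social_3 Social_4 Social_6) ///
    (Extnalization -> External_1 External_2 External_3) ///
    (Combination -> Combination_1 Combination_2 Combination_3 Combination_6) ///
    (Internalization -> Internal_1 Internal_2 Internal_4 Internal_6)
local lats Socialization Extnalization Combination Internalization Ability

* Option A: fix the loading of Socialization on Ability at 1
sem `firstorder' ///
    (Ability@1 -> Socialization) ///
    (Ability -> Extnalization Combination Internalization), ///
    latent(`lats') nocapslatent

* Option B: fix the variance of Ability at 1 instead
sem `firstorder' ///
    (Ability -> Socialization Extnalization Combination Internalization), ///
    latent(`lats') nocapslatent ///
    var(Ability@1)
```

Both options set the scale of the same latent variable, so they should give the same model fit and differ only in how the second-order loadings are scaled.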

Sometimes you might have problems if the variables are in radically different scales, e.g. a var that runs from zero to one and another that runs into the millions. If so, analyzing the correlation matrix or standardizing or rescaling the observed variables first might help.

If we actually saw the Stata code and output, we might be able to advise you better. If the model is correctly coded, I am not sure why you would have problems, unless maybe there are major problems with the scaling of variables or other data problems.

Last edited by Richard Williams; 02 Dec 2023, 14:28.

Hadji Jalotjot

Join Date: Mar 2023

Posts: 37
#12

02 Dec 2023, 22:43

Originally posted by Richard Williams

How is ability itself being identified? I would think one of the paths emanating out from Ability would need to be fixed at 1, unless maybe Stata automatically fixes its variance at 1. The model as shown is not identified.

I'll run it again with one of the paths fixed at 1. BTW, the values shown there are from the non-convergent fit I ran.

Richard, can you enlighten me on what you mean by "The model as shown is not identified."? I've been reading up on identification, and in my mind this model is over-identified. This is my first time using SEM and any help is appreciated. Thanks!

Hadji Jalotjot

Join Date: Mar 2023
Posts: 37

#13

02 Dec 2023, 22:48

Originally posted by Joseph Coveney

Yeah, what is that shown in the OP's diagram? Are those numbers the coefficients of the partially convergent model?

If those 4.3s and 4.4s in the boxes of the indicator variables are the intercepts (means), then there might be substantial ceiling effects to contend with.

Yes, those are coefficients of the partially convergent model.

"then there might be substantial ceiling effects to contend with" --> So I am running into another problem, then? 😂


Hadji Jalotjot

Join Date: Mar 2023

Posts: 37
#14

02 Dec 2023, 22:49

Originally posted by Richard Williams

Joe wrote code, whereas it appears the OP used the SEM Builder to create the model. I'm not sure if that makes a difference, but in the original diagram NONE of the paths emanating from Ability equal 1; indeed, you don't see values for any of the paths. So, I would explicitly force the path from Ability to Socialization to equal one or fix the variance of Ability to 1.

Sometimes you might have problems if the variables are in radically different scales, e.g. a var that runs from zero to one and another that runs into the millions. If so, analyzing the correlation matrix or standardizing or rescaling the observed variables first might help.

If we actually saw the Stata code and output, we might be able to advise you better. If the model is correctly coded, I am not sure why you would have problems, unless maybe there are major problems with the scaling of variables or other data problems.

I'll run it again, see if that resolves the issues, and post the results here. Thanks!

Hadji Jalotjot

Join Date: Mar 2023
Posts: 37

#15

02 Dec 2023, 22:55

I used SEM builder to fit the model. I fixed one path from Ability to 1.

Code:

sem (Socialization -> Social_3, ) (Socialization -> Social_4, ) (Socialization -> Social_6, ) ///
    (Extnalization -> External_1, ) (Extnalization -> External_2, ) (Extnalization -> External_3, ) ///
    (Combination -> Combination_1, ) (Combination -> Combination_2, ) (Combination -> Combination_3, ) (Combination -> Combination_6, ) ///
    (Internalization -> Internal_1, ) (Internalization -> Internal_2, ) (Internalization -> Internal_4, ) (Internalization -> Internal_6, ) ///
    (Ability@1 -> Socialization, ) (Ability -> Extnalization, ) (Ability -> Combination, ) (Ability -> Internalization, ), ///
    latent(Socialization Extnalization Combination Internalization Ability ) nocapslatent

The results are...
