SEM Model Identification-- fixing variances

Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#1

20 May 2018, 18:27

Dear Statalist users, I am using Stata 14 and working with cross-sectional data. I am trying to run a confirmatory factor analysis (CFA) on nine items using the 'sem' command. The items are 4-category ordinal variables called i1, i2, ..., i9, and the factors are called f1, f2 and f3. My goal is to fit a second-order model in which a fourth latent variable, f4, is an overarching construct (second-order factor) that f1, f2 and f3 load strongly on. The command I use is:

Code:

sem (f1-> i1 i2 i3) (f2-> i4 i5 i6 i7) (f3-> i7 i8 i9) (f4-> f1 f2 f3), latent (f1 f2 f3 f4) ///
cov( e.f1@1 e.f2@1 e.f3@1 f4@1) nocapslatent difficult ml

This is to override Stata's default anchoring on some factor loadings. I would like to see the factor loadings; that's why I try to fix the variances at 1, but the model does not converge. Could you help me find out what I can do differently? Thanks, Sule
Richard Williams

Join Date: Apr 2014

Posts: 4949
#2

20 May 2018, 18:35

The first thing I would try is to estimate without overriding Stata's defaults, and see if it converges.

If you can provide an extract of your data using dataex that will reproduce the problem, we may be better able to advise you.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#3

20 May 2018, 18:45

Thanks for your prompt response, Richard. Yes, it does converge with Stata's defaults, very quickly indeed.
By default, Stata constrains one item loading for each latent factor, including the second-order factor. The estimation method is maximum likelihood. Number of obs = 861.
Here is an example of the data.

Thanks in advance.

Code:

clear
input float(i1 i2 i3 i4 i5 i6 i7 i8 i9)
1 4 1 1 1 1 1 1 1
4 1 3 1 2 2 2 3 4
1 4 1 1 1 2 1 1 1
2 3 3 1 3 1 3 3 1
2 4 3 2 2 2 2 3 4
2 1 2 2 3 3 2 2 3
. . . . . . . . .
2 3 4 3 1 1 3 2 2
2 3 4 2 2 2 1 2 3
1 1 1 1 1 1 1 1 1
1 3 1 1 2 1 2 4 4
end
Richard Williams

Join Date: Apr 2014

Posts: 4949
#4

20 May 2018, 18:59

Show the sem code that worked. I want to see if the two models should be equivalent to each other or if there is some important difference besides the normalization.


Sule Yaylaci

Join Date: Jan 2018
Posts: 52
#5

20 May 2018, 21:15

I am running it through the SEM builder, and here is the code that shows up in my command window:

Code:

 sem (f1-> i1,) (f1->i2,) (fi->i3,) (f2-> i4,) (f2-> i5,) (f2-> i6,) (f2-> i7,) (f3-> i7,) (f3-> i8,) ///
(f3-> i9,) (f4-> f1,) (f4->f2, ) (f4-> f3,), difficult nml latent (f1 f2 f3 f4) nocapslatent


Richard Williams

Join Date: Apr 2014

Posts: 4949
#6

21 May 2018, 07:15

Are you sure you are copying the code exactly? You have (fi->i3,), which I assume should be (f1->i3,). I also got the error

option nml not allowed

I think lower case l and the number 1 got confounded.

Also do you want f2 and f3 to both affect i7?

Unfortunately I can't get this to converge with the very small extract that was provided. I do notice that var(e.i1) = 0. Is this true in the full sample?

Make sure your posted code is correct, and consider providing a larger extract of 100 cases.

I have found that seemingly irrelevant changes in parameterization can affect whether or not Stata sem converges. If you've got something that works, I bet you could live with it. You might also see if the standardized option will give you what you want.


Sule Yaylaci

Join Date: Jan 2018
Posts: 52
#7

21 May 2018, 09:55

It is my fault, Richard. I thought what dataex copied would be too long to paste here so I trimmed it. Below is a 100 case version.
I don't think var (e.i1)=0 in the full sample. And yes, I do want f2 and f3 to both affect i7.
The posted code is what appears in my syntax. I fixed the typos.

Code:

sem (f1-> i1,) (f1->i2,) (f1->i3,) (f2-> i4,) (f2-> i5,) (f2-> i6,) (f2-> i7,) (f3-> i7,) (f3-> i8,) ///
(f3-> i9,) (f4-> f1,) (f4->f2, ) (f4-> f3,), difficult latent (f1 f2 f3 f4) nocapslatent

The only thing I would want Stata to do is not fix one of the loadings on the second-order factor. Stata's constraints as shown in the output are:
(1) [i1]f1=1
(2) [i4]f2=1
(3) [i7]f3=1
(4) [f1]f4=1
So, instead of the 4th constraint, what could I use? Thanks much for your guidance.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(i1 i2 i3 i4 i5 i6 i7 i8 i9)

1 1 1 1 4 1 1 1 1
. . . . . . . . .
. . . . . . . . .
1 1 3 4 3 3 3 2 3
1 1 2 2 4 3 4 2 2
4 3 2 2 3 4 4 4 1
3 1 3 2 3 3 4 4 3
3 1 2 2 3 3 4 2 2
3 3 2 1 3 3 2 4 2
. . . . . . . . .
. . . . . . . . .
3 1 2 4 1 3 4 2 2
1 1 1 1 4 1 1 1 2
3 1 3 2 3 3 1 3 1
3 2 2 2 4 3 4 2 2
2 2 2 2 1 2 3 3 3
. . . . . . . . .
3 2 2 1 3 2 2 2 1
2 2 1 1 3 1 1 1 4
2 2 2 3 4 3 4 2 2
3 1 3 2 3 2 3 2 2
4 1 3 2 4 2 4 2 3
3 3 2 3 4 4 4 2 2
4 3 3 2 3 2 2 1 1
4 2 1 2 3 2 3 2 2
1 1 1 1 1 1 1 1 1
1 1 2 1 3 4 4 2 1
2 1 1 3 4 4 2 2 2
3 2 3 2 3 3 2 4 2
3 2 4 3 3 3 4 2 2
1 1 1 3 4 3 2 1 1
3 2 2 3 2 2 2 1 1
1 1 1 1 4 1 1 1 1
3 1 3 1 4 3 4 2 2
2 1 1 1 1 1 2 2 1
. . . . . . . . .
3 3 2 2 2 2 2 2 2
1 1 1 2 3 2 1 2 1
2 1 1 2 3 3 4 4 2
3 3 3 3 3 1 3 3 2
2 1 3 2 4 3 3 2 2
3 3 4 4 4 4 1 3 2
2 1 2 2 3 2 2 2 2
3 2 1 3 3 3 2 1 2
4 1 2 3 1 3 2 2 1
1 1 1 4 3 1 4 4 3
3 3 1 3 3 3 2 4 1
3 1 2 2 3 3 4 2 2
3 1 2 1 3 3 4 1 2
3 3 1 3 3 3 2 3 2
3 1 2 2 3 3 3 2 2
3 1 1 2 2 3 4 4 2
4 1 2 2 3 3 1 2 2
2 1 1 4 4 1 3 4 1
3 1 1 3 3 3 4 3 1
3 2 4 2 4 3 4 4 1
2 1 2 2 2 3 3 2 2
. . . . . . . . .
3 2 3 3 3 3 2 3 3
3 3 3 1 4 4 4 1 2
. . . . . . . . .
2 1 3 3 4 4 2 4 2
4 3 2 3 3 3 3 2 4
3 1 3 3 4 2 4 2 2
4 3 1 2 2 2 3 4 4
1 1 1 1 4 1 1 1 1
1 1 1 2 4 4 2 3 2
2 2 2 3 3 3 2 1 1
1 1 3 2 1 4 3 4 3
4 1 4 3 4 4 4 4 1
3 3 1 3 3 3 4 4 2
4 3 4 3 3 3 2 3 2
. . . . . . . . .
2 2 2 3 4 3 3 1 2
2 2 2 3 3 3 3 3 2
3 1 2 3 3 3 2 1 3
. . . . . . . . .
3 3 1 1 4 3 1 1 1
1 1 1 2 2 3 1 1 3
2 1 3 2 3 3 2 2 2
3 3 3 3 3 3 2 4 2
3 2 1 3 4 2 2 2 2
3 1 2 3 4 4 1 1 1
3 1 2 2 3 3 4 3 2
3 3 3 3 3 2 4 4 1
3 2 1 3 3 3 3 4 3
1 1 2 2 3 2 1 2 1
4 2 1 3 4 4 1 1 2
2 1 3 2 4 3 3 3 2
1 1 2 1 2 1 3 1 1
3 1 2 2 3 3 4 1 1
3 3 3 3 4 3 2 3 1
3 3 4 2 4 3 4 1 2
4 1 1 1 4 1 1 1 2
2 2 2 3 3 3 3 2 2
2 1 1 1 4 3 2 1 1
1 1 2 3 3 3 3 2 1
3 1 2 2 4 3 2 2 2
3 3 1 4 4 4 1 2 2
2 1 1 2 2 2 4 2 1
end


Richard Williams

Join Date: Apr 2014

Posts: 4949
#8

21 May 2018, 11:02

I would think that this would be the command:

Code:

sem (f1-> i1,) (f1->i2,) (f1->i3,) (f2-> i4,) (f2-> i5,) (f2-> i6,) (f2-> i7,) (f3-> i7,) (f3-> i8,) ///
(f3-> i9,) (f4-> f1 f2 f3,), difficult latent (f1 f2 f3 f4) nocapslatent var(f4@1) iter(30)

But, it doesn't converge. Further, it says the paths from f4 to f1, f2, and f3 are all constrained to be zero. But they aren't. They should be free.

Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#9

21 May 2018, 11:13

Thanks much for your time on this, Richard. I will just report the standardized estimates with latent factors anchored to one of the items, including the second-order one, as it is the only model that appears to converge without problems.
Richard Williams

Join Date: Apr 2014

Posts: 4949
#10

21 May 2018, 11:36

If I were ambitious I would try this in Mplus or R. I've had other models where I fixed the latent variance at 1. Maybe the second-order factor is confusing it because it doesn't have any observed indicators.

Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#11

21 May 2018, 12:07

I think part of the problem is that I have the bare minimum number of indicators (3) for each latent variable and a complex variable that loads onto multiple factors.
If I had more items, it might have been possible to fix the latent variance at 1. Thanks again!
Richard Williams

Join Date: Apr 2014

Posts: 4949
#12

21 May 2018, 12:11

It should be six of one, half a dozen of the other: fixing an indicator's loading or fixing a variance should work equally well. I'll maybe try it in Mplus when I get a chance.


Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 686

#13

21 May 2018, 15:14

Richard Williams's point is well taken; however, getting good starting values is tricky for models like this.

One way to get the requested model fit is to use sem with the default loading constraints, then refit the model of interest using a transformation of the originally fitted parameter estimates as starting values.

In this case, we simply need to find all the loadings on f4, multiply them by the square root of the variance estimate of f4, and set that variance to 1.
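
A short sketch of why this rescaling leaves the model unchanged may help here. The symbols are notation introduced for illustration only and are not part of the sem output: $\lambda_k$ is the loading of $f_k$ on f4, $\phi$ is the variance of f4, and $\zeta_k$ is the disturbance e.f$_k$.

```latex
\begin{aligned}
f_k &= \lambda_k f_4 + \zeta_k, \qquad \operatorname{Var}(f_4) = \phi, \quad \lambda_1 = 1
      \quad \text{(default anchoring)} \\
f_4^{*} &= f_4 / \sqrt{\phi} \;\Longrightarrow\; \operatorname{Var}(f_4^{*}) = 1 \\
f_k &= \underbrace{\lambda_k \sqrt{\phi}}_{\lambda_k^{*}} \, f_4^{*} + \zeta_k, \qquad k = 1, 2, 3
\end{aligned}
```

Since $\lambda_k^{*} f_4^{*} = \lambda_k f_4$, the implied covariance matrix of the items is the same under both parameterizations. The loading that was anchored at 1 becomes $\sqrt{\phi}$, and the other parameters keep their values, so the transformed vector should be a good starting point for the variance-fixed model.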

Here is how I did this using the data and model specified above.

Code:

sem     (f1 -> i1,)     ///
        (f1 -> i2,)     ///
        (f1 -> i3,)     ///
        (f2 -> i4,)     ///
        (f2 -> i5,)     ///
        (f2 -> i6,)     ///
        (f2 -> i7,)     ///
        (f3 -> i7,)     ///
        (f3 -> i8,)     ///
        (f3 -> i9,)     ///
        (f4 -> f1,)     ///
        (f4 -> f2,)     ///
        (f4 -> f3,)     ///
        ,               ///
        difficult latent(f1 f2 f3 f4) nocapslatent

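* Copy the estimates from the default (anchored-loading) fit
matrix b = e(b)
* Locate var(f4), save its square root, and reset the variance to 1
local var_pos = colnumb(b,"/var(f4)")
scalar sd_f4 = sqrt(b[1,`var_pos'])
matrix b[1,`var_pos'] = 1
* Rescale the coefficient on f4 in every equation that contains one
_ms_eq_info
local k_eq = r(k_eq)
forval i = 1/`k_eq' {
        local j = colnumb(b,"`r(eq`i')':f4")
        if `j' != . {
                matrix b[1,`j'] = b[1,`j'] * scalar(sd_f4)
        }
}
* List transformed starting values (column 1) beside the original estimates (column 2)
matrix cmp = b', e(b)'
matrix list cmp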

sem     (f1 -> i1,)     ///
        (f1 -> i2,)     ///
        (f1 -> i3,)     ///
        (f2 -> i4,)     ///
        (f2 -> i5,)     ///
        (f2 -> i6,)     ///
        (f2 -> i7,)     ///
        (f3 -> i7,)     ///
        (f3 -> i8,)     ///
        (f3 -> i9,)     ///
        (f4 -> f1,)     ///
        (f4 -> f2,)     ///
        (f4 -> f3,)     ///
        ,               ///
        from(b)         ///
        var(f4@1)       ///
        difficult latent(f1 f2 f3 f4) nocapslatent

Last edited by Jeff Pitblado (StataCorp); 21 May 2018, 15:16.


Sule Yaylaci

Join Date: Jan 2018

Posts: 52
#14

21 May 2018, 17:07

Thanks much for your reply, Jeff. This looks very promising. Much appreciated!

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 686

#15

22 May 2018, 16:54

I figured out a shorter workaround to get sem to fit this model. Just give f4 a nonzero starting value for the loading on one of its paths.

Code:

sem     (f1 -> i1,)     ///
        (f1 -> i2,)     ///
        (f1 -> i3,)     ///
        (f2 -> i4,)     ///
        (f2 -> i5,)     ///
        (f2 -> i6,)     ///
        (f2 -> i7,)     ///
        (f3 -> i7,)     ///
        (f3 -> i8,)     ///
        (f3 -> i9,)     ///
        (f4 -> f1,)     ///
        (f4 -> f2,)     ///
        (f4 -> f3,)     ///
        ,               ///
        from(f1:f4=1)   ///  <-- replaced b vector
        var(f4@1)       ///
        difficult latent(f1 f2 f3 f4) nocapslatent

We hope to improve sem to do this automatically in a future update.
