GSEM sometimes breaks single equations into two equations. Bug?

Dan Obst

Join Date: Jan 2025
Posts: 3
#1

20 Jan 2025, 10:14

Hi!

I’m using StataNow/SE 18.5 for Mac (Intel 64-bit).

GSEM sometimes breaks single equations into two equations. Is this a bug?

This leads to follow-up issues with esttab (version 2.1.1).

Example:

Code:

gsem (treat -> PerceivedInequality) ///
(PerceivedInequality -> manipulationcheck1 manipulationcheck2) ///
(worries <- PerceivedInequality treat) ///
, nocapslatent latent(PerceivedInequality)


--------------------------------------------------------------------------------------------
                           | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
---------------------------+----------------------------------------------------------------
manipulationcheck1         |
       PerceivedInequality |          1  (constrained)
                     _cons |   6.063231   .0815965    74.31   0.000     5.903305    6.223157
---------------------------+----------------------------------------------------------------
manipulationcheck2         |
       PerceivedInequality |   .5551072   .0952481     5.83   0.000     .3684244      .74179
                     _cons |   5.641399   .0936912    60.21   0.000     5.457768     5.82503
---------------------------+----------------------------------------------------------------
worries                    |
                     treat |    .309179   .1616929     1.91   0.056    -.0077333    .6260914
---------------------------+----------------------------------------------------------------
PerceivedInequality        |
                     treat |   1.210984   .1150321    10.53   0.000     .9855248    1.436442
---------------------------+----------------------------------------------------------------
worries                    |
       PerceivedInequality |   .2787906   .0660852     4.22   0.000      .149266    .4083151
                     _cons |   4.715864   .1005135    46.92   0.000     4.518861    4.912867
---------------------------+----------------------------------------------------------------
 var(e.PerceivedInequality)|   2.437964   .4740489                       1.66539    3.568937
---------------------------+----------------------------------------------------------------
  var(e.manipulationcheck1)|   .7888583   .4547422                      .2548699    2.441627
  var(e.manipulationcheck2)|   3.553667    .215075                       3.15617    4.001227
             var(e.worries)|   4.693476   .2185726                      4.284051    5.142029
--------------------------------------------------------------------------------------------

See how worries shows up twice?

This seems to lead to further issues with esttab. While this code works:

Code:

estout model3

----------------------------
                      (1)  
             manipulati~1  
----------------------------
worries                    
treat               0.309  
                   (1.91)  

PerceivedI~y        0.279***
                   (4.22)  

_cons               4.716***
                  (46.92)  
----------------------------
N                     991  
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

This code leads to issues:

Code:

esttab model3 model4, keep(worries:) drop(HNE1_other age sex2)

--------------------------------------------
                      (1)             (2)  
             manipulati~1    manipulati~1  
--------------------------------------------
worries                                    
treat               0.309           0.332  
                   (1.91)          (1.91)  

treat               0.309           0.332  
                   (1.91)          (1.91)  
--------------------------------------------
N                     991             874  
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

See how the coefficient of PerceivedInequality and the constant are missing, while treat shows up twice?

I can’t even run this code:

Code:

esttab model3 model4, keep(worries:) drop(_cons HNE1_other age sex2)
coefficient _cons not found
r(111);

While this code works fine:

Code:

esttab model3, keep(worries:) drop(_cons)

----------------------------
                      (1)  
             manipulati~1  
----------------------------
worries                    
treat               0.309  
                   (1.91)  

PerceivedI~y        0.279***
                   (4.22)  
----------------------------
N                     991  
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Any idea on what is going on here?

Last edited by Dan Obst; 20 Jan 2025, 10:18.


Richard Williams

Join Date: Apr 2014

Posts: 4870
#2

20 Jan 2025, 16:17

A replicable example would help. Without one, my guess is it is because worries is affected by both a latent variable and an observed variable. I think your code could be rewritten as

Code:

gsem (treat -> PerceivedInequality) ///
(PerceivedInequality -> manipulationcheck1 manipulationcheck2 treat) ///
(worries <- PerceivedInequality) ///
, nocapslatent latent(PerceivedInequality)

The output you are getting seems consistent with that.

If you can provide a replicable example (even if it is with fake data) it may be possible to come up with a better answer.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 18.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam

Dan Obst

Join Date: Jan 2025
Posts: 3
#3

20 Jan 2025, 23:59

I’m not sure about the rewritten code, because it now specifies a nonrecursive system, which is not allowed.

Look at this replicable example. From my understanding #1 - #4 should result in equal outputs:

Code:

sysuse auto
// #1 splits trunk
gsem (foreign -> Size) ///
(Size -> length headroom) ///
(trunk <- Size foreign) ///
, nocapslatent latent(Size)

// #2 works
gsem (Size <- foreign) ///
(Size -> length headroom) ///
(trunk <- Size foreign) ///
, nocapslatent latent(Size)

// #3 works
gsem (Size -> length headroom) ///
(Size <- foreign) ///
(trunk <- Size foreign) ///
, nocapslatent latent(Size)

// #4 works
gsem (Size -> length headroom) ///
(foreign -> Size) ///
(trunk <- Size foreign) ///
, nocapslatent latent(Size)

Output:

Code:

// #1
---------------------------------------------------------------------------------
                | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
length          |
           Size |          1  (constrained)
          _cons |   195.2516    2.62974    74.25   0.000     190.0974    200.4058
----------------+----------------------------------------------------------------
headroom        |
           Size |   .0308984   .0051064     6.05   0.000     .0208901    .0409068
          _cons |   3.219393   .1007595    31.95   0.000     3.021908    3.416878
----------------+----------------------------------------------------------------
trunk           |
        foreign |   3.821653   1.607966     2.38   0.017     .6700972     6.97321
----------------+----------------------------------------------------------------
Size            |
        foreign |  -24.61892   4.777411    -5.15   0.000    -33.98248   -15.25537
----------------+----------------------------------------------------------------
trunk           |
           Size |   .2909373   .0583336     4.99   0.000     .1766055    .4052691
          _cons |      14.75   .5497751    26.83   0.000     13.67246    15.82754
----------------+----------------------------------------------------------------
     var(e.Size)|    183.886   57.51064                      99.61717    339.4403
----------------+----------------------------------------------------------------
   var(e.length)|   178.5844   44.50906                      109.5712    291.0655
 var(e.headroom)|   .4095916   .0745267                      .2867291    .5851003
    var(e.trunk)|   .1521919   2.760398                      5.54e-17    4.18e+14
---------------------------------------------------------------------------------



// #2 #3 #4

---------------------------------------------------------------------------------
                | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
length          |
           Size |          1  (constrained)
          _cons |   195.2516    2.62974    74.25   0.000     190.0974    200.4058
----------------+----------------------------------------------------------------
headroom        |
           Size |   .0308984   .0051064     6.05   0.000     .0208901    .0409068
          _cons |   3.219393   .1007595    31.95   0.000     3.021908    3.416878
----------------+----------------------------------------------------------------
trunk           |
        foreign |   3.821653   1.607966     2.38   0.017     .6700972     6.97321
           Size |   .2909373   .0583336     4.99   0.000     .1766055    .4052691
          _cons |      14.75   .5497751    26.83   0.000     13.67246    15.82754
----------------+----------------------------------------------------------------
Size            |
        foreign |  -24.61892   4.777411    -5.15   0.000    -33.98248   -15.25537
----------------+----------------------------------------------------------------
     var(e.Size)|    183.886   57.51064                      99.61717    339.4403
----------------+----------------------------------------------------------------
   var(e.length)|   178.5844   44.50906                      109.5712    291.0655
 var(e.headroom)|   .4095916   .0745267                      .2867291    .5851003
    var(e.trunk)|   .1521919   2.760398                      5.54e-17    4.18e+14
---------------------------------------------------------------------------------

Last edited by Dan Obst; 21 Jan 2025, 00:03.


Richard Williams

Join Date: Apr 2014

Posts: 4870
#4

21 Jan 2025, 00:48

I think my last answer is a bit off. Try this instead:

Code:

gsem (treat -> PerceivedInequality worries) ///
(PerceivedInequality -> manipulationcheck1@1 manipulationcheck2 worries) ///
, nocapslatent latent(PerceivedInequality)

With your code, I think Stata may be first treating worries as one of the 3 observed indicators of the latent variable, and does an equation for that. But, then it sees that treat also affects worries, and does an equation for that. The way I tweaked the code, the full equation for worries gets done first.

This works with fake data I created. If you run it with your data, you hopefully get the same numbers as before, albeit with the equations in a different order.

If it is still not working the way you want, try posting a replicable example.

Incidentally, I agree that Stata should have just given you the equations you want. But your model may be a bit odd, in that an observed indicator of a latent variable is also affected by another observed variable. Back when I was using LISREL, I don't think you could have specified a model like this.


Richard Williams

Join Date: Apr 2014
Posts: 4870
#5

21 Jan 2025, 01:42

On Post #3 (which I did not see earlier) --

It seems bizarre to me that #1 does not do the equations the way you want. But, using the strategy in #4, putting the trunk equation first,

Code:

clear all
sysuse auto
// # 5 works
gsem (trunk <- Size foreign) ///
(foreign -> Size) ///
(Size -> length@1 headroom) ///
, nocapslatent latent(Size)

you again get what you want:

Code:

Log likelihood = -576.27026

 ( 1)  [length]Size = 1
---------------------------------------------------------------------------------
                | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
trunk           |
        foreign |   3.821653   1.607966     2.38   0.017     .6700972     6.97321
           Size |   .2909373   .0583336     4.99   0.000     .1766055    .4052691
          _cons |      14.75   .5497751    26.83   0.000     13.67246    15.82754
----------------+----------------------------------------------------------------
length          |
           Size |          1  (constrained)
          _cons |   195.2516    2.62974    74.25   0.000     190.0974    200.4058
----------------+----------------------------------------------------------------
headroom        |
           Size |   .0308984   .0051064     6.05   0.000     .0208901    .0409068
          _cons |   3.219393   .1007595    31.95   0.000     3.021908    3.416878
----------------+----------------------------------------------------------------
Size            |
        foreign |  -24.61892   4.777411    -5.15   0.000    -33.98248   -15.25537
----------------+----------------------------------------------------------------
     var(e.Size)|    183.886   57.51064                      99.61717    339.4403
----------------+----------------------------------------------------------------
    var(e.trunk)|   .1521919   2.760398                      5.54e-17    4.18e+14
   var(e.length)|   178.5844   44.50906                      109.5712    291.0655
 var(e.headroom)|   .4095916   .0745267                      .2867291    .5851003
---------------------------------------------------------------------------------

None of the results is wrong -- all of the numbers are the same, albeit with different orderings -- but it is weird that seemingly trivial changes in syntax cause the layout of results to change. I wonder if the inconsistencies would occur if all variables are observed or if everything would come out consistently.
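One way to confirm that only the layout differs is to compare individual coefficients across two of the specifications from #3. The following is only a sketch on the auto data. The stored-estimate names spec1 and spec4 and the scalar names are hypothetical, and it assumes that a reference such as _b[trunk:Size] resolves correctly even when the trunk equation is displayed in two blocks.

```stata
clear all
sysuse auto

// specification #1 from post #3 (the one whose output splits trunk)
quietly gsem (foreign -> Size) ///
    (Size -> length headroom) ///
    (trunk <- Size foreign) ///
    , nocapslatent latent(Size)
estimates store spec1          // hypothetical name
scalar b1_size    = _b[trunk:Size]
scalar b1_foreign = _b[trunk:foreign]

// specification #4 from post #3 (trunk shown as a single block)
quietly gsem (Size -> length headroom) ///
    (foreign -> Size) ///
    (trunk <- Size foreign) ///
    , nocapslatent latent(Size)
estimates store spec4          // hypothetical name

// relative differences; values close to zero mean the estimates agree
display reldif(b1_size, _b[trunk:Size])
display reldif(b1_foreign, _b[trunk:foreign])
```

If the two displayed values are essentially zero, the specifications differ only in how the table is arranged.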


Dan Obst

Join Date: Jan 2025

Posts: 3
#6

21 Jan 2025, 04:57

I’d be fine if this only affected the layout — as you said, the numbers are the same. The problem arises when esttab tries to interpret gsem’s output and fails (see post #1). It seems that esttab relies on a consistent table layout, which I think is reasonable.

Thankfully, I found a workaround in post #3. However, I don’t think this behavior is intentional, and it should probably be fixed.
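For reference, the workaround can be sketched on the auto data as follows. This assumes the community-contributed estout package is installed (ssc install estout). The stored-estimate name m_auto is hypothetical, and the ordering is #4 from post #3.

```stata
clear all
sysuse auto

// ordering #4 from post #3: the measurement paths come first,
// so the trunk equation is reported as one block
quietly gsem (Size -> length headroom) ///
    (foreign -> Size) ///
    (trunk <- Size foreign) ///
    , nocapslatent latent(Size)
estimates store m_auto         // hypothetical name

// keep only the trunk equation, analogous to keep(worries:) in post #1
esttab m_auto, keep(trunk:)
```

With this ordering, the table should list foreign, Size, and _cons once each under trunk, provided the equation is not split.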

Regarding your comment on my seemingly odd model in #4:
I’m trying to model the second structure in this diagram: https://link.springer.com/article/10...51-0/figures/2 (as proposed in this paper: https://link.springer.com/article/10...428-013-0351-0.)

The idea is that an experimental condition might aim to change something (in this case, perceived inequality) to see how it affects something else (in this case, worries). Traditionally, one would test separately whether the treatment affects the manipulation check (in this case, the two Likert-scale questions on inequality) and whether the treatment influences worries. However, it’s possible that the treatment does not only affect perceived inequality but also, for instance, perceived social mobility. While the paper mentioned above explicitly addresses mental states (rather than perceptions), I believe the same reasoning applies to perceptions.

Any comments on the model and theory would be greatly appreciated, as my advisor and colleagues are not familiar with SEM or this particular theoretical framework. (The perks of doing a PhD at a highly interdisciplinary institute.)

Richard Williams

Join Date: Apr 2014

Posts: 4870
#7

21 Jan 2025, 07:28

So if I follow you, the difference between what you are doing and what the diagrams show is that you have two manipulation checks rather than one. You therefore treat manipulation check as a latent variable with 2 indicators.

Would it do violence to your work if you only used one manipulation check, which I think would make your model identical to the diagrams? Or, keep both checks, but don't try to make a latent variable out of them, which means you'll add some paths to the diagrams you show? Have you actually tested whether the latent variable is justified? For your model as shown, I think using a latent variable only saves you 1 df, but maybe your models get more complicated.

You shouldn't choose your model based on what gives you the most convenient output, but at the same time I wouldn't just assume the two checks make for a single valid latent variable.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 646
#8

Yesterday, 18:24

Dan Obst found a bug in gsem, triggered by specifying a latent endogenous variable before any of the observed endogenous variables. The elements of one of the observed endogenous variables' equations should not have been split as shown in his example. We hope to fix this bug in Stata 18 soon.

Last edited by Jeff Pitblado (StataCorp); Yesterday, 18:36.
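
Until a fix is available, a fitted model can be checked for the split by looking for an equation name that reappears in e(b) after a different equation. The following is a minimal sketch, not an official diagnostic. It assumes that the split shown in the output table is also reflected in the column equation names of e(b), which this thread does not confirm. All local macro names are hypothetical.

```stata
clear all
sysuse auto

// specification #1 from post #3, which triggered the split
quietly gsem (foreign -> Size) ///
    (Size -> length headroom) ///
    (trunk <- Size foreign) ///
    , nocapslatent latent(Size)

// equation name of every column of e(b), in order
local eqs : coleq e(b)

local seen       // equation names whose block has already started
local prev       // equation name of the previous column
local split      // equation names that start a second block

foreach eq of local eqs {
    if "`eq'" != "`prev'" {
        // a new block starts here; was this name seen before?
        local again : list eq in seen
        if `again' local split `split' `eq'
        local seen `seen' `eq'
    }
    local prev `eq'
}

display as text "equations appearing in more than one block: `split'"
```

An empty list after the colon would suggest that every equation occupies a single contiguous block in e(b).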