Multiple gsem estimates not combining in final model

Jiannbin Shiao

Join Date: Aug 2020

Posts: 9
#1

11 Jan 2022, 12:59

Hi Statalist,

I’m trying to run a gsem model for a multilevel path analysis in Stata 15.1 on Windows 10. My data has 4,888 observations (survey respondents from a multistage sample), and my target model has 19 observed variables and 2 latent variables.

I use my primary latent variable (Cultcap) as an exogenous variable in each of three paths, and I use the second latent variable to model a second level (the sample of schools that the respondents attended in adolescence) as a random intercept in each path. The outcome of the first path is continuous (ctr_cumgpa), the outcome of the second path is ordinal (educ_byw5), and the outcome of the final path is continuous (income_h5).

Specifically, I’m trying to estimate the following model:

Code:

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress) ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M1[schools]@1, ologit) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M1[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1)

This model does not converge by itself. Instead, Stata only gets through fitting the fixed-effects model, tries to refine starting values but returns zeros for log likelihood, and lastly returns the error message “initial values not feasible” when trying to fit the full model.

Following the recommendations in the Stata manual on “Convergence problems…” (semintro12.pdf), I’ve temporarily simplified the model, stored the estimates from simpler models (i.e., “matrix b = e(b)” after convergence), and used them as starting values for more complex models (i.e., the “from(b)” option). The manual also recommends trying alternative integration methods, fewer integration points, and alternative starting-value-calculation methods, as needed. By trying different subsets of each path, supplemented with alternative settings in one case, I’ve gotten all three paths to converge in separate models. Each of the three models pairs the measurement model with a single path. Here are the final models for each path:

Code:

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1) ///
    from(b)
matrix b = e(b)

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M1[schools]@1, ologit) ///
    if sample_id==1, latent(Cultcap M1) ///
    from(c)
matrix c = e(b)

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M1[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1) ///
    from(d)
matrix d = e(b)

My problem arises when I try to combine all three estimates into the full path analysis:

Code:

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress) ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M1[schools]@1, ologit) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M1[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1) ///
    from(b c d)

At this point, Stata immediately returns the error: “initial vector: duplicate entries for acad_pcact:Cultcap found”, as well as return code 507: “name conflict”. It sounds as if Stata cannot estimate my primary latent variable in more than one path. Is that the right interpretation?

In full disclosure, I’ve also run a version of the full, multilevel path analysis minus the primary latent variable, and that 3-path model converges without needing any intermediate stages of storing estimates for use as starting values. I’ve also run a version that separates (1) the measurement model to a separate model, which I used to generate predicted values for the primary latent variable and (2) a 3-path model that substitutes the predicted values variable for the latent variable in each path. To run the 3-path model, I used the Stata manual recommendations to get each path to converge separately, before trying to combine all three estimates. The result was that Stata returned a similar error: “initial vector: duplicate entries for /:var(M1[schools]) found.” In brief, I believe the problem is confined to combining my latent variable estimates.

Is gsem unable to combine matrices with identically named latent variables, whereas it can handle identically named observed variables?

Is the solution simply renaming the latent variables uniquely for each path to be combined (i.e., Cultcap1, Cultcap2, Cultcap3, M1, M2, M3)? Is it really necessary to inflate the number of variables in the overall model?

Or is there a better solution? Is there any mistake in my code causing these problems? Or is the complexity of the model just too much for the data or for gsem?

Thanks,
J
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

11 Jan 2022, 13:36

Well, I am not certain of my answer here. I think you can use Cultcap in all three paths without problems. But I don't think it even makes sense to use M1[schools] the way your code attempts. You are, with that code, constraining the school-level random intercept value to be the same in all three paths. I don't think that constraint would make any sense even if all three paths used the same regression link, but I think it is even more of a stretch given that two of the paths involve linear regression and one is ordinal logistic. It seems like an extremely far-fetched constraint and I cannot visualize any real-world situation in which it would make sense to do that. Have you tried using three separate latent variables for the random intercepts, but keeping Cultcap as is? (I realize the error message you got singles out Cultcap, but error messages aren't always precise, particularly when Stata encounters a complicated model that it cannot fully make sense of. It does its best to sort out its confusion, but sometimes misunderstands the source of the problem.)

Jiannbin Shiao

Join Date: Aug 2020
Posts: 9

12 Jan 2022, 11:49

Hi Clyde, thanks for your reply! I agree with you that how I'm coding the random intercept, M1[schools], makes little sense.

I tried what you suggested (i.e., keeping Cultcap as is, while using three separate random intercepts, one for each path) with the following code (otherwise the same as in my original post):

Code:

* No change to first block of commands
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    if sample_id==1, latent(Cultcap M1) ///
    from(b)
matrix b = e(b)

*Changes begin here
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    if sample_id==1, latent(Cultcap M2) ///
    from(c)
matrix c = e(b)

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M3) ///
    from(d)
matrix d = e(b)

gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1 M2 M3) ///
    from(b c d)

Unfortunately, I got the same errors: “initial vector: duplicate entries for acad_pcact:Cultcap found”, along with return code 507: “name conflict.”

I then tried using both separate random intercepts and separate latent variables (Cultcap) for each path (code below), and I got a somewhat different error: "initial vector: duplicate entries for acad_pcact:_cons found", along with return code 507: “name conflict.”

Code:

gsem (Cultcap1 -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap1 ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    if sample_id==1, latent(Cultcap1 M1) ///
    from(b)
matrix b = e(b)

gsem (Cultcap2 -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (educ_byw5 <- Cultcap2 ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    if sample_id==1, latent(Cultcap2 M2) ///
    from(c)
matrix c = e(b)

gsem (Cultcap3 -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (income_h5 <- Cultcap3 ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap3 M3) ///
    from(d)
matrix d = e(b)

gsem (Cultcap1 -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (Cultcap2 -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (Cultcap3 -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap1 ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    (educ_byw5 <- Cultcap2 ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    (income_h5 <- Cultcap3 ctr_cumgpa ib1.educ_byw5) ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap1 Cultcap2 Cultcap3 M1 M2 M3) ///
    from(b c d)

It seems that both error messages single out the first indicator (acad_pcact) in my primary latent variable Cultcap, but they single out slightly different cells (?) in the matrices that I'm asking gsem to combine.
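
One way to see which "cells" collide is to compare the column names of the stored vectors. The following is only a minimal sketch under stated assumptions: the matrices b and c here are toy stand-ins with made-up values and a handful of column names modeled on the error messages above, not the real e(b) vectors, and the local macro names are illustrative. It should be run as a do-file so the local macros persist between lines.

```stata
* Toy stand-ins for two stored e(b) vectors (values are made up);
* in practice b and c would come from "matrix b = e(b)" after each gsem fit.
matrix b = (1, 0.2, 0.7)
matrix colnames b = acad_pcact:Cultcap acad_pcact:_cons ctr_cumgpa:Cultcap
matrix c = (1, 0.3, 0.4)
matrix colnames c = acad_pcact:Cultcap acad_pcact:_cons educ_byw5:Cultcap

* Collect the full column names (equation:name) of each vector
local bnames : colfullnames b
local cnames : colfullnames c

* Names that appear in both vectors, i.e. candidates for "duplicate entries"
local shared : list bnames & cnames
display "`shared'"
```

With real estimates, any name listed in the intersection is a parameter that more than one starting-value matrix tries to supply.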

I suspect you're still right about using separate random intercepts, but I don't think gsem is even getting to them when I try to combine all three estimates.


Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

12 Jan 2022, 12:56

I have to say that I do not understand what is going on here. I hope somebody else sees the problem and can propose a solution.
Roman Mostazir

Join Date: Apr 2014

Posts: 874
#5

12 Jan 2022, 16:50

I don't understand the code in the original post (#1). There is an unbalanced parenthesis after this portion (my remarks are in red). This gets more confusing when you said in #2 that you successfully ran the same code using from(b).

Code:

(income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5)
    reg_migrant child_lt6hh // NO PARENTHESIS HERE
    ib4.racesingle i.female immigpar polstopbef18
    pars_postgradboth
    M1[schools]@1, regress)

Try correcting these. Also, try options such as: diff technique(bhhh 10 nr)

Please post a data example using -dataex-, as this will help others to investigate the problem more efficiently.

Roman

Jiannbin Shiao

Join Date: Aug 2020
Posts: 9

13 Jan 2022, 13:02

Hi Clyde, even if your suggestions didn't solve my problem, thanks for trying! I appreciate it!

Hi Roman, I'm afraid the imbalanced parenthesis is a typo that I accidentally introduced by trying to simplify messy code for posting to Statalist. The actual code that I run uses macros that I thought might be confusing. I've included the corrected code below, now updated with (1) Clyde's recommendation to use separate random intercepts for each path but (2) still without macros. If you think showing the code with macros would be clearer, I can provide that. The corrected code also adds your recommended option "diff technique(bhhh 10 nr)" to the final model for combining the estimates from each path.

Code:

/* Last model in series of increasingly more complex models for "converging"
    1st path (outcome: ctr_cumgpa) */
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    if sample_id==1, latent(Cultcap M1) ///
    from(b)
matrix b = e(b)

/* Last model in series of increasingly more complex models for "converging"
    2nd path (outcome: educ_byw5) */
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    if sample_id==1, latent(Cultcap M2) ///
    from(c)
matrix c = e(b)

/* Last model in series of increasingly more complex models for "converging"
    3rd path (outcome: income_h5) */
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5 ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M3) ///
    from(d)
matrix d = e(b)

/* Model for combining estimates from 3 paths */
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5 ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1 M2 M3) ///
    diff technique(bhhh 10 nr) ///
    from(b c d)

Unfortunately, this updated code with your recommended option still generates the original errors: "initial vector: duplicate entries for acad_pcact:Cultcap found", along with return code 507: “name conflict.”

After reading the description of the difficult and technique options (in rmaximize.pdf), I'm not sure how these would help combine the latent variable estimates, as they seem to be for helping with model convergence, which I believe I had solved with the Stata solution of intermediate stages. However, I did try re-running the target model in full (without intermediate stages) but amended with your recommended option.

Code:

/* Full model with maximization options and without intermediate stages */
gsem (Cultcap -> acad_pcact parcontrol parexp_educ educ_effort) ///
    (ctr_cumgpa <- Cultcap ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        par_married anyrelig ///
        M1[schools]@1, regress)  ///
    (educ_byw5 <- Cultcap ctr_cumgpa ///
        ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi ///
        M2[schools]@1, ologit) ///
    (income_h5 <- Cultcap ctr_cumgpa ib1.educ_byw5 ///
        reg_migrant child_lt6hh ///
        ib4.racesingle i.female immigpar polstopbef18 ///
        pars_postgradboth ///
        M3[schools]@1, regress) ///
    if sample_id==1, latent(Cultcap M1 M2 M3) ///
    diff technique(bhhh 10 nr)

Unfortunately, this replicates the convergence problem that the intermediate stages had solved. From my original post: Stata only gets through fitting the fixed-effects model, tries to refine starting values but returns zeros for log likelihood, and lastly returns the error message “initial values not feasible” when trying to fit the full model.

I now realize that the intermediate stages might solve the convergence problem for each path while also creating the problem with combining latent-variable estimates across paths. Are there other maximization options that might work and circumvent the need for intermediate stages?

Lastly, thanks for recommending dataex, which I haven't used. Here is the code that I ran:

Code:

randomtag if sample_id==1, count(10) gen(pick)
dataex acad_pcact parcontrol parexp_educ educ_effort ///
    ctr_cumgpa educ_byw5 income_h5 ///
    schools ///
    racesingle female immigpar polstopbef18 region pars_edhi ///
    par_married anyrelig ///
    reg_migrant child_lt6hh ///
    pars_postgradboth /// 
    if pick, var

And here is the output:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(acad_pcact parcontrol parexp_educ educ_effort ctr_cumgpa educ_byw5) double income_h5 str3 schools float(racesingle female immigpar polstopbef18) byte region float(pars_edhi par_married anyrelig reg_migrant child_lt6hh pars_postgradboth)
2 1 1 1  -.6980439 1  45000 "047" 4 1 0 0 3 2 1 1 0 0 0
0 1 2 1  -1.797044 0  35000 "058" 5 0 0 0 2 1 1 1 0 0 0
1 1 2 2   1.437956 3 250000 "031" 4 0 0 0 3 2 1 1 0 1 0
3 1 1 0   -.192044 1  22500 "008" 4 1 0 0 2 0 1 1 0 1 0
0 0 0 1    .471956 1 125000 "269" 5 1 0 0 4 1 0 1 0 1 0
2 3 2 1   .4379561 2  45000 "020" 3 1 0 0 3 3 1 1 1 0 1
3 0 2 2   .6489561 1  35000 "057" 4 1 0 0 3 1 1 1 0 0 0
3 1 2 2  1.0489562 2  35000 "077" 2 1 1 0 1 2 1 1 0 0 0
1 1 2 1 -.21404386 2  45000 "050" 3 1 0 0 3 1 0 1 0 0 0
2 1 2 1   .6509562 3 125000 "270" 4 1 0 0 4 3 1 1 0 1 1
end
label values parexp_educ parexp_lab
label def parexp_lab 0 "No high-educ expect", modify
label def parexp_lab 1 "High hs-expect only", modify
label def parexp_lab 2 "High coll expect", modify
label values educ_effort edeff_lab
label def edeff_lab 0 "I don't or never try hard", modify
label def edeff_lab 1 "I try hard enough", modify
label def edeff_lab 2 "I try hard to do my best", modify
label values educ_byw5 r_educ_lab
label def r_educ_lab 0 "LT HS", modify
label def r_educ_lab 1 "HS only", modify
label def r_educ_lab 2 "BA/BS only", modify
label def r_educ_lab 3 "Any Postgrad", modify
label values racesingle rcs_lab_MCO
label def rcs_lab_MCO 2 "Asian/PI", modify
label def rcs_lab_MCO 3 "Black/African", modify
label def rcs_lab_MCO 4 "White", modify
label def rcs_lab_MCO 5 "Multiracial", modify
label values region region_lab
label def region_lab 1 "West", modify
label def region_lab 2 "Midwest", modify
label def region_lab 3 "South", modify
label def region_lab 4 "Northeast", modify
label values pars_edhi parsedhi_lab
label def parsedhi_lab 0 "LT HS only", modify
label def parsedhi_lab 1 "Any HS", modify
label def parsedhi_lab 2 "Any BA/BS", modify
label def parsedhi_lab 3 "Any Postgrad", modify
label var income_h5 "S4Q1 INCOME PERS EARNINGS 16/17-W5" 
label var region "region"

I hope this helps clarify my data and the problems that I'm having. Thanks in advance for any suggestions!


Roman Mostazir

Join Date: Apr 2014

Posts: 874
#7

13 Jan 2022, 16:08

Originally posted by Jiannbin Shiao

The actual code that I run uses macros that I thought might be confusing..... If you think showing the code with macros would be more clear, I can provide that.

- It's not what I think; it is forum advice point 12.1 in the FAQ, which reads, "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

- Thanks for using -dataex-, but an example with just 10 observations is useless for a multilevel model if anyone wants to try your data and commands.

- The data example does not have a variable called 'sample_id', while all your commands are conditioned on 'sample_id'. It's not going to help.

I am not sure if I will have time to follow up on this post anymore, but the above concerns are for you to note and to inform others who might want to chime in and help.

Roman

Jiannbin Shiao

Join Date: Aug 2020
Posts: 9

14 Jan 2022, 13:10

Hi Roman,

I apologize if my relative inexperience with Statalist is making responding to my post more challenging than it should be. Thanks for your suggestions, especially about using the maximization options.

How much data would I need to share to make this useful? I’m unsure whether I can post much more of the data, as it is confidential and covered by a data-use agreement. If you, or anyone else, can give me a rough number of observations that would suffice, I can investigate whether sharing them is permissible.

As for providing the exact executable code, I’m hoping the code below clarifies what I’m trying to do.

Best,
J

Code:

/* Macros */ 
local CCVarsNofactor acad_pcact parcontrol parexp_educ educ_effort
local ExogVars ib4.racesingle i.female immigpar polstopbef18 ib1.region ib1.pars_edhi 
local GPAVars par_married anyrelig 
local IncomeVars reg_migrant child_lt6hh ib1.educ_byw5 ctr_cumgpa 
local ExogVarsRed ib4.racesingle i.female immigpar polstopbef18 pars_postgradboth 
local DepVar income_h5
local regproc regress
local SampleRestrictVar "sample_id==1" 

/* Run path model for latent variable alone */
            
gsem (Cultcap -> `CCVarsNofactor') ///
    /* (ctr_cumgpa <- Cultcap `ExogVars' `GPAVars' M1[schools]@1, `regproc') */ ///
    /* (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M1[schools]@1, ologit) */ ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap)
matrix a = e(b)
estat ic

/* Run path for Latent & GPA path alone */
            
* Use saved estimates as starting values in next stage of model (partial path-> GPA)
gsem (Cultcap -> `CCVarsNofactor') ///
    (ctr_cumgpa <- Cultcap ib4.racesingle /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') ///
    /* (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M1[schools]@1, ologit) */ ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M1) ///
    from(a)
matrix b = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (partial path-> GPA)
gsem (Cultcap -> `CCVarsNofactor') ///
    (ctr_cumgpa <- Cultcap ib4.racesingle i.female /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') ///
    /* (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M1[schools]@1, ologit) */ ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M1) intmethod(ghermite) startvalues(zero) intpoints(3) ///
    from(b)
matrix b = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (partial path-> GPA)
gsem (Cultcap -> `CCVarsNofactor') ///
    (ctr_cumgpa <- Cultcap ib4.racesingle i.female /* `ExogVars' */ `GPAVars' M1[schools]@1, `regproc') ///
    /* (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M1[schools]@1, ologit) */ ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M1) intmethod(ghermite) startvalues(zero) intpoints(3) ///
    from(b)
matrix b = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (full path-> GPA)
gsem (Cultcap -> `CCVarsNofactor') ///
    (ctr_cumgpa <- Cultcap `ExogVars' `GPAVars' M1[schools]@1, `regproc') ///
    /* (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M1[schools]@1, ologit) */ ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M1) intmethod(ghermite) startvalues(zero) intpoints(3) ///
    from(b)
matrix b = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (full path-> GPA)
gsem (Cultcap -> `CCVarsNofactor') ///
    (ctr_cumgpa <- Cultcap `ExogVars' `GPAVars' M1[schools]@1, `regproc') ///
    /* (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M1[schools]@1, ologit) */ ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M1) ///
    from(b)
matrix b = e(b)
estat ic
            
/* Run path for Latent & Educ path alone */
            
* Use saved estimates as starting values in next stage of model (partial path-> Educ)
gsem (Cultcap -> `CCVarsNofactor') ///
    /* (ctr_cumgpa <- Cultcap /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') */ ///
    (educ_byw5 <- Cultcap /* immigpar polstopbef18 `ExogVars' */ M2[schools]@1, ologit) ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M2) ///
    from(a)
matrix c = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (partial path-> Educ)
gsem (Cultcap -> `CCVarsNofactor') ///
    /* (ctr_cumgpa <- Cultcap /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') */ ///
    (educ_byw5 <- Cultcap /* ctr_cumgpa */ `ExogVars' M2[schools]@1, ologit) ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M2) ///
    from(c)
matrix c = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (full path-> Educ)
gsem (Cultcap -> `CCVarsNofactor') ///
    /* (ctr_cumgpa <- Cultcap /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') */ ///
    (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M2[schools]@1, ologit) ///
    /* (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M1[schools]@1, `regproc') */ ///
    if `SampleRestrictVar', latent(Cultcap M2) ///
    from(c)
matrix c = e(b)
estat ic
            
/* Run path for Latent & Income path alone */
            
* Use saved estimates as starting values in next stage of model (partial path -> Income)
gsem /* (Cultcap -> `CCVarsNofactor') */ ///
    /* (ctr_cumgpa <- Cultcap /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') */ ///
    /* (educ_byw5 <- Cultcap /* ctr_cumgpa `ExogVars' */ M1[schools]@1, ologit) */ ///
    (`DepVar' <- /* Cultcap */ `IncomeVars' `ExogVarsRed'  M3[schools]@1, `regproc') ///
    if `SampleRestrictVar', latent(M3) ///
    /* from(a) */
matrix d = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (full path -> Income)
gsem (Cultcap -> `CCVarsNofactor') ///
    /* (ctr_cumgpa <- Cultcap /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') */ ///
    /* (educ_byw5 <- Cultcap /* ctr_cumgpa `ExogVars' */ M1[schools]@1, ologit) */ ///
    (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' /* */ M3[schools]@1, `regproc') ///
    if `SampleRestrictVar', latent(Cultcap M3) intmethod(ghermite) startvalues(zero) intpoints(3) ///
    from(a d)
matrix d = e(b)
estat ic
            
* Use saved estimates as starting values in next stage of model (full path -> Income)
gsem (Cultcap -> `CCVarsNofactor') ///
    /* (ctr_cumgpa <- Cultcap /* `ExogVars' `GPAVars' */ M1[schools]@1, `regproc') */ ///
    /* (educ_byw5 <- Cultcap /* ctr_cumgpa `ExogVars' */ M1[schools]@1, ologit) */ ///
    (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' /* */ M3[schools]@1, `regproc') ///
    if `SampleRestrictVar', latent(Cultcap M3)  ///
    from(d)
matrix d = e(b)
estat ic
            
/* Combine estimates in multi-path model */
            
* Use saved estimates as starting values in next stage of model (3 paths -> full model)
gsem (Cultcap -> `CCVarsNofactor') ///
    (ctr_cumgpa <- Cultcap `ExogVars' `GPAVars' M1[schools]@1, `regproc') ///
    (educ_byw5 <- Cultcap ctr_cumgpa `ExogVars' M2[schools]@1, ologit) ///
    (`DepVar' <- Cultcap `IncomeVars' `ExogVarsRed' M3[schools]@1, `regproc') ///
    if `SampleRestrictVar', latent(Cultcap M1 M2 M3) ///
    diff technique(bhhh 10 nr) ///
    from(b c d)
estat ic


Jiannbin Shiao

Join Date: Aug 2020

Posts: 9
#9

20 Jan 2022, 11:18

I was able to combine the three sets of estimates in a gsem model that converged after 2 hours and 48 minutes. Here's a final report on my problem, in case this solution might be useful for anyone else:

I succeeded by combining tips from Clyde, Roman, and a specialist in maximum likelihood issues, who is a friend of a colleague of mine. They noticed that gsem was saving estimates for each of my latent variable’s indicators from each path, with the result that when I told gsem to use all three matrices for the full run [i.e., “from(b c d)”], it didn't know which of the estimates to use. To solve this problem, I extracted subvectors of matrices b and c without the Cultcap components before combining them with matrix d [i.e., “from(b1 c1 d)”].
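
The exact extraction commands are not shown in this thread, so the following is only a sketch of one way the subvectors might be built. It assumes toy stand-in matrices with made-up values and a few illustrative column names; dropdups is a hypothetical helper written for this example, not a built-in Stata command. It copies from one vector only the columns whose full name does not already appear in a reference vector.

```stata
* Toy stand-ins for the stored e(b) vectors (values are made up);
* the real b, c, and d would come from "matrix ... = e(b)" after each fit.
matrix b = (1, 0.2, 0.7)
matrix colnames b = acad_pcact:Cultcap acad_pcact:_cons ctr_cumgpa:Cultcap
matrix c = (1, 0.3, 0.4)
matrix colnames c = acad_pcact:Cultcap acad_pcact:_cons educ_byw5:Cultcap
matrix d = (1, 0.1, 0.9)
matrix colnames d = acad_pcact:Cultcap acad_pcact:_cons income_h5:Cultcap

* Hypothetical helper: keep the columns of src whose full name
* (equation:name) is not found among the columns of ref; store them in out.
capture program drop dropdups
program define dropdups
    args src ref out
    local srcnames : colfullnames `src'
    local refnames : colfullnames `ref'
    capture matrix drop `out'
    local j = 0
    foreach nm of local srcnames {
        local ++j
        * posof returns 0 when the name is absent from the reference list
        local dup : list posof "`nm'" in refnames
        if `dup' == 0 {
            matrix `out' = nullmat(`out'), `src'[1, `j'..`j']
        }
    }
end

dropdups b d b1
dropdups c d c1
matrix list b1
matrix list c1
```

In this sketch the shared measurement-model parameters are supplied once, by d, while b1 and c1 carry only the parameters unique to their own paths, which is the situation that from(b1 c1 d) relies on.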

They also noticed that my third path's dependent variable (income in dollars) might not be "very numerically well-conditioned" and recommended logging it or recasting it in thousands or ten-thousands. Recasting income in ten-thousands was what did the trick. Neither logging income nor recasting it in thousands worked, but I might re-try the log income version giving it more time to converge.
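
For anyone wanting to try the same rescaling, here is a minimal sketch. It assumes a toy dataset built from a few income_h5 values in the -dataex- listing earlier in the thread; income10k and ln_income are hypothetical variable names chosen for this example. Note that clear discards whatever data are in memory, so with real data the first block would be skipped.

```stata
* Toy data only: a few income_h5 values from the dataex example above.
* "clear" drops the data in memory; omit this block when using real data.
clear
input double income_h5
45000
35000
250000
22500
125000
end

* Income in units of 10,000 dollars (hypothetical variable name)
generate double income10k = income_h5 / 10000

* Logged income as the alternative mentioned above (defined for positive values)
generate double ln_income = ln(income_h5) if income_h5 > 0

* Compare the scales of the three versions
summarize income_h5 income10k ln_income
```

With the linear rescaling, the income equation would use income10k in place of income_h5. Its coefficients are then those of the dollar-scale model divided by 10,000, so the substantive results are unchanged while the magnitudes stay closer to those of the other equations.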
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#10

20 Jan 2022, 12:02

Thanks for the follow-up.