Checking for global maximum in LCA (latent class analysis)

Giulia Vivaldi

05 Jan 2023, 05:15

Hi team,

I am looking for help in assessing the usability of my LCA model, fit using gsem. I have 14,000 observations on around 25 variables that measure the presence of symptoms; most of these variables are binary or ordinal. An example of some of these variables is shown at the bottom of my post.

With some work, I have obtained models that converge with up to five classes, without having to impose constraints on any variables. I have no pre-existing hypotheses on the number of expected classes.

The model statistics recommended in Masyn's chapter suggest that the four-class or five-class model may provide the best fit. However, when I try to check whether the solutions from these models represent a global maximum of the likelihood function, I run into problems. I am testing for a global maximum by running the model 100 times, each time from a single random draw of starting values, using the following code based on posts by Weiwen Ng:

Code:

*Iteration log: four class
putexcel set iteration_log.xlsx, sheet(fourclass) modify
putexcel A1 = "Iteration" B1 = "Log Likelihood" C1 = "Converged"
set seed 123321

forvalues i = 1/100 {

    * Row 1 holds the column headers, so run i is written to row i + 1
    local j = `i' + 1

    gsem (lc_cough_m_b lc_sleep_m_b lc_memory_m_b lc_concen_m_b         ///
          lc_musc_pain_m_b lc_tastesmell_m_b lc_diarrhoea_m_b           ///
          lc_stomach_m_b lc_voice_m_b lc_hair_m_b lc_heart_m_b          ///
          lc_dizzy_m_b lc_sweat_m_b                                     ///
          EQ5D_self_m_revised EQ5D_mob_m_revised <-, logit)             /// Binary
         (mrc_m_revised PHQ4_final_m EQ5D_act_m EQ5D_pain_m <-, ologit) /// Ordinal
         (facit_score_m EQ5D_score_m <-)                                /// Gaussian
         , lclass(C4_ 4)                                                /// Fitting four classes
         startvalues(randompr, draws(1)) iterate(100)

    * Append this run's estimates to class4.ster
    estimates save class4, append

    * Log the run number, log likelihood, and convergence flag
    putexcel A`j' = `i'
    putexcel B`j' = `e(ll)'
    putexcel C`j' = `e(converged)'
}

putexcel close
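
Once the log has been written, the check itself comes down to how often the converged runs reach the same highest log likelihood. A minimal sketch of that tabulation, assuming iteration_log.xlsx (sheet fourclass) holds one row per completed run; the variable names, the scalar best_ll, and the 0.01 tolerance are illustrative choices, not anything prescribed by gsem:

```stata
* Sketch: summarise the random-start log written by the loop above.
* Assumes iteration_log.xlsx (sheet fourclass) has one row per completed run;
* the 0.01 tolerance is an arbitrary, illustrative choice.
preserve
import excel using iteration_log.xlsx, sheet("fourclass") cellrange(A2) clear

* Without -firstrow-, imported columns are named after their Excel letters
rename (A B C) (iteration ll converged)

* Keep converged runs only, then find the highest log likelihood among them
keep if converged == 1
summarize ll, meanonly
scalar best_ll = r(max)

* Number of converged runs that reach (approximately) that best value
count if abs(ll - scalar(best_ll)) < 0.01
restore
```

The usual reading, as I understand it, is that the more converged runs land on the same best log likelihood, the more plausible it is that this value is the global maximum rather than a local one.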

However, few if any of these runs converge, and I frequently end up with the following error, which breaks the loop:

Code:

cannot compute an improvement -- discontinuous region encountered
r(430);

Unsurprisingly, this problem tends to occur when obtaining the original convergence was a bit more of a struggle. For example, I obtained it by saving the parameters of a simpler model (i.e., matrix b4 = e(b)) and using those as starting values for progressively more complicated models (i.e., from(b4)). The impression I get is that, in doing so, I have pinpointed a very small region of the likelihood function where convergence is possible, and the 100 random draws are therefore unlikely to find it. However, including more random draws isn't really an option, as these discontinuous regions break the loop and it is taking me forever to even get 100 runs completed.
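
On the loop breaking specifically, one possible workaround is Stata's capture prefix, which traps the error and leaves the return code in _rc so that the remaining runs still execute. A minimal sketch, assuming the data are already in memory; the sheet name fourclass_capture, the extra "Return code" column, and the shortened indicator list are illustrative only and would need replacing with the full model:

```stata
* Sketch: keep the random-start loop running when a draw fails (e.g., r(430)).
* The sheet name and the shortened indicator list are illustrative assumptions.
putexcel set iteration_log.xlsx, sheet(fourclass_capture) modify
putexcel A1 = "Iteration" B1 = "Log Likelihood" C1 = "Converged"
putexcel D1 = "Return code"
set seed 123321

forvalues i = 1/100 {
    local j = `i' + 1

    * -capture- traps any error so the loop continues; -noisily- keeps the output
    capture noisily gsem                                        ///
        (lc_cough_m_b lc_sleep_m_b lc_memory_m_b <-, logit)     ///
        (mrc_m_revised PHQ4_final_m <-, ologit)                 ///
        (facit_score_m EQ5D_score_m <-),                        ///
        lclass(C4_ 4) startvalues(randompr, draws(1)) iterate(100)

    * Store the return code immediately, before another command overwrites _rc
    local rc = _rc

    putexcel A`j' = `i'
    putexcel D`j' = `rc'

    * Record the fit only when gsem finished without an error
    if `rc' == 0 {
        putexcel B`j' = `e(ll)'
        putexcel C`j' = `e(converged)'
    }
}
```

This only stops the loop from aborting; it does not make the failed draws any more informative, so the question of whether the model is usable still stands.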

My question is, does this mean that, despite having the best fit statistics, the model is too weakly identified to be usable, and I should just stick with simpler models that successfully converge on multiple random draws? Or is there any other way of establishing whether these models are usable?

Many thanks to everyone who has posted on LCA in the past (with particular thanks to Weiwen Ng). I think I have read every LCA post on this forum!

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(lc_cough_m_b lc_sleep_m_b lc_memory_m_b lc_concen_m_b lc_musc_pain_m_b lc_tastesmell_m_b lc_diarrhoea_m_b EQ5D_self_m_revised EQ5D_mob_m_revised) byte mrc_m_revised float(PHQ4_final_m EQ5D_act_m facit_score_m EQ5D_score_m)
0 0 0 1 0 0 0 0 0 1 1 1 41  71
0 1 0 0 0 0 0 0 0 1 3 2 31  50
0 1 1 1 1 1 0 0 0 2 0 1 29  47
0 0 0 0 0 0 0 0 0 2 0 1 50  69
1 0 1 0 1 0 0 0 0 1 0 1 49  71
0 1 1 0 1 0 0 0 0 1 2 1 45  85
0 0 0 0 0 0 0 0 0 1 0 1 44  92
0 1 0 1 0 0 1 0 0 2 0 1 43  86
1 0 1 1 1 0 0 1 1 3 2 2 22  50
0 1 0 0 0 0 0 0 0 1 0 1 48  69
0 0 0 0 0 0 0 0 0 1 0 1 52  91
0 1 1 1 0 0 0 0 0 1 1 1 50  74
1 1 1 1 1 1 1 0 0 2 1 2 13  50
1 1 0 0 1 0 0 0 0 2 0 1 49  74
0 0 0 0 0 0 0 0 0 1 0 1 51  90
0 1 1 1 1 0 0 1 1 3 1 2 36  38
0 0 0 0 0 0 0 0 0 1 0 1 50  86
0 0 0 0 0 0 0 0 0 1 0 1 45  81
0 0 0 0 0 0 0 0 0 1 0 1 52  98
1 0 0 0 1 0 0 0 0 1 0 1 45  90
1 0 0 0 1 0 0 0 0 1 0 2 41  60
0 1 1 1 1 0 1 0 0 1 0 1 42  50
0 0 0 0 1 0 0 0 0 1 0 1 48  80
0 0 0 0 0 0 0 0 0 2 0 1 47  69
0 0 1 1 1 0 0 0 0 2 1 2 38  70
0 1 0 1 1 0 1 1 1 2 1 2 16  50
0 1 1 1 1 0 1 0 0 2 1 1 21  50
0 0 0 0 0 1 0 0 0 1 0 1 49  85
0 1 0 1 0 0 0 0 0 1 1 1 36  87
0 1 1 1 1 0 0 0 0 1 0 1 45  50
0 1 0 0 1 0 0 0 0 1 0 1 43  89
0 1 1 0 0 0 0 0 0 1 0 1 49  73
0 1 0 0 0 0 0 0 0 2 0 1 49  87
0 1 0 1 1 0 0 0 0 2 0 1 45  70
0 1 0 0 0 0 0 0 0 1 1 1 44  81
0 0 0 0 0 0 0 0 0 1 0 1 52  97
1 1 1 0 1 1 1 0 0 2 1 1 35  83
0 0 0 0 1 0 0 0 1 2 0 1 49  71
0 1 0 0 0 0 0 0 0 1 0 1 50  96
0 0 1 1 1 1 1 0 1 3 0 2 14  50
1 0 0 0 1 0 0 0 1 3 0 1 44  50
0 1 0 0 1 0 0 0 0 1 0 1 49  80
0 0 0 0 1 0 0 1 0 1 0 2 46  80
0 0 0 0 0 0 0 0 0 2 0 1 48  92
0 1 1 0 1 0 0 0 0 1 0 1 50  84
0 0 0 0 0 0 0 0 0 2 3 1 49  38
0 0 0 0 1 0 0 0 0 1 0 1 52  95
0 0 0 0 0 0 0 0 0 1 0 1 48  81
0 0 0 0 0 0 0 0 0 2 0 1 50  80
0 0 0 0 1 0 1 0 0 1 2 2 38  50
0 0 0 0 0 0 0 0 0 1 0 1 46  50
0 0 1 1 1 1 1 0 0 2 1 1 44  79
0 0 1 1 0 0 0 0 0 1 0 1 45  50
0 0 0 0 1 0 0 0 0 1 0 1 51  89
1 1 1 1 1 0 1 1 1 2 0 2 10  29
0 0 0 0 1 0 0 0 0 2 0 1 42  80
1 1 1 1 0 0 0 0 1 2 1 1 21  39
0 1 0 0 1 0 0 0 0 1 0 1 51  54
1 1 0 0 0 0 0 0 0 1 1 1 46  77
0 0 0 0 0 0 0 0 0 1 0 1 52  85
0 0 0 0 0 0 0 0 0 1 0 1 52  98
1 1 1 1 0 1 1 0 1 2 1 1 43  87
0 1 0 0 0 0 0 0 0 1 1 1 52  99
0 0 0 0 0 0 0 0 0 1 0 1 42  84
0 0 0 0 0 0 0 0 0 1 0 1 48  95
0 1 0 0 0 0 0 0 1 3 0 1 49  80
0 0 0 0 0 0 0 0 0 1 0 1 47  79
1 1 0 0 1 0 0 0 1 3 0 2 38  29
0 1 0 0 0 0 0 0 0 1 0 1 43  81
0 0 0 0 1 0 0 0 0 1 0 1 44  96
1 1 1 1 1 0 0 0 0 2 1 . 27  50
0 0 0 0 0 0 0 0 0 1 1 1 45  73
0 0 0 0 0 0 0 0 0 1 0 1 51  95
0 0 0 0 0 0 0 0 0 1 0 1 48  60
0 0 0 0 0 0 1 0 0 1 0 1 50  76
0 1 0 0 0 0 0 0 0 1 0 1 47  86
0 0 0 0 0 0 0 0 0 1 0 1 48  53
1 1 0 0 1 0 0 0 0 1 1 1 40  50
0 1 0 1 1 0 1 0 0 3 1 1 41  47
0 0 0 0 0 0 0 0 0 1 0 1 50  93
0 0 0 0 0 0 0 0 0 1 0 1 48  90
0 0 0 0 0 0 0 0 0 1 0 1 41  65
0 1 0 0 0 0 0 0 0 1 1 1 46  80
0 1 0 1 0 0 0 0 0 1 1 1 44  50
0 0 0 0 0 0 0 0 0 1 0 1 42  87
1 1 1 1 1 0 1 1 1 4 3 2  5  10
1 1 0 1 1 0 0 0 0 2 1 1 28  64
0 1 0 0 0 0 0 0 0 2 1 1 51  91
0 1 0 0 1 0 0 0 0 1 0 1 51 100
0 0 0 0 0 0 0 0 0 1 0 1 52  95
0 1 1 1 1 1 0 0 0 2 2 2 27  50
1 1 1 1 1 0 0 0 1 2 0 2 32  55
1 1 1 1 1 1 0 0 0 2 0 1  6  50
0 1 0 0 1 0 0 0 0 1 0 1 49  93
0 0 0 0 0 0 0 0 0 1 0 1 52  93
0 1 0 0 0 0 0 0 0 1 0 1 51  94
0 0 0 0 0 0 0 0 0 1 0 1 51  96
0 0 0 0 0 0 0 0 0 1 0 1 52  79
1 1 1 0 1 0 0 0 1 2 0 2 47  79
0 0 0 0 0 0 0 0 0 1 0 1 51  91
end
label values lc_cough_m_b ny
label values lc_sleep_m_b ny
label values lc_memory_m_b ny
label values lc_concen_m_b ny
label values lc_musc_pain_m_b ny
label values lc_tastesmell_m_b ny
label values lc_diarrhoea_m_b ny
label def ny 0 "No", modify
label def ny 1 "Yes", modify
label values EQ5D_mob_m_revised care_revised
label values EQ5D_self_m_revised care_revised
label def care_revised 0 "No Problems", modify
label def care_revised 1 "Some Problems or Unable", modify
label values mrc_m_revised mrc_m_revised
label def mrc_m_revised 1 "1", modify
label def mrc_m_revised 2 "2", modify
label def mrc_m_revised 3 "3", modify
label def mrc_m_revised 4 "4 or 5", modify
label values PHQ4_final_m phq4_final
label def phq4_final 0 "Normal", modify
label def phq4_final 1 "Mild", modify
label def phq4_final 2 "Moderate", modify
label def phq4_final 3 "Severe", modify
label values EQ5D_act_m care
label def care 1 "No Problems", modify
label def care 2 "Some Problems", modify
