Dear Statalist users,
I initially created a post here, where I was having difficulties understanding the basics of LPA syntax in Stata. Following the advice there, I was able to replicate Masyn's (2013) LPA using startvalues(randompr, draws(5) seed(15)). I applied the same starting values uniformly to models with 1 to 6 classes under 4 different model restrictions. The resulting BIC statistics are in the first table below.
In that table, DRE stands for "discontinuous region encountered".
I noticed what looks like erratic behaviour in the BIC values for the class-invariant, unrestricted model (column four): for example, the BIC rises to 5731.829 at 5 classes and then drops to 5259.08 at 6. This is probably due to the stringent assumption of class invariance. I've also noticed that as I increase the number of classes, Stata struggles quite a bit to produce results for the same model specification.
Now, comment #7 in the same post recommends using startvalues(randompr, draws(50) seed(15)) emopts(iterate(10)) once one reaches 5+ latent classes. I applied these settings uniformly to the same 1- to 6-class models under the 4 different restrictions. The results are in the second table below.
At this point, I am very confused about when to use one set of starting values rather than another. Every time I use a different set, my class profiles change markedly, and I am trying to avoid the trap of choosing the starting values that best fit my research expectations.
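For concreteness, here is a minimal sketch of the two calls being compared, shown for the 5-class model only. The indicator names y1-y4 and the stored-estimate names are placeholders I made up for illustration, and the options that impose the four variance/covariance restrictions are left out, so this is not my exact code.

```stata
* Sketch only: y1-y4 are placeholder indicator names, not my actual variables.
* Other class counts only change the number in lclass(C #).

* (a) Starting values used to replicate Masyn (2013)
gsem (y1 y2 y3 y4 <- _cons), lclass(C 5) ///
    startvalues(randompr, draws(5) seed(15))
estimates store c5_draws5

* (b) Starting values suggested in comment #7 for 5+ classes
gsem (y1 y2 y3 y4 <- _cons), lclass(C 5) ///
    startvalues(randompr, draws(50) seed(15)) emopts(iterate(10))
estimates store c5_draws50

* Log likelihood, AIC, and BIC for both fits, side by side
estimates stats c5_draws5 c5_draws50
```

Because both calls fit the same model to the same data, estimates stats lets the two start-value settings be compared directly on log likelihood and BIC.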
I'm leaning towards using Masyn's starting values, as in my first table. It just seems like a standard I can follow. But if anyone has some insights on this topic, I would be very grateful to discuss. Many thanks.
P.S. I am aware of the "gsem estimation options" document from the Stata manual. Unfortunately, I could not solve my problem after reading it.
BIC with startvalues(randompr, draws(5) seed(15)):
Classes | Class-invariant, diagonal | Class-varying, diagonal | Class-invariant, unrestricted | Class-varying, unrestricted |
1 | 6536.391 | 6536.391 | 5726.355 | 5726.355 |
2 | 6044.384 | 5982.513 | DRE | 5648.8 |
3 | 5923.718 | 5917.452 | 5563.118 | 5620.018 |
4 | 5915.317 | 5820.818 | 5587.027 | 5741.81 |
5 | 5898.285 | 5838.543 | 5731.829 | 5756.148 |
6 | 5843.54 | 5817.436 | 5259.08 | 5927.461 |
BIC with startvalues(randompr, draws(50) seed(15)) emopts(iterate(10)):
Classes | Class-invariant, diagonal | Class-varying, diagonal | Class-invariant, unrestricted | Class-varying, unrestricted |
1 | 6536.391 | 6536.391 | 5726.355 | 5726.355 |
2 | 6044.384 | 5982.513 | 5677.851 | 5648.8 |
3 | 5923.718 | 5932.882 | 5698.263 | 5620.018 |
4 | 5915.317 | 5820.818 | 5715.92 | 5773.844 |
5 | 5898.285 | 5846.176 | 5750.381 | 5780.88 |
6 | 5843.54 | 5818.873 | 5772.309 | 5924.519 |