I'm trying to do a latent class analysis (LCA) for the first time. I have read the Stata manual, several papers about LCA, and a number of threads here. I think I now understand what the main options do and why they are needed, and therefore how to run the analysis in Stata. What I have done is:
Code:
gsem (v1 v2 v3 <- , ologit) ///
     (v4 v5 v6 <- , logit)  ///
     (v7 <- )               ///
     [pweight=weight], lclass(C 2) lcinvariant(none)
This works for 2 and 3 classes; with 4 classes I hit convergence issues.
What I think I need to do next is add the nonrtolerance option, review the output, and add constraints wherever an answer category is perfectly predicted. I know there is a risk of nonrtolerance finding a local rather than the global maximum, so for each number of classes I need to run the code ~100 times with different starting values, saving the log likelihood (LL) from each run for later comparison. For each number of classes, I will check that most runs reach the same, highest LL. The last step will be to choose the "right" number of classes using the BIC.
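To make the multiple-starting-values step concrete, this is roughly what I have in mind for 4 classes. It is an untested sketch rather than working code: the class variable name C, the seeds 1 to 100, iterate(500) and the results file name lca4_runs are all my own assumptions, and I have written the weight before the comma and the class option as lclass(C #), which is how I read the gsem syntax diagram.

```stata
* Sketch only: seeds 1-100, iterate(500) and the file name lca4_runs are assumptions
tempname runs
postfile `runs' int seed double ll byte converged using lca4_runs, replace

forvalues s = 1/100 {
    * one random set of starting class assignments per run, reproducible via the seed
    capture gsem (v1 v2 v3 <- , ologit)                       ///
                 (v4 v5 v6 <- , logit)                        ///
                 (v7 <- )                                     ///
                 [pweight=weight],                            ///
                 lclass(C 4) lcinvariant(none)                ///
                 startvalues(randomid, draws(1) seed(`s'))    ///
                 nonrtolerance iterate(500)
    * keep the log likelihood only if gsem ran without an error
    if _rc == 0 {
        post `runs' (`s') (e(ll)) (e(converged))
    }
}
postclose `runs'

* inspect the runs, highest log likelihood first, without losing the data in memory
preserve
use lca4_runs, clear
gsort -ll
list in 1/10
restore
```

The idea is that if the top rows of the list share the same LL (to a few decimal places) and come from many different seeds, I would treat that as the highest maximum found, and then repeat the loop for each number of classes.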
My questions:
- Are my proposed next steps correct?
- Are there any rules about adding constraints? At some point, doesn't the addition of constraints mean that I am creating the groups, rather than them emerging from the data?
- If I add constraints, will I be able to remove the nonrtolerance option? Is that something to aim for?
- Is there a rule of thumb for determining if enough starting values find the highest LL?
- Assuming I'm satisfied that I have found the global maximum when I compare the LLs, how do I then "use" that solution?