
  • LCA - mix of categorical and continuous indicators

    Dear Stata Users.
    Is there a way in Stata to include a mix of categorical and continuous indicators (i.e. observed variables) in Latent Class (/Latent Profile) Analysis?

    Say I have 3 indicators: education, income, occupation.
    If they were all categorical, I would do something like this (for example, for a model with 3 latent classes):

    gsem (education income occupation<-,mlogit) ///
    , nocapslatent logit lclass(C 3) ///
    startvalues(randomid, draws(15) seed(123321)) ///
    em(iter(5)) ///
    nodvheader ///
    nonrtolerance
    estat lcprob


    But what if occupation is a continuous scale?

    Any suggestion is greatly appreciated.
    Thanks
    Anna

  • #2
    Easily.

    Code:
    gsem (education income <-,mlogit) (occupation <- _cons, family(gaussian)) ///
    , nocapslatent lclass(C 3) ///
    startvalues(randomid, draws(15) seed(123321)) ///
    em(iter(5)) ///
    nodvheader ///
    nonrtolerance
    estat lcprob
    Some minor cautions. You're treating education and income as unordered categorical indicators, but it seems more likely that they're coded as ordered categories. If so, think about using ologit. I suspect the same is true of occupation, and it is probably best practice not to treat an ordered categorical variable as a continuous one.
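
    For illustration, an ordered version of the model above might look like the sketch below. It assumes education and income are coded as ordered categories and that occupation really is a continuous scale; none of that can be confirmed without seeing the data, so adapt it as needed.

    ```stata
    * Sketch only: assumes education and income are ordered categorical
    * and occupation is a genuinely continuous scale.
    gsem (education income <-, ologit) (occupation <- _cons, family(gaussian)) ///
    , nocapslatent lclass(C 3) ///
    startvalues(randomid, draws(15) seed(123321)) ///
    em(iter(5)) ///
    nodvheader ///
    nonrtolerance
    * Class proportions, then class-specific means of each indicator
    estat lcprob
    estat lcmean
    ```

    With ologit, each class gets its own set of cutpoints rather than a full set of multinomial intercepts, so the model is usually more parsimonious.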

    I note that you are using the nonrtolerance option. You presumably have seen the related discussions about that option on the forum. Please do note that it's highly recommended to re-fit your final model without the nonrtolerance option - and by "highly recommended," I mean that I would personally do this.
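
    One way to do that re-fit, as a rough sketch: save the parameter vector from the nonrtolerance run and pass it as starting values through the from() option, this time leaving nonrtolerance out. The matrix name b is arbitrary, and the model is assumed to be the one shown above.

    ```stata
    * Run this right after the model fitted with nonrtolerance.
    matrix b = e(b)

    * Same model, started from the saved estimates, without nonrtolerance
    gsem (education income <-, mlogit) (occupation <- _cons, family(gaussian)) ///
    , nocapslatent lclass(C 3) ///
    from(b) ///
    nodvheader

    * 1 means Stata declared convergence under the default criteria
    display e(converged)
    estat lcprob
    ```

    If this second run does not converge, or the estimates move noticeably, I would treat the solution with suspicion.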

    Last, you're using 15 draws for start values. As you increase the number of latent classes, I'd increase the number of random draws. I'd read Kathryn Masyn's chapter on latent class/profile analysis as cited in our SEM manual. I think that she recommends as many as 100 random draws. In my experience, this might not be necessary with 3-4 classes. However, I'd definitely recommend it starting with 5 latent classes.
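
    As a sketch of what that could look like when you fit a range of class solutions, the loop below uses 15 draws for 2 to 4 classes and 100 draws from 5 classes on. The 2/5 range, the cutoff in cond(), and the stored names c2 to c5 are only illustrative choices, and with more classes some of these models may well fail to converge.

    ```stata
    * Sketch: fit 2- to 5-class models, using more random draws for larger models.
    forvalues k = 2/5 {
        * illustrative rule: 100 draws from 5 classes on, 15 otherwise
        local draws = cond(`k' >= 5, 100, 15)
        gsem (education income <-, mlogit) (occupation <- _cons, family(gaussian)) ///
        , nocapslatent lclass(C `k') ///
        startvalues(randomid, draws(`draws') seed(123321)) ///
        em(iter(5)) ///
        nodvheader ///
        nonrtolerance
        estimates store c`k'
    }
    * Compare log likelihoods and information criteria across solutions
    estimates stats c2 c3 c4 c5
    ```

    More draws cost time, but they reduce the chance of settling on a local maximum.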
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
