Latent class regression is an extension of latent class or latent profile analysis. Say you believe that class membership is influenced by some individual characteristic, but that characteristic is not itself a manifestation of the latent class. Example 50g of the SEM manual fits an LCA to 4 indicators and essentially finds that 27.9% of respondents were inclined to put society's needs above their own, and that 72.1% were selfish. Say we had gender in the data and wanted to see the effect of gender on the latent classes; we could do so through a multinomial regression. You could simply assign each respondent to their modal (most likely) class and tabulate gender against that assignment, but that would ignore the classification uncertainty, i.e. we aren't always sure which latent class a person belongs to.
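To make the classification-uncertainty point concrete, here is a minimal Python sketch. The six posterior probabilities and the `female` indicator are invented for illustration and are not taken from the manual data. It compares the share of each gender placed in class 1 under hard modal assignment with the share obtained by averaging the posterior probabilities.

```python
# Hypothetical posterior probabilities of class 1 for six respondents
# (made-up numbers, not from the manual's dataset).
post_class1 = [0.95, 0.55, 0.60, 0.10, 0.45, 0.52]
female = [1, 1, 1, 0, 0, 0]  # hypothetical gender indicator


def share_class1(weights, group, value):
    """Mean of `weights` among respondents whose `group` equals `value`."""
    picked = [w for w, g in zip(weights, group) if g == value]
    return sum(picked) / len(picked)


# Modal assignment: treat each person as belonging outright to the likelier class.
modal_class1 = [1.0 if p > 0.5 else 0.0 for p in post_class1]

for value in (1, 0):
    modal = share_class1(modal_class1, female, value)
    weighted = share_class1(post_class1, female, value)
    print(f"female={value}: modal share={modal:.2f}, "
          f"posterior-weighted share={weighted:.2f}")
```

When many posterior probabilities sit near 0.5, the two shares diverge, and that gap is the information a modal-class tabulation throws away.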
Now, I'm going to switch to the data for example 52, since that dataset has an additional covariate. In example 54, we fit a 3-class model to the data. The 3 classes identified appear to correspond to people with overt diabetes, chemical diabetes (I think this could be described as pre-diabetes), and normal people. Those should be classes 3, 2, and 1, respectively, if you run the model; I don't believe you have to set any particular random seed. That is the same order in which they appear in the manual example.
The example data has patient relative weight included. It appears to be a continuous variable, with a mean of 0.9, SD of 0.129. You might guess that higher relative weight should be associated with higher odds of chemical diabetes or overt diabetes. To use relative weight in a latent class regression, this is the syntax you'd use. I'm including the output from the multinomial regression part:
That was what I expected. But, say I wanted to use margins to, for example, calculate the marginal effect of relative weight on the probability of each outcome. I am having a lot of trouble finding the right syntax. -estat lcmean- and -estat lcprob- don't produce information on relative weight; they produce the same output as with the first latent profile syntax. For the record, I'm using Stata 15.1.
The gsem postestimation entry on margins says that -classpr- is an admissible statistic for margins. It also says that "classpr defaults to the first latent class if option class() is not specified." However, this doesn't appear to work as expected.
Neither of the above looks like a plausible change in predicted probability to me. The syntax from the multinomial logit postestimation entry doesn't work for me either. This is doubly odd, because this syntax was referenced in an earlier discussion on how to produce a profile plot after an LCA:
Any thoughts, or am I simply misinterpreting something?
Code:
use http://www.stata-press.com/data/r15/gsem_lca2
gsem (glucose insulin sspg <- _cons), lclass(C 3) lcinvariant(none) covstructure(e._OEn, unstructured)
Code:
gsem (glucose insulin sspg <- _cons) (C <- relwgt), lclass(C 3) lcinvariant(none) covstructure(e._OEn, unstructured)

Output:
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.C          |  (base outcome)
-------------+----------------------------------------------------------------
2.C          |
      relwgt |   14.03413   2.819101     4.98   0.000     8.508794    19.55947
       _cons |  -14.50264   2.864154    -5.06   0.000    -20.11628   -8.889005
-------------+----------------------------------------------------------------
3.C          |
      relwgt |   5.186345   2.045551     2.54   0.011     1.177138    9.195552
       _cons |  -5.329615   1.930139    -2.76   0.006    -9.112617   -1.546613
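To see what these coefficients imply on the probability scale, here is a small Python sketch. It assumes the class-membership part is a standard multinomial logit with class 1 as the base outcome, which is how the output is laid out. The relwgt values in the loop are illustrative points around the reported mean of 0.9, not values taken from the data.

```python
import math

# Coefficients copied from the gsem output above; class 1 is the base outcome.
# Mapping is class -> (relwgt slope, _cons).
COEF = {2: (14.03413, -14.50264), 3: (5.186345, -5.329615)}


def class_probs(relwgt):
    """Multinomial-logit class probabilities at a given relative weight."""
    linpred = {1: 0.0}  # the base class has a linear predictor of zero
    for k, (slope, cons) in COEF.items():
        linpred[k] = cons + slope * relwgt
    denom = sum(math.exp(v) for v in linpred.values())
    return {k: math.exp(v) / denom for k, v in linpred.items()}


# Illustrative relative weights (assumed for this sketch).
for w in (0.8, 0.9, 1.0, 1.1):
    p = class_probs(w)
    print(w, {k: round(v, 3) for k, v in sorted(p.items())})
```

Under that assumption, the probability of class 1 falls and the probability of class 2 rises as relative weight increases, which matches the signs of the coefficients.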
Code:
margins, dydx(relwgt) predict(classpr)

Expression   : Predicted probability (1.C), predict(classpr)
dy/dx w.r.t. : relwgt

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      relwgt |  -1.640623   .2251536    -7.29   0.000    -2.081916    -1.19933

margins, dydx(relwgt) predict(classpr class(2))

Expression   : Predicted probability (2.C), predict(classpr class(2))
dy/dx w.r.t. : relwgt

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      relwgt |   1.672888   .2361939     7.08   0.000     1.209956    2.135819
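As a rough sanity check on the scale of these numbers, the sketch below applies the textbook multinomial-logit derivative, dP_k/dx = P_k * (b_k - sum_j P_j * b_j), to the coefficients reported earlier. It evaluates the derivative at a single point (relwgt = 0.9, the quoted mean), whereas margins averages over the sample, so the figures need not match the output above. The rescaling by the quoted SD of 0.129 is only an interpretive aid.

```python
import math

# Class -> (relwgt slope, _cons); class 1 is the base outcome, so both are zero.
COEF = {1: (0.0, 0.0), 2: (14.03413, -14.50264), 3: (5.186345, -5.329615)}


def class_probs(relwgt):
    """Multinomial-logit class probabilities at a given relative weight."""
    expo = {k: math.exp(cons + slope * relwgt) for k, (slope, cons) in COEF.items()}
    total = sum(expo.values())
    return {k: v / total for k, v in expo.items()}


def marginal_effects(relwgt):
    """dP(class k)/d(relwgt) = p_k * (b_k - sum_j p_j * b_j)."""
    p = class_probs(relwgt)
    avg_slope = sum(p[k] * COEF[k][0] for k in COEF)
    return {k: p[k] * (COEF[k][0] - avg_slope) for k in COEF}


SD_RELWGT = 0.129  # SD of relwgt as quoted in the post
me = marginal_effects(0.9)
for k in sorted(me):
    print(f"class {k}: dy/dx = {me[k]:.3f}, per SD = {me[k] * SD_RELWGT:.3f}")
```

Because dy/dx is expressed per one-unit change in relwgt, and one unit is nearly eight standard deviations of this variable, a derivative larger than 1 in absolute value does not by itself imply an impossible change in probability.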
Code:
margins, dydx(*) predict(outcome(1)) predict(outcome(2)) predict(outcome(3))
invalid outcome() option; depvar 1 not found
r(198);