I need to estimate a 3-latent-class FMM regression (with a continuous dependent variable) on a data set that includes multiple observations within cases.
A hypothetical example of the data structure is:
Code:
caseid  obsnum  score  predvar1  predvar2  predvar3  covariate1  covariate2
ID01         1     55         3        10        11          30           0
ID01         2     72         8         9         6          30           0
ID01         3     36        10        11         2          30           0
.            .      .         .         .         .           .           .
.            .      .         .         .         .           .           .
.            .      .         .         .         .           .           .
ID01        46     89         7         4        14          30           0
ID01        47     78        12         2         9          30           0
ID01        48     45         5         6         6          30           0
ID02         1     49         3        10        11          41           1
ID02         2     68         8         9         6          41           1
ID02         3     59        10        11         2          41           1
.            .      .         .         .         .           .           .
.            .      .         .         .         .           .           .
.            .      .         .         .         .           .           .
ID02        46     61         7         4        14          41           1
ID02        47     96        12         2         9          41           1
ID02        48     40         5         6         6          41           1
.            .      .         .         .         .           .           .
.            .      .         .         .         .           .           .
.            .      .         .         .         .           .           .
ID300       48     83         5         6         6          27           0
The dependent variable, score, is continuous and varies across observations (obsnum) within cases (caseid).
The predictor variables, predvar1-predvar3, are also continuous and also vary across observations within cases.
One covariate, covariate1, is continuous, and the other, covariate2, is binary.
Both covariates are constant across observations within cases and vary only across cases.
There are 300 cases with 48 observations within each case, for a total of 14,400 observations.
If I did not have multiple observations within cases, I would structure the FMM regression as:
Code:
fmm 3, lcprob(covariate1 covariate2): regress score predvar1 predvar2 predvar3
I could modify the code to address the correlated-response problem due to the multiple observations within cases by adding robust standard errors clustered on caseid:
Code:
fmm 3, lcprob(covariate1 covariate2) vce(cluster caseid): regress score predvar1 predvar2 predvar3
However, that would still allow me to predict posterior class membership probabilities only at the observation level rather than at the case level, and I am not sure how I could then classify the cases into latent classes, since the posterior class membership probabilities will vary within cases.
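The within-case variation in the posterior probabilities can be checked directly after fitting the clustered model. The following is only a sketch: pr1-pr3 and sd_pr1-sd_pr3 are variable names I made up for illustration, and it relies on the classposteriorpr option of -predict- after -fmm-.

```stata
* Sketch only: pr1-pr3 and sd_pr1-sd_pr3 are hypothetical variable names.
fmm 3, lcprob(covariate1 covariate2) vce(cluster caseid): regress score predvar1 predvar2 predvar3

* Posterior class membership probabilities, one set per observation
predict pr*, classposteriorpr

* Within-case standard deviation of each posterior probability;
* a value above 0 means the probability differs across observations of a case
forvalues k = 1/3 {
    bysort caseid: egen sd_pr`k' = sd(pr`k')
}
summarize sd_pr1 sd_pr2 sd_pr3

* Inspect the first few observations
sort caseid obsnum
list caseid obsnum pr1 pr2 pr3 in 1/5
```

If the standard deviations are not all zero, the same case receives different class probabilities on different observations, which is what prevents a direct case-level classification.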
I would appreciate any advice about how to structure the syntax for this 3-latent-class FMM regression in order to classify cases into the latent classes identified with FMM. Can this be done with -fmm- or -gsem- syntax?
(I'm familiar with and have used Pacifico's lclogit, from SSC, to conduct latent class conditional logistic regression for binary choice variables. My current project is a conjoint analysis where the choice variable is continuous, and I'm hoping I can use -fmm- or -gsem- for the analysis.)
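To be concrete about what I mean by classifying cases: if class membership is assumed to be fixed within a case, as in lclogit, I believe the case-level posterior probability would combine all 48 observations of a case as shown below. The notation is my own shorthand, not Stata output: $\pi_k(z_i)$ is the class-$k$ membership probability given the covariates $z_i$ of case $i$, and $f_k$ is the class-$k$ normal density of score $y_{it}$ given the predictors $x_{it}$.

```latex
\Pr(c_i = k \mid y_i, x_i, z_i)
  = \frac{\pi_k(z_i) \prod_{t=1}^{48} f_k(y_{it} \mid x_{it})}
         {\sum_{j=1}^{3} \pi_j(z_i) \prod_{t=1}^{48} f_j(y_{it} \mid x_{it})}
```

Each case would then be assigned to the class with the largest such probability.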
Red Owl
Stata/IC 15.1, Windows 10 (64-bit)