
  • Latent class analysis - one last attempt before abandoning - initial values not feasible

    I have 5 ordinal variables, each representing a patient’s ability to do a separate task, scored from 1 to 5
    (5 = able to, 1 = not able to).
    These tasks were measured pre-procedure and post-procedure.
    (Each task, e.g. going up stairs, is represented as qx in the variables below; each qx is a task.)

    Variables are: preopq1, postopq1, preopq2, postopq2, preopq3, postopq3, preopq4, postopq4, preopq5, postopq5

    Aim: To identify early improvers and late improvers

    Problem:
    1) For every qx I get the following error: -initial values not feasible-
    I have tried the following:

    Code:
    gsem(postopq1 preopq1 <-, ologit), lclass(C 3)
    Using the Stata manual (https://www.stata.com/manuals/semintro12.pdf), I tried the following modifications (none worked):

    Code:
    // Limited the iterations; I selected 12 because that is when I saw the values change
    gsem (postopq1 preopq1 <-, ologit), iterate(12) lclass(C 3)

    // Does not produce output at iterate(12) but instead proceeds to
    // initial values not feasible

    // Tried the following integration methods:

    gsem (postopq1 preopq1 <-, ologit), intmethod(ghermite) lclass(C 3)
    gsem (postopq1 preopq1 <-, ologit), intmethod(mvaghermite) lclass(C 3)
    gsem (postopq1 preopq1 <-, ologit), intmethod(laplace) lclass(C 3)
    gsem (postopq1 preopq1 <-, ologit), intmethod(mcaghermite) lclass(C 3)

    gsem (postopq1 preopq1 <-, ologit), lclass(C 3) vsquish nodvheader noheader nolog startvalues(randomid, draws(20)) emopts(iterate(10))

    // error: cannot compute an improvement - discontinuous region encountered
    I have tried all of the above with each pair of task variables:
    postopq2 preopq2
    postopq3 preopq3
    postopq4 preopq4
    postopq5 preopq5

    The only approach that worked was combining all the task variables in one model:

    Code:
    gsem (postopq5 postopq4 postopq3 postopq2 postopq1 <-, ologit) (C <- preopq5 preopq4 preopq3 preopq2 preopq1), lclass(C 3)
    However, this took a total of 5 hours to run, which suggests the model isn’t good enough, and there is no clearly distinct class (see attachment).
    -lcprob-

    -lcmean- had generated no output after a total of 13 hours (still running).

    I tried to run it with C 5, but it was too slow: I kept it going for a whole night without convergence, so it kept producing iterations without any output and I pressed Break.

    I also have tried reducing my dataset:

    Code:
    sample 10, by(procedure)
    gsem (postopq1 preopq1 <-, ologit), lclass(C 3)

    // I also tried:

    contract preopq1 preopq3 preopq4 preopq5 postopq1 postopq2 postopq3 postopq4 postopq5, freq(fw)
    gsem (postopq1 preopq1 <-, ologit), lclass(C 3)

    // initial values not feasible
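
    A minimal sketch of two checks that could be tried before dropping the two-variable models, assuming the variable names from this post. The seed, the number of draws, and the EM iteration count are arbitrary illustrative values, not recommendations, and nothing here is guaranteed to cure the error.

    ```stata
    * Look for empty or very sparse response combinations, which can make
    * class-specific cutpoints hard to start from
    tabulate preopq1 postopq1, missing

    * Retry with several random starting draws and a fixed seed so the run
    * is reproducible (12345, 20 draws, and 20 EM iterations are arbitrary)
    gsem (postopq1 preopq1 <-, ologit), lclass(C 3) ///
        startvalues(randomid, draws(20) seed(12345)) ///
        emopts(iterate(20))
    ```

    Fixing the seed means a failed run can be repeated exactly, which makes it easier to tell a starting-value problem from a specification problem.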


    [Attached image: Capture.PNG]


  • #2
    Hi Rose. I think the smaller reduced models are probably a dead end. I very much doubt your convergence issues are because you have too many variables. It's more likely you don't have enough information or there is a problem with the model specification. Have you tried including other data in the model? I might start with the model that converges and try to refine it into a better model by including more data and reworking/reconsidering the relationships, fixing certain values based on the initial results, etcetera. I would not take the 5 hour convergence as the red flag you seem to think it is. SEM models are notoriously difficult to get to converge. They take a fair amount of finessing and revision to get right. You're not estimating a basic linear regression here. Your struggles are normal.

    You might consider a principal components analysis to try to confirm your classes make sense.
    Last edited by Daniel Schaefer; 12 Feb 2024, 20:59.



    • #3
      Hello, I've actually managed to solve the problem by reworking the relationships:

      Code:
      gsem (score1 <- preopscore1, ologit) ///
           (score2 <- preopscore2, ologit) ///
           (score3 <- preopscore3, ologit) ///
           (score4 <- preopscore4, ologit) ///
           (C <- procedure) [fw=fw], lclass(C 2)
      My score* variables represent postopq*.
      I suppose it's to be expected that the more classes one has, the more the AIC value decreases, but one should pick the model that makes sense using values from lcprob.



      • #4
        I've actually managed to solve the problem by reworking the relationships
        I'm not totally sure what you're doing here since this is fairly different from where you started, but I'm glad to hear you've found a model that seems to work for you. It looks like C is entirely determined by procedure. I'm not seeing any direct relationship between scores and C. Is that intentional?

        I suppose it's to be expected that the more classes one has, the more the AIC value decreases
        Well, not universally, but you are right that AIC is sensitive to model parsimony. Edit: Oh, decreases. Yes AIC is also sensitive to the amount of information explained by the model. If you want to make sure you are as parsimonious as possible, check the BIC as well.

        one should pick the model that makes sense using values from lcprob
        Yes, or at least, you should try to be holistic. Consider the overall F test for the model as well as the AIC and BIC, consider whether the classes make sense theoretically, and so on. It is not necessarily easy to reduce criteria for the right model down to a single statistic, or even a few statistics.
        Last edited by Daniel Schaefer; 13 Feb 2024, 13:43.
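
        As a rough sketch of the comparison discussed above, one could fit the same specification with 2 and 3 classes, store each fit, and put AIC and BIC side by side. This assumes the score*, preopscore*, procedure, and fw variables from post #3; the stored names c2 and c3 are hypothetical labels chosen for illustration.

        ```stata
        * Fit the 2-class model and store it
        gsem (score1 <- preopscore1, ologit) ///
             (score2 <- preopscore2, ologit) ///
             (score3 <- preopscore3, ologit) ///
             (score4 <- preopscore4, ologit) ///
             (C <- procedure) [fw=fw], lclass(C 2)
        estimates store c2

        * Fit the 3-class model and store it
        gsem (score1 <- preopscore1, ologit) ///
             (score2 <- preopscore2, ologit) ///
             (score3 <- preopscore3, ologit) ///
             (score4 <- preopscore4, ologit) ///
             (C <- procedure) [fw=fw], lclass(C 3)
        estimates store c3

        * AIC and BIC for both fits in one table (smaller is better)
        estimates stats c2 c3

        * Class proportions and class-specific marginal means for the
        * most recently fitted model (here, the 3-class one)
        estat lcprob
        estat lcmean
        ```

        The information criteria only rank the fits; whether the classes are interpretable still has to be judged from the lcprob and lcmean output.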



        • #5
          Hi Daniel,

          Well, I can seem to come up with a hypothesis for any model, which makes it frustrating to validate which one is right.

          My question: Is the probability of being an early improver or a late improver different between treatments?

          Code:
          gsem (score1 <- preopscore1, ologit) ///
               (score2 <- preopscore2, ologit) ///
               (score3 <- preopscore3, ologit) ///
               (score4 <- preopscore4, ologit) ///
               (C <- procedure) [fw=fw], lclass(C 2)

          Here, class depends on the procedure the patient has, and each postop score is related to its preop score. In theory, yes, this hypothesis works.

          Here, the class the patient lies in depends on the procedure and the preop scores q1-q4:

          Code:
          gsem (score1 score2 score3 score4 <-, ologit) (C <- i.procedure preopscore1 preopscore2 preopscore3 preopscore4) [fw=count], lclass(C 2)

          As someone experienced in this field, what would you advise using?

          And if I could be greedy, I’ve tried plotting the graph of probabilities using margins.

          Any insight?

          I would have liked to see the categories of score1 (ordinal variable, 1-5) on the x-axis by procedure; instead, when I used -margins-, I got Class 1 and Class 2 by procedure, with the respective probabilities on the y-axis.

          Code:
          margins i.procedure, at(score1) predict(classpr class(1)) predict(classpr class(2))
          marginsplot, by(procedure)



          • #6
            Sorry, I'm not clear on what you are asking here.

            what would you advise using?
            Are you asking which model I think you should use? Given your research question and what I've seen so far, this one makes the most sense to me:

            Code:
            gsem (score1 score2 score3 score4 <-, ologit) (C <- i.procedure preopscore1 preopscore2 preopscore3 preopscore4) [fw=count], lclass(C 2)
            But that assumes your postop scores each measure something caused by the early/late recovery classes. I don't necessarily think the first model is wrong either, just unexpected. I'm not in the weeds on this, you are, so you're going to have to use your own judgement here.

            And if I could be greedy, I’ve tried plotting the graph of probabilities using margins. Any insight?
            Are you saying there is something wrong with your code here? Are you getting an error, and if so, what? This might be more appropriate as another question in a new thread.
            Last edited by Daniel Schaefer; 13 Feb 2024, 15:23.
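
            For the research question (whether class membership differs between treatments), one possible way to look at it after fitting the model above is sketched here. It assumes the model was fitted with lclass(C 2) and i.procedure in the class equation, and that count is the frequency-weight variable; cpost is a hypothetical stub name. It does not address putting the score1 categories on the x-axis.

            ```stata
            * Posterior probability of membership in each class;
            * creates cpost1 and cpost2, one variable per class
            predict cpost*, classposteriorpr

            * Mean posterior class probabilities within each procedure,
            * weighting each contracted row by its frequency
            tabstat cpost1 cpost2 [fw=count], by(procedure) statistics(mean)

            * Model-based class probabilities by procedure, then a plot
            margins procedure, predict(classpr class(1)) predict(classpr class(2))
            marginsplot
            ```

            With two classes the two probabilities sum to one within each procedure, so plotting class 1 alone would carry the same information.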

