Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • gsem and partially nested design

    I have a partially nested three level model. The study randomised by sites and in each site I have coaches which just covered intervention groups. This is my model for outcome


    mixed q_6m treat q_b || site: ||///

    therapist:treat, nocons reml ///

    residuals(independent, by(treat)) ///

    dfmethod(sat)


    I have the same for cost. I want to use equation model using gsem. Do you know how can I do this? I have tried

    Gsem (cost6 <- cost_b treat M1[site] M2[therapist<site]* treat) (Y_6m <- treat Y_B M1[site] M2[therapist<site]* treat) . I don’t know if it is correct or not and I couldn’t add

    , nocons reml ///

    residuals(independent, by(treat)) ///

    dfmethod(sat)

    on it.

  • #2
    Carlo Lazzaro Nick Cox Clyde Schechter could you pleas help me with this issue?

    Comment


    • #3
      It would be very helpful if you could share a portion of your data using the dataex command (ssc search dataex if you are on an older version of Stata). gsem uses full maximum likelihood. If you prefer to use restricted maximum likelihood with the Satterthwaite df correction, then you will have to use mixed. You can, however, trick mixed to do bivariate multilevel models. Note that I'm not certain you can do a mixed version of the full model you show for q_6m. In particular, it might not be possible to allow for different residuals for the treatment groups.

      For a version of your model without that particular complexity, you would need to first make both your outcome and baseline variables have the same variable name stub but with different endings. Then you reshape your data to long:
      Code:
      gen out1 = q_6m
      gen out_b1 = q_b
      gen out2 = cost6
      gen out_b2 = cost_b
      
      * Note I assume you have a subject id called sid
      reshape long out out_b, i(sid) j(type)
      Then you can estimate the bivariate mixed model as follows:
      Code:
      mixed out i.treat##i.type out_b || site: || therapist: treat ///
           || sid: , resid(un, t(type)) reml dfmethod(sat)  // note: you may need to add noconstant as an option
      See also this thread.
      Last edited by Erik Ruzek; 06 Sep 2024, 13:53. Reason: Edited mixed code

      Comment


      • #4
        Originally posted by Erik Ruzek View Post
        It would be very helpful if you could share a portion of your data using the dataex command (ssc search dataex if you are on an older version of Stata). gsem uses full maximum likelihood. If you prefer to use restricted maximum likelihood with the Satterthwaite df correction, then you will have to use mixed. You can, however, trick mixed to do bivariate multilevel models. Note that I'm not certain you can do a mixed version of the full model you show for q_6m. In particular, it might not be possible to allow for different residuals for the treatment groups.

        For a version of your model without that particular complexity, you would need to first make both your outcome and baseline variables have the same variable name stub but with different endings. Then you reshape your data to long:
        Code:
        gen out1 = q_6m
        gen out_b1 = q_b
        gen out2 = cost6
        gen out_b2 = cost_b
        
        * Note I assume you have a subject id called sid
        reshape long out out_b, i(sid) j(type)
        Then you can estimate the bivariate mixed model as follows:
        Code:
        mixed out i.treat##i.type out_b || site: || therapist: treat ///
        || sid: , resid(un, t(type)) reml dfmethod(sat) // note: you may need to add noconstant as an option
        See also this thread.
        Thank you so much for your response Erike. I want to know how can I add "nocons reml residuals(independent, by(treat)) dfmethod(sat)" to gsem (cost6 <- cost_b treat M1[site] M2[therapist<site]* treat) (Y_6m <- treat Y_B M1[site] M2[therapist<site]* treat).

        BW
        Hiro

        Comment


        • #5
          You cannot use restricted maximum likelihood (reml) with gsem. Likewise, you cannot use Satterthwaite standard errors with gsem. To the extent that you need those estimation options, you will have to use mixed.

          Added:
          You also haven't explained why you want to use gsem. Is it solely because you want to run a multivariate model? If so, the analysis you want to do can more or less be done with mixed, as shown in post #3.
          Last edited by Erik Ruzek; 06 Sep 2024, 14:36.

          Comment


          • #6
            Originally posted by Erik Ruzek View Post
            You cannot use restricted maximum likelihood (reml) with gsem. Likewise, you cannot use Satterthwaite standard errors with gsem. To the extent that you need those estimation options, you will have to use mixed.

            Added:
            You also haven't explained why you want to use gsem. Is it solely because you want to run a multivariate model? If so, the analysis you want to do can more or less be done with mixed, as shown in post #3.
            Thank you so much Erik! I am currently deciding on the type of estimation method to use, and gsem is one of the options that should be tested. What happens if I remove the reml option? Can I run the model without reml? Additionally, I would like to know if I have correctly specified the partially nested model. Also, is it possible to specify residuals by(treat)? I received an error when trying to do so.

            Comment


            • #7
              It might be possible to specify the gsem model in such a way that you could achieve something like resid(independent, by(treat)) but it would be hard for me to figure out without having the data. Joseph Coveney might have some insight into this issue. See this thread for a somewhat similar problem, for example. With regards to the partial nesting, I'm not sure I fully understand your data. Are the coaches encoded in the model in the therapist random effect?

              Either way, you will help your cause by using -dataex- to share some representative slice of your data. This is a very complicated analysis.
              Last edited by Erik Ruzek; 06 Sep 2024, 16:05. Reason: Added link to Statalist thread

              Comment


              • #8
                Originally posted by Erik Ruzek View Post
                It might be possible to specify the gsem model in such a way that you could achieve something like resid(independent, by(treat)) but it would be hard for me to figure out without having the data. Joseph Coveney might have some insight into this issue. See this thread for a somewhat similar problem, for example. With regards to the partial nesting, I'm not sure I fully understand your data. Are the coaches encoded in the model in the therapist random effect?

                Either way, you will help your cause by using -dataex- to share some representative slice of your data. This is a very complicated analysis.
                the data is look like the below table:
                pid therapist site treat cost_6m cost_b EQ5d_b Qaly_6m
                hj012 h01 M 1 12 20 0.32 0.89
                hj013 h01 M 1 12 18 0.56 0.65
                hj014 h01 M 1 22 20 0.23 0.56
                hj015 h02 G 1 2 0 0.12 0.65
                hj016 h02 G 1 14 5 0.65 0.78
                hj017 h03 G 1 . 12 0.15 0.89
                hj018 h03 G 1 16 20 0.32 0.89
                hj019 0 M 0 6 5 0.56 0.65
                hj020 0 M 0 5 1 0.23 0.56
                hj021 0 G 0 7 5 0.12 0.65
                hj022 0 G 0 1 1 0.65 0.78
                hj023 0 G 0 3 4 0.15 0.89

                Comment


                • #9
                  Thank you. How many sites do you have?

                  This thread is highly relevant. Please read it carefully: https://www.statalist.org/forums/for...gn-and-xtmixed

                  It does not look like you have coded your data as suggested in that thread and the linked article. Specifically, Scott Baldwin says that you need to assign a unique therapist ID for each individual in the control group such that each is a cluster of 1. Personally, I would focus on getting the data and model right for the univariate case. Folks might have ideas about the gsem multivariate case.

                  Comment


                  • #10
                    Originally posted by Erik Ruzek View Post
                    Thank you. How many sites do you have?

                    This thread is highly relevant. Please read it carefully: https://www.statalist.org/forums/for...gn-and-xtmixed

                    It does not look like you have coded your data as suggested in that thread and the linked article. Specifically, Scott Baldwin says that you need to assign a unique therapist ID for each individual in the control group such that each is a cluster of 1. Personally, I would focus on getting the data and model right for the univariate case. Folks might have ideas about the gsem multivariate case.
                    I have 14 site. Thanks for sharing that thread.

                    Comment


                    • #11
                      Originally posted by Erik Ruzek View Post
                      It would be very helpful if you could share a portion of your data using the dataex command (ssc search dataex if you are on an older version of Stata). gsem uses full maximum likelihood. If you prefer to use restricted maximum likelihood with the Satterthwaite df correction, then you will have to use mixed. You can, however, trick mixed to do bivariate multilevel models. Note that I'm not certain you can do a mixed version of the full model you show for q_6m. In particular, it might not be possible to allow for different residuals for the treatment groups.

                      For a version of your model without that particular complexity, you would need to first make both your outcome and baseline variables have the same variable name stub but with different endings. Then you reshape your data to long:
                      Code:
                      gen out1 = q_6m
                      gen out_b1 = q_b
                      gen out2 = cost6
                      gen out_b2 = cost_b
                      
                      * Note I assume you have a subject id called sid
                      reshape long out out_b, i(sid) j(type)
                      Then you can estimate the bivariate mixed model as follows:
                      Code:
                      mixed out i.treat##i.type out_b || site: || therapist: treat ///
                      || sid: , resid(un, t(type)) reml dfmethod(sat) // note: you may need to add noconstant as an option
                      See also this thread.
                      Hi Erik,
                      In this regression, by including type in the residuals, I can account for the joint association of the two equations. However, the partially nested structure of my model is not being considered. My data is designed with a partially nested structure, where patients in both groups are in different sites, and in the intervention group, patients within these sites are served by different coaches.

                      When I include treatment in the residuals, I encounter an error saying "convergence not achieved." Another concern is that since sid is repeated for both cost and QALY, I’m not sure if the time variable (t) is being applied correctly in the model.
                      could you please help me?

                      BW,
                      Hiro

                      Comment


                      • #12
                        Hiro,

                        I think you will have to abandon the multivariate aspect of the analysis in order to properly accommodate the partially nested structure and you want there to be heteroskedastic residuals by treatment. If you can convince yourself that the residuals are homoscedastic for both outcomes (see the proposed test in post #7 in the thread linked earlier), then you could probably move forward with the multivariate analysis.

                        If this were my data, I would 1) get the two univariate models correctly specified and estimated, 2) test for heteroskedasticity of residuals by treatment/arm, and 3) if homoskedastic residuals are appropriate for both outcomes, then put them into the multivariate model.

                        Comment


                        • #13
                          Originally posted by Erik Ruzek View Post
                          Hiro,

                          I think you will have to abandon the multivariate aspect of the analysis in order to properly accommodate the partially nested structure and you want there to be heteroskedastic residuals by treatment. If you can convince yourself that the residuals are homoscedastic for both outcomes (see the proposed test in post #7 in the thread linked earlier), then you could probably move forward with the multivariate analysis.

                          If this were my data, I would 1) get the two univariate models correctly specified and estimated, 2) test for heteroskedasticity of residuals by treatment/arm, and 3) if homoskedastic residuals are appropriate for both outcomes, then put them into the multivariate model.
                          Thank you so much for your response. What happens if homoskedastic residuals are appropriate for only one of the outcomes?

                          Comment


                          • #14
                            Erik Ruzek
                            Another question: In the recommended thread, you mentioned, "The distinction between the fully nested model and partially nested model is separate from the distinction between homo- and heteroscedastic (in our paper, anyway). The former has to do with the random effects, and the latter has to do with the residuals. The -residual- option in -mixed- controls the residuals; therefore, just using that option doesn't affect whether the model is fully or partially nested."

                            Does this mean my estimation doesn't need to include the variable related to the partially nested design in the residuals?

                            Comment


                            • #15
                              Hiro,

                              The residual specification in this model is not related to the partially nested design. It relaxes the testable assumption that a single residual parameter is appropriate for your model. You have partial nesting whether or not you estimate separate residuals by treatment status.

                              Comment

                              Working...
                              X