
  • Choosing appropriate statistical test/model for Before and After studies

    Hello Stata users.

    We recently completed a one-year prospective study of retention and viral load suppression among children and adolescents in HIV care. The participants were children and adolescents receiving ART at two separate sites. We now want to see whether the interventions we implemented improved our two major outcomes: retention (1 = "Retained", 0 = "Not retained") and viral load suppression (1 = "Suppressed", 0 = "Not suppressed"). We also collected data on some independent variables, such as ARV days dispensed, adherence scores, OVC enrollment status, tuberculosis status, and ARV regimen line, among others, both before and after the 12-month study.

    Our challenge now is to choose an appropriate statistical model/test to determine whether our interventions significantly improved retention and viral load suppression.

    Kindly suggest an appropriate model we can adopt and, if possible, how to implement it.

    Thank you

  • #2
    I would complement linear probability models with a logit model. Include all the exogenous covariates you can. You have cross-sectional data, right? You may want to use HC3 standard errors for conservative inference.



    • #3
      Thank you, Maxence Morlet. What's your view on analysing this data in wide vs. long format? That's my next challenge.



      • #4
        Upon re-reading your post, it seems you might have panel data, in which case I would go long and then -xtset- the data.
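
For readers following along, a minimal sketch of the wide-to-long step. It assumes a wide layout with hypothetical variables retained1/retained2 and suppressed1/suppressed2 (suffix 1 = before, 2 = after) and a patient identifier id. The three rows are toy values made up for illustration, not study data.

```stata
* Toy wide-format data: one row per patient (hypothetical names and values)
clear
input id retained1 retained2 suppressed1 suppressed2
1 1 1 0 1
2 0 1 0 0
3 1 1 1 1
end

* Wide -> long: one row per patient per period; the numeric suffix becomes time
reshape long retained suppressed, i(id) j(time)

* Declare the panel (id) and time variables for the -xt- commands
xtset id time
list, sepby(id)
```

After the reshape there are two rows per patient, and time takes the values 1 and 2.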



        • #5
          You are correct; I have panel data, which I have already reshaped to long. To bring us both up to speed, I would like to reiterate that our outcomes are retention (1-retained; 0-not retained) and suppression (1-suppressed; 0-not suppressed). Some of the I.Vs include 1. sex (1-female; 2-male), 2. DSDM approach (1-FBIM; 2-FBG; 3-FTDR; 4-CCLAD; 5-CDDP), 3. disclosure status (1-disclosed; 2-not disclosed), and 4. school-going status (1-school going; 2-not school going). The time variable is time (1-before; 2-after). These are just a few of them. Also note that some I.Vs we collected BEFORE INTERVENTION were dropped from the AFTER INTERVENTION dataset, and vice versa.

          Lastly, although we focused on children and adolescents active in HIV care at a point in time (April - June 2023 quarter), we didn't have any control groups in our study. Adolescents who joined the two clinics after the study commenced also received the interventions from then on.

          1. What's the effect of including I.Vs that were collected at a single period? How do we even interpret them in the results, OR is it even necessary to have such I.Vs included in the model?
          2. How do we proceed with the model buildup, from a bivariate to a multivariate model?
          3. I want to believe that each outcome will have a completely different model.

          Thanks a bunch.
          Last edited by Abraham Oluka; 02 Dec 2024, 12:18.



          • #6
            Abraham:
            0) the two dependent variables cannot live together in the same -xt- regression equation. You can only investigate one of them at a time. Therefore, you should consider two panel data regressions;
            1) panels with only one observation will be treated as singletons;
            2) if your dependent variable is binary, see -xtlogit-;
            3) see my point 0).
            Kind regards,
            Carlo
            (StataNow 18.5)
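
Points 0) and 2) could be sketched roughly as follows. The data are simulated so the block runs on its own; id, time, sex, retained and the coefficients in the data-generating step are assumptions for illustration, not anything taken from the study. The same pattern would be repeated with the second outcome as the dependent variable.

```stata
* Simulate a two-period panel (hypothetical names and values throughout)
clear
set seed 2024
set obs 300
generate id  = _n
generate sex = 1 + (runiform() < 0.5)      // 1 = female, 2 = male
generate u   = rnormal()                   // patient-level random intercept
expand 2                                   // two rows per patient
bysort id: generate time = _n              // 1 = before, 2 = after
generate retained = runiform() < invlogit(0.3 + 0.7*(time == 2) + u)

* Declare the panel structure, then fit one model per outcome
xtset id time
xtlogit retained i.time i.sex, re or       // random-effects logit, odds ratios
xtlogit retained i.time, fe or             // conditional fixed-effects logit
```

In the fixed-effects fit, patients whose outcome is the same in both periods drop out of the estimation, and time-invariant covariates such as sex cannot be estimated. With no control group, the coefficient on time is a before/after contrast rather than a guaranteed intervention effect.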



            • #7
              Carlo Lazzaro's point 0 caught my eye, because I remember reading various resources describing how linear mixed models could be used to estimate multivariate models (e.g., Chapter 7 in this book by Jos Twisk and Chapter 7 again in this book by Heck, Thomas & Tabata). Prior to today, I had not seen anything about multivariate versions of generalized linear mixed models. But a quick search turned up a 2021 article that looks interesting:
              Achana F, Gallacher D, Oppong R, Kim S, Petrou S, Mason J, Crowther M. Multivariate Generalized Linear Mixed-Effects Models for the Analysis of Clinical Trial-Based Cost-Effectiveness Data. Med Decis Making. 2021 Aug;41(6):667-684. doi: 10.1177/0272989X211003880. Epub 2021 Apr 5. PMID: 33813933; PMCID: PMC8295965.
              From the abstract (with emphasis added):

              In this article, we extend the generalized linear mixed-model framework to enable simultaneous modeling of multiple outcomes of mixed data types, such as those typically encountered in trial-based economic evaluations, taking into account correlation of outcomes due to repeated measurements on the same individual and other clustering effects. We provide new wrapper functions to estimate the models in Stata and R by maximum and restricted maximum quasi-likelihood and compare the performance of the new routines with alternative implementations across a range of statistical programming packages.
              I hope this helps!



              --
              Bruce Weaver
              Version: Stata/MP 18.5 (Windows)



              • #8
                Thanks for the full references, Bruce. I had never read about statistics for multivariate cost-effectiveness analysis.
                Kind regards,
                Carlo
                (StataNow 18.5)



                • #9
                  biprobit?
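
If the suggestion is to model both binary outcomes jointly, a sketch might look like the following. Note that -biprobit- is not an -xt- command, so clustering the standard errors on the patient identifier is only an assumed way of allowing for the repeated measures. All variable names and the simulated data are hypothetical.

```stata
* Simulate two correlated binary outcomes over two periods (hypothetical values)
clear
set seed 2024
set obs 300
generate id = _n
generate u  = rnormal()                    // shared patient effect, induces correlation
expand 2                                   // two rows per patient
bysort id: generate time = _n              // 1 = before, 2 = after
generate retained   = (0.2 + 0.5*(time == 2) + u + rnormal()) > 0
generate suppressed = (0.1 + 0.4*(time == 2) + u + rnormal()) > 0

* Two probit equations with correlated errors; same regressors in both
biprobit retained suppressed i.time, vce(cluster id)
```

The output reports rho, the estimated correlation between the error terms of the two equations, alongside a separate coefficient on time for each outcome.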
