
  • "Group variable" with mi estimate: xtmixed

    Hello. I am running a linear mixed model with imputed data. Here is the code in generalized form:

    Code:
                **##### Linear Mixed Model ~~~~ ~~~~ 
                mi estimate, dots: xtmixed outcome i.treatment##i.timepoint i.sex c.age i.district i.other_programs c.covid1 c.covid2 c.covid3 || cluster_id: || child_id: , reml cov(unstructured)
    There are N=555 participants in this particular sample.

    The output Group variable table shows fewer participants, probably due to missingness somewhere.

    Code:
    Multiple-imputation estimates                   Imputations       =         20
    Mixed-effects REML regression                   Number of obs     =      1,042

     Grouping information
    -------------------------------------------------------------
                    |     No. of       Observations per group
     Group variable |     groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
       cluster_id~r |        183          1        5.7         20
           child_id |        551          1        1.9          2
    -------------------------------------------------------------
    The table shows 551 children are included in the estimation.

    I want to know the N of children in the control group versus the treatment group. Everything I have tried returns the statistics from the full sample, not the estimation sample. I need these values because I am trying to calculate the 95% CI around the effect size using the calculator from Campbell Collaboration.

    Here are some things I have tried (this is a little embarrassing but you can see I am sort of working blind here):


    Code:
    *** Compute quantities of interest
    local nCols: colsof e(N_g)
    local nObs = e(N_g)[1,`nCols']

    *** Treatment levels
    levelsof `treatment', local(treatLevs)
    local nObs_treat ""
    foreach lev of local treatLevs {
        qui count if `treatment' == `lev'
        local nObs_treat "`nObs_treat', N at Treatment=`lev', `nObs'"
        local ntreat`lev' "`nObs'"
    }

    Code:
    *** Treatment levels
    levelsof `treatment', local(treatLevs)
    local dvNew_tx ""
    foreach lev of local treatLevs {
        sum `dv' if `treatment' == `lev'
        local dvN_tx = r(N) if `treatment'==1
        local dvN_ct = r(N) if `treatment'==0
        local dvNew_tx "`dvNew_tx', dvN_tx at treatment==1, `dvN_tx', dvN_ct at treatment==0, `dvN_ct'"
    }

    Code:
    local nObs_treat ""
    foreach lev of local treatLevs {
        local nObs_treat "`nObs_treat', N at Treatment=`lev', `e(N_g)'"
        local nObs`lev' "` e(N_g)'"
    }

    I then called these macros as part of outreg2, e.g.:

    Code:
    *** Extracting Model Interaction Coefficient
    outreg2 using "${tables}/outcome_1105", dta replace sideway keep(1.treatment#1.timepoint) stats(coef ci pval) eform  ///
                 eqkeep(`e(depvar)') ctitle("Model Interaction"; "CI"; "pvalue")  ///
                 noaster nocon nonotes noobs noni paren(ci) ///
                 adds(dvMean, `s(dvMean)', dvSDev, `s(dvSDev)' `s(dvStats_tp)', N obs, `s(nObs)' `s(treatList)' `s(nObs_treat)')
    The above code just includes one of my attempts "`s(nObs_treat)')". None worked.

    I then tried returning the e(N_g) matrix after the estimation, but that only tells me the numbers I already know. I do not know how to manipulate matrices to get the estimation sample size by treatment group.

    Unfortunately I cannot share my data here, but I appreciate any input if available. Thank you.

  • #2
    Every estimation command that returns results to e() supports -e(sample)-, which is something like a vector function over the dataset. Outside of the context of -mi-, you would call -xtmixed- and then immediately do something like this:

    Code:
    gen byte used = e(sample)
    Now you have a binary variable that indicates which observations were used, and which weren't, to do with as you please.
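
    As a self-contained illustration (not the original poster's data), here is a minimal sketch using the auto dataset that ships with Stata. The variable rep78 has missing values, so the estimation sample is smaller than the full dataset. The model and the name used are chosen only for illustration.

    ```stata
    * Illustration only: auto.dta stands in for the real data
    sysuse auto, clear
    regress price mpg rep78      // cars with missing rep78 are dropped from the model
    gen byte used = e(sample)    // 1 = observation used in the estimation, 0 = not used
    tab foreign if used          // group counts restricted to the estimation sample
    tab foreign                  // for comparison: group counts in the full dataset
    ```

    The same pattern should apply after any non-mi estimation command: generate the indicator immediately after the model, before running anything else that replaces the e() results.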

    By the way, you should update your syntax to use -mixed-, which is the replacement for -xtmixed-.



    • #3
      I think I missed this part of the original question:

      "I want to know the N of children in the control group versus the treatment group."

      So that would be something like

      Code:
      mixed ....
      gen byte used = e(sample)
      bysort child_id used : gen byte which = (_n == 1) & used // flag one row per child in the estimation sample
      tab treatment if which // number of children by treatment group
      tab child_id treatment if which // each child id by treatment group



      • #4
        Originally posted by Leonardo Guizzetti
        Every estimation command that returns results to e() supports -e(sample)-, which is something like a vector function over the dataset. Outside of the context of -mi-, you would call -xtmixed- and then immediately do something like this:

        Code:
        gen byte used = e(sample)
        Now you have a binary variable that indicates which observations were used, and which weren't, to do with as you please.

        By the way, you should update your syntax to use -mixed- which is the replacement of -xtmixed-.
        Thank you so much for this suggestion. I have tried it, and I am getting just zeroes. I am not sure if the fact that I am running a series of models makes a difference. I have updated the syntax following each model, e.g.:

        Code:
        gen byte outcome1_used = e(sample)  
        
        gen byte outcome2_used = e(sample)
        I place each statement immediately after the model:

        Code:
         **##### Linear Mixed Model ~~~~ ~~~~  
        mi estimate, dots: xtmixed outcome1 i.treatment##i.timepoint i.sex c.age i.district i.other_programs c.covid1 c.covid2 c.covid3 || cluster_id: || child_id: , reml cov(unstructured)
        
        gen byte outcome1_used = e(sample)
        What am I missing? Thank you again.



        • #5
          Edit: scratch everything I say below the line. -mi estimate- has an -esample()- option. Use that.
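
          A minimal sketch of that approach, assuming the variable names from post #1. The names esamp and onechild are hypothetical. As I understand the documentation, -esample()- requires the mi data to be in flong or flongsep style, and the new variable is 0 in the original data (m=0), so the tabulation below looks at a single imputation.

          ```stata
          * Sketch only, not tested on the poster's data
          * esample() needs flong or flongsep style; convert if the data are wide or mlong
          mi convert flong, clear

          * esamp (hypothetical name) will hold e(sample) for each imputation
          mi estimate, dots esample(esamp): mixed outcome i.treatment##i.timepoint ///
              i.sex c.age i.district i.other_programs c.covid1 c.covid2 c.covid3   ///
              || cluster_id: || child_id: , reml cov(unstructured)

          * Tag one row per child that was used in imputation m=1
          egen byte onechild = tag(child_id) if esamp == 1 & _mi_m == 1

          * Number of children by treatment group in that estimation sample
          tab treatment if onechild == 1
          ```

          If the estimation sample might differ across imputations, repeating the last two commands for other values of _mi_m would show whether the counts change.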

          ~~~~~~~
          Like I said, this doesn't work following -mi- because the individual results from each iteration over the imputed datasets are not saved.

          I'm not able to code something up just now, but here's a sketch of what I'd try next. If you know from your imputation method that you will always have the same number of children included in each model for the imputed datasets, you could execute the model on just one imputed dataset using -mi xeq-. I think the following would work.

          Code:
          mi xeq 1: mixed ....
          mi xeq 1: gen byte outcome = e(sample)
          If the estimation sample varies, then that's more complicated. I would start by saving the model estimates for each imputed dataset using the -saving()- option of -mi estimate-, then use a loop to iterate over each one and do essentially the same process as above.



          • #6
            Hello Leonardo,

            Circling back around to let you know that this worked. Ultimately though I went back through my imputation models to find out why the data were missing in the first place. Making the needed adjustments resulted in a complete dataset so the treatment/control numbers were no longer a mystery!

            Thank you so much for your input! I really appreciate it.

            Best wishes,
            Candace



            • #7
              Thanks for following up to close the thread. Glad you got it working.
