  • bootstrap problem

    Hi everybody,

    I am fitting a mixed-effects model on an imputed data set:

    mi estimate: mixed cot_at18 group cost_baseline || site: || cid:group, res(ind, by(group))

    It works on its own, but it does not work inside my program for bootstrapping:

    cap program drop test
    program define test, rclass
        mi estimate: mixed cot_at18 group cost_baseline || site: || cid:group, res(ind, by(group))
        matrix list e(b_mi)
        matrix b = e(b_mi)
        scalar icost = b[1,1]
        return scalar icost = icost
    end

    bootstrap doff_cost=r(icost), rep(2) seed(12345) cluster(_mi_m) saving(test.dta, replace): test

    could you please help me.
    Thank you so much

  • #2
    Combining bootstrapping with imputed data requires extra work. I have outlined two solutions here: https://www.preprints.org/manuscript/202401.0813/v1
    Best wishes

    (Stata 16.1 MP)


    • #3
      Note that I asked tech support about this several years ago. Here is their response and code, which I think I have previously posted on Statalist (along with my original question, which has the full cite):
      Code:
      I took a look at the article by Schomaker, M and Heumann, C (2018).
      If I understood correctly, actually "Method 2, MI Boot" is the easiest to implement.
      Here is an example:

      webuse mhouses1993s30, clear
      mi estimate, vceok: regress price tax sqft, ///
          vce(bootstrap, reps(10) seed(123))

      Notice that -vceok- is an undocumented option, which allows for unsupported VCE estimators.

      And below is an example of how I would implement "Method 1, MI Boot (pooled sample [PS])":

      webuse mhouses1993s30, clear
      mi convert flong
      mi query
      local M = r(M)
      set seed 123
      forvalues i = 1/`M' {
          quietly regress price tax sqft if _mi_m == `i', ///
              vce(bootstrap, saving(file`i', replace) reps(10))
      }
      use file1, clear
      forvalues i = 2/`M' {
          append using file`i'
          erase file`i'.dta
      }
      erase file1.dta
      bstat *
      estat bootstrap, percentile

      Notice that, for this method, you must work on flong -mi- style, which is why I use -mi convert-.

      I hope this is useful.

      Sincerely,
      Miguel

      *************************
      Miguel Dorta
      Senior Statistician
      [email protected]
      StataCorp LLC
      4905 Lakeway Drive
      College Station, TX 77845
      *************************

      You wrote:
      
      -----Begin Original Message-----
      Hi,

      I have already imputed the missing values (mP) and wish now to bootstrap the model settled on using the MI data.

      According to Schomaker, M and Heumann, C (2018), "Bootstrap inference when using multiple imputation", _Statistics in Medicine_, 37: 2252-2266, there are 3 acceptable strategies for combining MI and bootstrap; since I have already imputed the data, and analyzed the imputed data, I don't want to use their strategy 4 (bootstrap, then MI; see p. 2255; note that they found that their strategy 3 was not acceptable). I note that UCLA has a FAQ on this issue but it appears to implement strategy 4. I attach the article.

      So I wish to use either their strategy 1 (MI, then bootstrap and estimate and use pooled sample) or their strategy 2 (MI, then bootstrap). (p. 2255 of Schomaker/Heumann article)

      I think that strategy 1 would be "easier" and here is more detail on that: bootstrap each imputed data set so have mXb data sets; in each of these data sets, estimate quantity of interest (here several coefficients). Save a pooled sample of ordered estimates and construct CI based on percentiles of the ordered estimates. Although they say to bootstrap first and then estimate each data set, it is not clear to me how to do this; it is also not clear that this is the best way to implement this strategy (i.e., maybe bootstrap one data set, estimate and save the estimates and then bootstrap (etc.), again????)

      So, what is the best way to produce the bootstrap data sets? Is there a way to produce all data sets at once and estimate (using, e.g., bootstrap) or is it better to bootstrap one data set and estimate (using, e.g., bsample) and post the results, building up a total that way? Alternatively, can I use bootstrap across the m data sets (using strata or cluster) to do all at once? Or, do I need to do within each value of m using statsby (or runby) to save the results?

      I note that the N for each imputed data set is 486 so I am, generally, not worried about speed (as the data set is small).
      Last edited by Rich Goldstein; 18 Nov 2024, 13:19.


      • #4
        Rich Goldstein

        Thank you so much for your response.
        I did the following:

        use "mi_imputed.dta", clear

        * Convert the dataset to long format for imputation
        mi convert flong

        * Get the number of imputations
        mi query
        local M = r(M)

        * Set random seed for reproducibility
        set seed 123

        * Initialize accumulators for delta_cost and delta_qaly
        local delta_cost = 0
        local delta_qaly = 0

        * Perform mixed model regression and accumulate delta_cost and delta_qaly
        forvalues i = 1/`M' {
            mixed y1 i.cost i.cost#i.group i.cost#c.x if _mi_m == `i' || site: || cid:group || pid:, nocons reml res(uns, t(cost))

            * Add this imputation's coefficients for delta cost and delta qaly to the running totals
            matrix b = e(b)
            local delta_cost = `delta_cost' + b[1,6]
            local delta_qaly = `delta_qaly' + b[1,4]
        }

        * Compute the average delta_cost and delta_qaly over all imputations
        local delta_cost = `delta_cost' / `M'
        local delta_qaly = `delta_qaly' / `M'

        * Store scalars for delta cost and delta qaly
        * (-return scalar- is only valid inside an rclass program)
        scalar delta_cost = `delta_cost'
        scalar delta_qaly = `delta_qaly'

        * Bootstrap for confidence intervals
        forvalues i = 1/`M' {
            quietly regress y1 x1 sqft if _mi_m == `i', vce(bootstrap, saving(file`i', replace) reps(500))
        }

        * Combine the bootstrap results
        use file1, clear
        forvalues i = 2/`M' {
            append using file`i'
            erase file`i'.dta
        }

        * Clean up the temporary files
        erase file1.dta

        * Summary statistics and confidence intervals for the bootstrap results
        bstat *
        estat bootstrap, percentile

        But I am not confident in it. Could you please help me modify it?
        Last edited by Hiro farabi; 18 Nov 2024, 15:23.
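
        The pooled-sample approach from #3 might be adapted to the mixed model above roughly as follows. This is a minimal, untested sketch under several assumptions, not a definitive implementation. The wrapper name fitdelta, the variable bsite, and the file names boot1, boot2, ... are hypothetical. The coefficient positions 6 and 4 are copied from the code above and should be checked against e(b). Resampling whole sites with cluster() and idcluster() is an assumption about the appropriate resampling unit; the thread does not establish it.

        ```stata
        * Hypothetical wrapper: fit the model on the data in memory and
        * return the two coefficients of interest
        capture program drop fitdelta
        program define fitdelta, rclass
            mixed y1 i.cost i.cost#i.group i.cost#c.x ///
                || bsite: || cid: group || pid:, nocons reml res(uns, t(cost))
            matrix b = e(b)
            * Positions 6 and 4 are taken as-is from the post above
            return scalar delta_cost = b[1,6]
            return scalar delta_qaly = b[1,4]
        end

        use "mi_imputed.dta", clear
        mi convert flong
        mi query
        local M = r(M)

        * Copy of site that idcluster() can overwrite, so that a site drawn
        * twice is treated as two distinct clusters in the model
        clonevar bsite = site
        set seed 123

        * Bootstrap within each imputed data set and save the replicates
        forvalues i = 1/`M' {
            preserve
            keep if _mi_m == `i'
            quietly bootstrap delta_cost=r(delta_cost) delta_qaly=r(delta_qaly), ///
                reps(500) cluster(site) idcluster(bsite) ///
                saving(boot`i', replace): fitdelta
            restore
        }

        * Pool the replicates from all imputations
        use boot1, clear
        forvalues i = 2/`M' {
            append using boot`i'
        }

        * Percentile interval from the pooled replicates
        foreach v in delta_cost delta_qaly {
            _pctile `v', percentiles(2.5 97.5)
            display "`v': 95% percentile interval " r(r1) " to " r(r2)
        }
        ```

        Because the model is fitted inside the wrapper, the bootstrap replicates come from the same mixed model as the point estimates, rather than from a separate -regress- call.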
