  • Help with calculating the SDs for between and within measurements (Precision test)

    Dear all in Statalist,

    In our studies, we perform a precision test (closeness of agreement between repeated measurements) on the same blood sample analyzed on 3 analyzers of the same model placed at 3 different laboratories (Sites). The blood sample is analyzed in duplicate (Run) for 8 consecutive days (Day). The results are then analyzed with the Analyse-it add-in for Excel to calculate the "Between" and "Within" SDs and CVs (%) for Sites, Days, and Runs. The analysis in Analyse-it is called Measurement Systems Analysis (MSA).

    As we are now trying to perform all our statistical analyses in Stata, I need help with performing the precision test in Stata. Using the -xtset- and -xtsum- commands, I was able to get very similar results for the SDs of Between Day, Within Site, and Between Site, but not for Within Run and Between Run. Is there another way of calculating the SDs and CVs (%) for the different components (within and between) that gives exactly the same results as Analyse-it, based on the mean of T3 (3.871562)? And perhaps the same table that Analyse-it generates (see the last code block below)?

    My data look like this:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 id str19 date byte(site day run) float t3
    "1170752+" "2017-10-04T08:59:35" 1 1 1 3.87
    "1170752+" "2017-10-04T09:00:57" 1 1 1 3.93
    "1170752+" "2017-10-04T13:38:17" 1 1 2 3.93
    "1170752+" "2017-10-04T13:40:41" 1 1 2 3.91
    "1170752+" "2017-10-06T07:14:34" 1 2 1 3.92
    "1170752+" "2017-10-06T12:52:25" 1 2 1 3.91
    "1170752+" "2017-10-06T15:10:00" 1 2 2 3.91
    "1170752+" "2017-10-06T15:11:21" 1 2 2 3.98
    "1170752+" "2017-10-09T07:33:06" 1 3 1 3.89
    "1170752+" "2017-10-09T07:38:45" 1 3 1 3.91
    "1170752+" "2017-10-09T12:41:51" 1 3 2 3.91
    "1170752+" "2017-10-09T12:43:10" 1 3 2 3.92
    "1170752+" "2017-10-10T07:34:53" 1 4 1 3.92
    "1170752+" "2017-10-10T07:36:15" 1 4 1 3.92
    "1170752+" "2017-10-10T15:23:59" 1 4 2 3.85
    "1170752+" "2017-10-10T15:25:20" 1 4 2 3.95
    "1170752+" "2017-10-11T07:25:29" 1 5 1 3.95
    "1170752+" "2017-10-11T07:26:54" 1 5 1 3.94
    "1170752+" "2017-10-11T15:04:10" 1 5 2  3.9
    "1170752+" "2017-10-11T15:05:27" 1 5 2 3.92
    "1170752+" "2017-10-12T07:34:30" 1 6 1 3.92
    "1170752+" "2017-10-12T07:35:45" 1 6 1  3.9
    "1170752+" "2017-10-12T13:34:37" 1 6 2 3.94
    "1170752+" "2017-10-12T13:35:55" 1 6 2 3.91
    "1170752+" "2017-10-13T07:18:52" 1 7 1 3.87
    "1170752+" "2017-10-13T07:20:06" 1 7 1 3.93
    "1170752+" "2017-10-13T14:17:51" 1 7 2 3.86
    "1170752+" "2017-10-13T14:19:15" 1 7 2 3.89
    "1170752+" "2017-10-16T08:42:13" 1 8 1 3.87
    "1170752+" "2017-10-16T08:43:29" 1 8 1  3.9
    "1170752+" "2017-10-16T14:39:51" 1 8 2  3.9
    "1170752+" "2017-10-16T14:41:06" 1 8 2 3.87
    "1170752+" "2017-09-26T13:33:15" 2 1 1 3.95
    "1170752+" "2017-09-26T14:29:13" 2 1 1 3.83
    "1170752+" "2017-09-26T16:06:37" 2 1 2 3.89
    "1170752+" "2017-09-26T16:07:47" 2 1 2 3.88
    "1170752+" "2017-09-27T08:55:48" 2 2 1 3.92
    "1170752+" "2017-09-27T08:57:11" 2 2 1 3.86
    "1170752+" "2017-09-27T17:00:45" 2 2 2 3.91
    "1170752+" "2017-09-27T17:01:57" 2 2 2 3.88
    "1170752+" "2017-09-28T10:26:13" 2 3 1 3.87
    "1170752+" "2017-09-28T10:35:04" 2 3 1 3.88
    "1170752+" "2017-09-28T20:22:30" 2 3 2  3.9
    "1170752+" "2017-09-28T20:25:04" 2 3 2 3.82
    "1170752+" "2017-09-29T08:30:44" 2 4 1 3.83
    "1170752+" "2017-09-29T08:31:59" 2 4 1 3.82
    "1170752+" "2017-09-29T16:39:24" 2 4 2  3.8
    "1170752+" "2017-09-29T16:42:02" 2 4 2 3.87
    "1170752+" "2017-10-02T08:36:05" 2 5 1 3.81
    "1170752+" "2017-10-02T08:39:45" 2 5 1 3.81
    "1170752+" "2017-10-02T17:05:34" 2 5 2 3.78
    "1170752+" "2017-10-02T17:08:39" 2 5 2 3.84
    "1170752+" "2017-10-04T08:15:45" 2 6 1 3.95
    "1170752+" "2017-10-04T08:17:48" 2 6 1 3.88
    "1170752+" "2017-10-04T16:11:11" 2 6 2 3.91
    "1170752+" "2017-10-04T16:14:31" 2 6 2 3.91
    "1170752+" "2017-10-06T09:25:57" 2 7 1 3.81
    "1170752+" "2017-10-06T09:27:11" 2 7 1 3.84
    "1170752+" "2017-10-06T19:58:50" 2 7 2 3.84
    "1170752+" "2017-10-06T20:04:25" 2 7 2  3.8
    "1170752+" "2017-10-07T13:59:14" 2 8 1  3.8
    "1170752+" "2017-10-07T14:01:36" 2 8 1 3.83
    "1170752+" "2017-10-07T15:49:51" 2 8 2 3.84
    "1170752+" "2017-10-07T15:51:05" 2 8 2 3.78
    "1170752+" "2017-10-11T07:29:34" 3 1 1 3.89
    "1170752+" "2017-10-11T07:30:57" 3 1 1 3.85
    "1170752+" "2017-10-11T15:08:08" 3 1 2 3.86
    "1170752+" "2017-10-11T15:09:26" 3 1 2 3.96
    "1170752+" "2017-10-12T07:38:28" 3 2 1 3.87
    "1170752+" "2017-10-12T07:39:53" 3 2 1 3.86
    "1170752+" "2017-10-12T13:38:39" 3 2 2 3.96
    "1170752+" "2017-10-12T13:39:59" 3 2 2 3.92
    "1170752+" "2017-10-13T07:22:53" 3 3 1 3.85
    "1170752+" "2017-10-13T07:24:07" 3 3 1 3.85
    "1170752+" "2017-10-13T14:21:21" 3 3 2 3.84
    "1170752+" "2017-10-13T14:22:35" 3 3 2 3.81
    "1170752+" "2017-10-16T08:46:11" 3 4 1 3.81
    "1170752+" "2017-10-16T08:47:31" 3 4 1 3.81
    "1170752+" "2017-10-16T14:43:48" 3 4 2 3.83
    "1170752+" "2017-10-16T14:45:05" 3 4 2 3.79
    "1170752+" "2017-10-17T07:24:32" 3 5 1  3.8
    "1170752+" "2017-10-17T07:26:29" 3 5 1 3.81
    "1170752+" "2017-10-17T15:15:41" 3 5 2  3.8
    "1170752+" "2017-10-17T15:16:56" 3 5 2 3.91
    "1170752+" "2017-10-18T06:50:14" 3 6 1 3.79
    "1170752+" "2017-10-18T07:37:53" 3 6 1 3.91
    "1170752+" "2017-10-18T07:39:13" 3 6 2 3.86
    "1170752+" "2017-10-18T13:40:53" 3 6 2 3.86
    "1170752+" "2017-10-19T07:59:16" 3 7 1 3.87
    "1170752+" "2017-10-19T08:00:46" 3 7 1 3.85
    "1170752+" "2017-10-19T13:42:18" 3 7 2  3.8
    "1170752+" "2017-10-19T13:44:14" 3 7 2 3.82
    "1170752+" "2017-10-20T07:18:04" 3 8 1 3.82
    "1170752+" "2017-10-20T07:19:33" 3 8 1 3.85
    "1170752+" "2017-10-20T14:51:12" 3 8 2 3.85
    "1170752+" "2017-10-20T14:52:32" 3 8 2 3.87
    end
    Here is the code I used to calculate the SDs; however, the results are not exactly the same as those generated in Analyse-it.
    Code:
    /* similar results */
    xtset day
    xtsum t3
    
    /* similar results */
    xtset site
    xtsum t3
         
    /* different results */
    xtset run
    xtsum t3
    Here are the results generated by Analyse-it.
    Code:
      
    Component      CV     Exact/Satterthwaite 95% CI   SD      Expected SD/CV   p-value
    Within Run     0.9%   0.8% to 1.1%                 0.035   2.5%             1.0000 (1)
    Between Run    0.0%                                0.000
    Between Day    0.7%                                0.028
    Within Site    1.2%   1.0% to 1.4%                 0.045
    Between Site   0.8%                                0.031
    Total          1.4%   1.0% to 2.3%                 0.054   2.5%             0.9866 (1)

    H0: σ ≤ σ0  The imprecision is less than or equal to the expected imprecision.
    H1: σ > σ0  The imprecision is greater than the expected imprecision.
    (1) Do not reject the null hypothesis at the 5% significance level.
    I appreciate your help.

  • #2
    Why is Analyse-it reporting the total SD (I assume this is the same as the overall SD) as 0.054?

    Code:
    . sum t3
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
              t3 |         96    3.871562    .0488032       3.78       3.98


    • #3
      Dear Scott. Sorry for the late reply. The calculations in Analyse-it are based on the EP05-A3 standard for the evaluation of precision. When I run the ANOVA in Stata, I get exactly the same results as in Analyse-it.

      Code:
      anova t3 site / day|site / run|day|site
      
                               Number of obs =         96    R-squared     =  0.7390
                               Root MSE      =    .035074    Adj R-squared =  0.4835
      
                        Source | Partial SS         df         MS        F    Prob>F
                  -------------+----------------------------------------------------
                         Model |  .16721602         47   .00355779      2.89  0.0002
                               |
                          site |  .06881893          2   .03440947      8.99  0.0015
                      day|site |  .08037207         21   .00382724  
                  -------------+----------------------------------------------------
                      day|site |  .08037207         21   .00382724      5.10  0.0001
                  run|day|site |  .01802502         24   .00075104  
                  -------------+----------------------------------------------------
                               |
                      Residual |  .05905011         48   .00123021  
                  -------------+----------------------------------------------------
                         Total |  .22626613         95   .00238175
      I used the ANOVA results above and the formulas in the EP05-A3 standard to replicate the results for the within and between SDs and got the following:

      Code:
      Verror = MSerror = 0.00123021
      Vday   = (MSday - MSerror)/(nrun*nrep)  = (0.00382724 - 0.00123021)/(2*2) = 0.00064926
      Vsite  = (MSsite - MSday)/(nday*nrep)   = (0.03440947 - 0.00382724)/(8*2) = 0.00191138
      Vrun   = (MSrun - MSerror)/nrep         = (0.00075104 - 0.00123021)/2     = -0.0002396
      
      Just to check:
      Within Run:  SDR  = √(Verror) = √0.00123021 = 0.0351
      Within Site: SDWS = √(Vday + Verror) = √(0.00064926 + 0.00123021) = 0.04335279
      
      Now the total SD:
      SDTOTAL = √(Vsite + Vday + Vrun + Verror)
              = √(0.00191138 + 0.00064926 + (-0.0002396) + 0.00123021) = 0.05959249
      
      V = variance
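      The arithmetic above can also be double-checked directly in Stata by typing the mean squares from the ANOVA table into -display- (just a quick sketch of the same hand calculations):
      Code:
      display "V_error = " 0.00123021
      display "V_day   = " (0.00382724 - 0.00123021)/(2*2)
      display "V_site  = " (0.03440947 - 0.00382724)/(8*2)
      display "V_run   = " (0.00075104 - 0.00123021)/2
      display "SD within run  = " sqrt(0.00123021)
      display "SD within site = " sqrt(0.00064926 + 0.00123021)
      display "SD total       = " sqrt(0.00191138 + 0.00064926 - 0.0002396 + 0.00123021)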
      Minor deviations from the Analyse-it results could be due to the number of decimals used in the calculations.
      Now I need to get exactly the same SD results (or very close ones) as in Analyse-it. It would be very complicated if I had to calculate all the SDs by hand every time I run this test. Is there any code that makes this easier? Maybe the -xtset- and -xtsum- commands need to be modified a bit?

      Thank you in advance!


      • #4
        Here is an attempt using the information returned by -anova-:
        Code:
        anova t3 site / day|site / run|day|site
        
        /* variance components from the nested ANOVA mean squares */
        local V_error = e(rss)/e(df_r)
        local V_r = max(0, ((e(ss_3)/e(df_3)) - e(rss)/e(df_r))/2)     /* 2 replicates per run; truncated at 0 */
        local V_d = ((e(ss_2)/e(df_2)) - e(rss)/e(df_r))/(2*2)         /* 2 runs x 2 replicates per day */
        local V_s = ((e(ss_1)/e(df_1)) - e(rss)/e(df_r))/(2*2*8)       /* 8 days x 2 runs x 2 replicates per site */
        local SD_total = round(sqrt(`V_s' + `V_d' + `V_r' + `V_error'), .001)
        local SD_r     = round(sqrt(`V_r'), .001)
        local SD_d     = round(sqrt(`V_d'), .001)
        local SD_s     = round(sqrt(`V_s'), .001)
        local SD_w_s   = round(sqrt(`V_d' + `V_error'), .001)
        local SD_error = round(sqrt(`V_error'), .001)
        
        /* write the components to Excel, then read them back as a table */
        putexcel set test.xlsx, replace open
        putexcel A1 = "Component"
        putexcel B1 = "SD"
        putexcel A2 = "Error"
        putexcel A3 = "Between Run"
        putexcel A4 = "Between Days"
        putexcel A5 = "Within Sites"
        putexcel A6 = "Between Sites"
        putexcel A7 = "Total"
        putexcel B2 = `SD_error'
        putexcel B3 = `SD_r'
        putexcel B4 = `SD_d'
        putexcel B5 = `SD_w_s'
        putexcel B6 = `SD_s'
        putexcel B7 = `SD_total'
        putexcel close
        
        import excel "C:\Users\Scott\Desktop\tmp\test.xlsx", sheet("Sheet1") clear firstrow
        list, sep(0) noobs
        which returns:

        Code:
          +----------------------+
          |     Component     SD |
          |----------------------|
          |         Error   .035 |
          |   Between Run      0 |
          |  Between Days   .025 |
          |  Within Sites   .043 |
          | Between Sites   .032 |
          |         Total   .054 |
          +----------------------+
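        If the Excel round trip isn't needed, a variant of the same calculation could keep everything in Stata by stacking the SDs in a matrix (a rough sketch reusing the component formulas above; matrix row names are written with underscores):

        Code:
        anova t3 site / day|site / run|day|site
        
        local V_error = e(rss)/e(df_r)
        local V_r = max(0, ((e(ss_3)/e(df_3)) - e(rss)/e(df_r))/2)
        local V_d = ((e(ss_2)/e(df_2)) - e(rss)/e(df_r))/(2*2)
        local V_s = ((e(ss_1)/e(df_1)) - e(rss)/e(df_r))/(2*2*8)
        
        /* one-column matrix of SDs, in the same order as the table above */
        matrix SD = ( sqrt(`V_error') \ sqrt(`V_r') \ sqrt(`V_d') \ ///
                      sqrt(`V_d' + `V_error') \ sqrt(`V_s') \ ///
                      sqrt(`V_s' + `V_d' + `V_r' + `V_error') )
        matrix rownames SD = Error Between_Run Between_Days Within_Sites Between_Sites Total
        matrix colnames SD = SD
        matrix list SD, format(%9.3f)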


        • #5
          Perfect! Thanks a lot, Scott, for your help. I appreciate it!

