  • Inter-rater reliability with missing observations

    Hi everyone,

    I am trying to calculate inter-rater reliability for my data but am struggling because of missing data.

    In this dummy dataset (very similar to my own data but a smaller sample), I have 9 raters (1-9) who have each scored (score) 4 vignettes (Vignette, 1-4) out of 100. The 9 raters are constant throughout; however, not all raters completed the questionnaire, so some vignettes have been rated by only 7 or 8 raters. My data are currently in long format,

    e.g.
    ID Vignette Score
    1 1 8
    1 2 32
    1 3 8
    1 4 65
    2 1 16
    2 2 16

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float ID byte Vignette float score
    1 1  8
    1 2 32
    1 3  8
    1 4 65
    2 1 16
    2 2 16
    2 3  6
    2 4 50
    3 1 14
    3 2 14
    3 3 14
    3 4 32
    4 1  8
    4 2  8
    4 3 16
    4 4 32
    5 1  0
    5 2  0
    5 3 32
    5 4  .
    6 1 14
    6 2 32
    6 3 14
    6 4 16
    7 1  0
    7 2 16
    7 3  8
    7 4 60
    8 1 16
    8 2 14
    8 3  0
    8 4 65
    9 1  8
    9 2  0
    9 3  .
    9 4  .
    end
    (Apologies if this is not the correct way to post my data; please let me know!)

    My initial thought, before encountering the missing data, was to calculate the ICC using a two-way random-effects model; however, Stata excludes two vignettes because of the missing values. In my real dataset I have more vignettes, and in some cases the majority are excluded because of missing data.
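
    For reference, the call I was using looks along these lines (a sketch on the long-format data above, using Stata's built-in icc command, which fits a two-way random-effects model by default when a rater variable is supplied):

    Code:
    icc score Vignette ID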

    What is the best way to calculate inter-rater reliability for this data, taking into consideration the missing values?

    Thank you so much,
    Olivia

  • #2
    Note that Stata's icc command requires a balanced design. It will not only delete cases (observations) with missing values; it will also omit all vignettes that have not been rated by each of the 9 raters.
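
    One quick way to see which vignettes fall short of the full 9 ratings (and will hence be dropped) is to tabulate the non-missing scores in the long data:

    Code:
    tabulate Vignette if !missing(score)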

    kappaetc (SSC or SJ; at this point the two are identical) estimates inter-rater reliability for unbalanced designs and in the presence of missing values.
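
    If it is not yet installed, it can be obtained from SSC:

    Code:
    ssc install kappaetc

    With the data above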

    Code:
    reshape wide score , i(Vignette) j(ID)
    kappaetc score* , icc(random)
    yields

    Code:
    (output omitted)
    
    . kappaetc score* , icc(random)
    
    Interrater reliability                           Number of subjects =       4
    Two-way random-effects model               Ratings per subject: min =       7
                                                                    avg =    8.25
                                                                    max =       9
    ------------------------------------------------------------------------------
                   |   Coef.     F     df1     df2      P>F   [95% Conf. Interval]
    ---------------+--------------------------------------------------------------
          ICC(2,1) |  0.6088  11.08     3.00   21.00   0.000    0.2105     0.9525
    ---------------+--------------------------------------------------------------
           sigma_s | 15.4766
           sigma_r |  0.0000 (replaced)
           sigma_e | 12.4072
    ------------------------------------------------------------------------------
    Note: F test and confidence intervals are based on methods for complete data.

    The methods and formulas that kappaetc implements are discussed in Gwet (2014, ch. 7-10).

    Best
    Daniel

    Gwet, K. L. (2014). Handbook of Inter-Rater Reliability. Gaithersburg, MD: Advanced Analytics, LLC.


    • #3
      Hi Daniel,

      Thank you for that, kappaetc is very useful.

      I have one follow up question:

      Code:
      reshape wide score , i(Vignette) j(ID)
      kappaetc score* , icc(random)


      I have added the code above to my .do file; however, each time I run the file the ICC returned is slightly different.

      For example:
      1st run of do file: ICC = 0.6039
      clear
      2nd run of do file: ICC = 0.5896
      clear
      3rd run of do file: ICC = 0.6201


      I did not change anything in my do file between runs.

      Can you think of a reason for this?

      Thanks,
      Olivia


      • #4
        Olivia

        Thanks for reporting back. I cannot reproduce this problem with the example dataset. Do you have the latest version of kappaetc installed? The latest version is

        Code:
        . which kappaetc
        ...
        *! version 2.0.0 28jun2018 daniel klein
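
        If yours is older, updating from SSC will fetch the current version:

        Code:
        adoupdate kappaetc, update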
        Also, do you have repeated measures for vignettes and raters? That is, does the same rater score the same vignette repeatedly? If so, you should indicate the vignette identifier in the i() option.
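
        If I read that suggestion right, the call would look something like this (a sketch of the syntax only; I am assuming i() takes the identifier of the repeatedly scored subjects, here Vignette):

        Code:
        * sketch: i() assumed to name the subject (vignette) identifier
        kappaetc score* , icc(random) i(Vignette)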

        Best
        Daniel


        • #5
          Hi Daniel,

          Thank you for your reply and apologies for my delayed response.

          After looking over my do file, it seems the reason for this was further up in my code, where I was collapsing variables. Once I have fixed this, I expect the ICC to remain constant.
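
          For anyone running into the same thing: results that change across otherwise identical runs often trace back to an order-dependent step after a sort with tied keys, because Stata breaks ties in random order. A defensive pattern (a hypothetical sketch, reusing the variable names from above):

          Code:
          * sort on enough keys to break all ties, so an order-dependent
          * step such as keeping the first observation per group is
          * reproducible across runs
          sort ID Vignette score
          by ID Vignette: keep if _n == 1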

          Thanks again,
          Olivia
