  • Inter-rater Reliability- help!

    Hi all,

    I am relatively new to both Stata and to statistics in general. I am working on a research project investigating inter-rater reliability between 3 different pathologists. So there are 3 raters per patient, and the diagnoses can take up to 15 different values. The data are set up so that each of the 3 columns is a different rater, with that rater's diagnoses listed under it, and each row is a different patient. My questions are:

    1. How would I go about doing this? I have been using the command "kap rater1 rater2 rater3" - is this correct, or should I be using a different command for Fleiss kappa since there are more than 2 raters?


    2. I have read a lot about weighted kappa, but since I am only looking at diagnoses (which are represented by a number), I don't need to use this, right? To be clearer: one pathologist might give a diagnosis of, let's say, 1, while another pathologist gives a diagnosis of 9. There is no scale of difference between the diagnoses - they either agree or they don't.


    3. For confidence intervals, can I just use "kapci rater1 rater2 rater3"?


    Thank you all for your help!
    Last edited by Nick Leader; 21 Dec 2016, 10:31.

  • #2
    1. -kap rater1 rater2 rater3- is correct for this problem.

    2. You are probably correct. It depends a bit on what the domain of diagnoses involved is. Conceivably one could treat, say, CIN3 and CIN4 as a partial agreement, closer than, say, CIN3 vs normal. But unless a bunch of your diagnoses fall into ordered scales like that, it would be best to characterize pairs of diagnoses as either agreeing or not. So there is probably no reason to use weights here (a small sketch follows at the end of this post).

    3. I'm not familiar with -kapci- so I can't help you with that. It's a user-written program from the Stata Journal. (The FAQ does request that you identify non-official Stata commands as such in your posts and indicate where they come from.) Its help file suggests that it will be suitable for your purpose, but I have no experience with it.
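
    To make the weighting question concrete, here is a minimal sketch of the difference. Note that -kap-'s wgt() option applies only when there are exactly two raters, and it is only sensible when the categories form an ordered scale; the ordinal variables grade1 and grade2 below are hypothetical, not from your data.

    Code:
    * nominal diagnoses: any disagreement counts the same, so no weights needed
    kap rater1 rater2

    * ordered categories (hypothetical grade1/grade2): wgt(w) credits near-misses as partial agreement
    kap grade1 grade2, wgt(w)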

    • #3
      Thank you so much, I appreciate it!!

      • #4
        For confidence intervals (and also for more than one specific agreement coefficient) you might be interested in kappaetc (from SSC), released just yesterday.

        Best
        Daniel

        • #5
          Thank you, Daniel, I will definitely download that.

          • #6
            kappaetc doesn't work for me since I have an older version of the program. Any other suggestions for getting a confidence interval for kappa? When I run "kapci" it gives me slightly different confidence intervals every time. Upon further reading, I think this is due to the use of "bootstraps" (?), which I am unfamiliar with. This is the message Stata gives me every time I run "kapci rater1 rater2 rater3": Note: default number of bootstrap replications has been set to 5 for syntax testing only. reps() needs to be increased when analysing real data.

            How/why do I go about increasing the number of "reps"?

            I obtained the code "kapci" by downloading it after discovering it on this site: http://www.stata-journal.com/sjpdf.h...iclenum=st0076

            Thanks in advance!

            • #7
              To specify the number of reps, just add it to the command as an option:

              Code:
              kapci rater1 rater2 rater3, reps(#)
              where you replace # with the number of reps you want. The larger the number you choose, the more stable the result will be, but the longer it will take to run. I'm not really sure what the sampling distribution of a kappa statistic looks like, but since it must be bounded, it is hard to imagine that anything more than 1000 replications would be needed to get a practical level of precision. And 1000 kappa calculations won't take very long at all. So I'd probably use reps(1000).

              Also, to make the calculations reproducible, you need to set the seed of the random number generator. If you already do that somewhere in your do-file before you get to -kapci-, then you don't need to do it again. If not, then I notice from the help file that you can do it with -kapci-'s -seed()- option. It doesn't matter what integer you pick for the seed: different values will give you slightly different results, but you will get the same results each time you re-run your do-file with the same specification of -seed()-. So
              Code:
              kapci rater1 rater2 rater3, reps(1000) seed(1234)
              will probably serve you well.

              As for learning about bootstrapping, the manual section* on the -bootstrap- command (in [R]), particularly the subsection called Introduction, has a pretty straightforward explanation of what it is and how it works. You may or may not want to pursue it in greater depth than that: if so, the references cited there are quite comprehensive.

              *I'm looking at the manual section for Stata version 14. Since you are using an earlier version, the documentation may differ. I don't have an older version available now, and, in any case, you didn't actually say which older version you're using.
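
              In case it helps to see the mechanics, here is a rough sketch of bootstrapping kappa yourself with the -bootstrap- prefix rather than through -kapci-. This assumes that -kap- leaves the overall estimate in r(kappa) in your version (check with -return list- first), and the prefix syntax shown is from current Stata, so it may need adjusting in an older release.

              Code:
              * see what kap stores, then bootstrap the overall kappa
              kap rater1 rater2 rater3
              return list
              bootstrap kappa=r(kappa), reps(1000) seed(1234): kap rater1 rater2 rater3
              estat bootstrap, percentile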

              • #8
                Originally posted by Nick Leader:
                kappaetc doesn't work for me since I have an older version of the program.
                This means older than Stata 11.2, which is quite old already and is the minimum version required for kappaetc (although I might be able to get it to run under Stata 10.1). Note that stating the version of Stata that you use (if not the latest) is requested in the FAQ and is indeed very helpful for understanding your problem and giving useful advice.

                Best
                Daniel

                • #9
                  Awesome explanation, thank you!!

                  • #10
                    Sorry, I never posted the version. I am unfortunately stuck with Stata version 8.

                    • #11
                      One last question...

                      How would I go about finding the percent agreement between the three raters with the same setup as described above? I cannot seem to find this process anywhere other than through apps on third-party sites, but I need to be able to perform the analysis in Stata. Thanks again in advance.

                      • #12
                        Well, you have three raters. So there are three different proportions (or percentages) of rater agreement: 1 vs 2, 1 vs 3, and 2 vs 3. You'll have to do those separately. When you run -kap- with just two raters, the agreement is part of the output (and is also stored in r(prop_o) if you need to store it in a macro).

                        Added: At least this is true in current Stata. My memory doesn't go back as far as version 8.2. But looking at the code in kappa.ado, it appears to have been written for version 6, so I assume this behavior is still the same.
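
                        If it saves some typing, here is a small sketch that loops over those three pairs and displays each observed proportion of agreement. It just automates the pairwise -kap- runs described above and assumes your variables are named rater1-rater3.

                        Code:
                        * run kap for each pair of raters and show the observed agreement
                        foreach pair in "rater1 rater2" "rater1 rater3" "rater2 rater3" {
                            quietly kap `pair'
                            display "`pair': observed agreement = " %6.4f r(prop_o)
                        }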

                        • #13
                          Is there a way to create a "dummy" variable that's equal to 1 (for example) if all 3 agree with each other? Wouldn't that allow "tab dummyvariable" to give the percent agreement?

                          I guess my main question is: is there a way to get the percentage of patients (entries) for whom ALL 3 raters agree (vs. not)?
                          Last edited by Nick Leader; 22 Dec 2016, 09:34.

                          • #14
                            Sure. That's easy.

                            Code:
                            gen byte all_agree = (rater1 == rater2) & (rater2 == rater3)
                            That won't give you proportions of agreement just among rater1 and rater2, for example. But if you only want to know how often all three agree, that's all there is to it.
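
                            And yes, as you suggested, tabulating (or summarizing) that 0/1 indicator then gives the percentage you asked about. A quick sketch, assuming the variable was created as above:

                            Code:
                            * share of patients on whom all three raters agree
                            tab all_agree

                            * equivalently, the mean of the 0/1 indicator is the proportion of full agreement
                            summarize all_agree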

                            • #15
                              Here is a code snippet, using tuples (from SSC), to get the percent agreement; it is supposed to work with Stata 8.

                              Code:
                              // example dataset
                              webuse p615b , clear
                              keep rater1-rater3
                              
                              // get percent agreement
                              tempname prop_o
                              scalar `prop_o' = 0
                              tuples 1 2 3 , min(2) max(2)
                              forvalues j = 1/`ntuples' {
                                  tokenize `tuple`j''
                                  kap rater`1' rater`2'
                                  scalar `prop_o' = `prop_o' + r(prop_o)
                              }
                              display `prop_o'/`ntuples'
                              Note that what you seem to be looking for (agreement among all three raters at once) is not equivalent to the percent agreement computed here.

                              Best
                              Daniel
