  • Sample size calculation for negative predictive value

    Dear Statalist

    I am involved in a project that compares two binary diagnostic tests (one used as a gold standard). I need to calculate the minimum required sample size for a given/expected negative predictive value when one test is compared against the other. I am unclear how to run this. Could you please help?

    Many thanks
    Georgia

  • #2
    A negative predictive value is just a proportion. So comparing two NPVs is just comparing two proportions. The main question is whether the two tests are being run on the same specimens/samples/patients/whatever, in which case they are paired proportions, or whether you have independent specimens/samples/patients/whatever for the two tests.

    I recommend doing this using the graphical user interface. Start by clicking on Statistics just above the toolbar in Stata. From the drop-down menu select Power, precision and sample-size analysis (you may have to scroll down for this). In the window that opens, in the left panel, expand "Proportions." If you have paired data, select two paired samples. If you have two independent samples, then select that. In the window that opens, select whichever kind of test you plan to use. A window with various boxes will appear and you should fill in the appropriate parameters for your data. At the bottom of the window, click on Submit.

    Added: I just noticed you said one of the tests is a "gold standard." In that case your "gold standard" result is not considered to have error variance--its results are "truth." So after expanding Proportions, select "one sample" and select the test you want. (You cannot calculate sample size for the binomial test, so don't select that.)
    Last edited by Clyde Schechter; 26 Jun 2023, 16:36.
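
    For readers who want to see the arithmetic behind a one-sample proportion calculation, here is a rough sketch in Python (not Stata). It uses the textbook normal-approximation formula for a two-sided one-sample score (z) test; whether it matches the dialog's output exactly is an assumption, not something verified here. The function name and the values 0.90 (null NPV) and 0.95 (expected NPV) are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_one_proportion(p0, pa, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided one-sample z (score) test
    of H0: p = p0 when the true proportion is pa (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(pa * (1 - pa))
    return ceil((numerator / (pa - p0)) ** 2)


# Hypothetical inputs: null NPV 0.90, expected NPV 0.95.
n_obs = n_one_proportion(p0=0.90, pa=0.95)
print(n_obs)
```

    Here n is the number of observations that contribute to the proportion being tested. A smaller gap between p0 and pa, or a higher target power, increases n.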



    • #3
      Dear Clyde, thank you very much for your quick reply.
      However, I am still a bit confused by the fact that NPV is defined through prevalence of the disease and specificity, and these two do not get to be included in the sample size calculation. Essentially, by using a one-sample score test comparing one proportion to a reference value, the return value will be how many people I need to recruit with a negative screening test result to achieve the pre-specified NPV against a reference value (and for a given significance level and power). That doesn't tell me how many people I need to recruit in total, given prevalence of the disease.
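
      To make the concern concrete: if a calculation returns the number of test-negative subjects needed, one simple way to scale that up to a total is to divide by the expected fraction of subjects who test negative. The sketch below is in Python; the function names and all input values (239 test-negatives, sensitivity 0.85, specificity 0.90, prevalence 0.20) are hypothetical, and the result is only an expected value, since the actual number of negatives in a sample is random.

```python
from math import ceil


def prob_test_negative(sens, spec, prev):
    """Expected fraction of all recruited subjects who test negative:
    true negatives among the non-diseased plus false negatives among the diseased."""
    return spec * (1 - prev) + (1 - sens) * prev


def total_to_recruit(n_negatives, sens, spec, prev):
    """Total subjects needed so that, on average, n_negatives test negative."""
    return ceil(n_negatives / prob_test_negative(sens, spec, prev))


# Hypothetical inputs, for illustration only.
total = total_to_recruit(239, sens=0.85, spec=0.90, prev=0.20)
print(total)
```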

      Many thanks for your help



      • #4
        Georgia,
        As Clyde mentioned, it is not clear whether you are using a case-control design. Is your project a case-control study with subjects drawn from the population?

        If not (e.g., a cross-sectional study), what is wrong with Monte Carlo simulation with a focus on precision (and not on power)?
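
        A minimal sketch of what such a precision-focused simulation could look like for a cross-sectional design, written in Python rather than Stata. Everything here is an assumption for illustration: the function names, the choice of a Wilson interval for the NPV, and the sensitivity, specificity and prevalence values. The idea is to simulate many studies of a given total size and look at the average width of the confidence interval for NPV.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def wilson_interval(successes, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def simulate_npv_ci_width(n_total, sens, spec, prev, reps=300, seed=12345):
    """Average width of the NPV interval over simulated cross-sectional studies."""
    rng = random.Random(seed)
    widths = []
    for _ in range(reps):
        tn = fn = 0
        for _ in range(n_total):
            diseased = rng.random() < prev
            if diseased:
                test_negative = rng.random() >= sens  # negative with prob. 1 - sens
            else:
                test_negative = rng.random() < spec   # negative with prob. spec
            if test_negative:
                if diseased:
                    fn += 1
                else:
                    tn += 1
        if tn + fn == 0:
            continue  # no test-negatives in this replicate, NPV undefined
        lo, hi = wilson_interval(tn, tn + fn)
        widths.append(hi - lo)
    return mean(widths)


# Hypothetical scenario: sensitivity 0.85, specificity 0.90, prevalence 0.20.
for n_total in (200, 400, 800):
    width = simulate_npv_ci_width(n_total, sens=0.85, spec=0.90, prev=0.20)
    print(n_total, round(width, 3))
```

        One would then pick the smallest total sample size whose average interval width is acceptably narrow.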



        • #5
          "However, I am still a bit confused by the fact that NPV is defined through prevalence of the disease and specificity"
          Yes, that is one way of thinking about NPV. But it is simpler to think of it as just True Negatives/(True Negatives + False Negatives).
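
          The two ways of thinking about NPV agree. Writing Se for sensitivity, Sp for specificity and \pi for prevalence (this notation is an assumption, not from the thread), the expected fraction of all subjects who are true negatives is Sp(1-\pi) and the expected fraction who are false negatives is (1-Se)\pi, so:

```latex
\mathrm{NPV}
  = \frac{TN}{TN + FN}
  = \frac{Sp\,(1-\pi)}{Sp\,(1-\pi) + (1-Se)\,\pi}
```

          The denominator is simply the expected fraction of subjects who test negative.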

          "by using a one-sample score test comparing one proportion to a reference value, the return value will be how many people I need to recruit with a negative screening test result to achieve the pre-specified NPV against a reference value (and for a given significance level and power)."
          If you compare your non-gold-standard NPV to a specified goal (null) value using a one-sample power calculation, the result you get will be the total number of people you have to test to get your desired level of power for the desired level of Type 1 error. Now, this assumes that you are sampling from the same population for which the gold standard test was originally validated, or at least a population with the same prevalence. But, really, if you are sampling from a population with a different prevalence, equivalence of the NPV is, if not utterly meaningless, very difficult to interpret, because NPVs are very sensitive to disease prevalence.
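
          To illustrate how strongly NPV depends on prevalence, here is a small Python sketch that holds sensitivity and specificity fixed and varies prevalence. The function name and the values used (sensitivity 0.85, specificity 0.90) are hypothetical.

```python
def npv(sens, spec, prev):
    """NPV as expected true negatives over all expected test-negatives."""
    true_neg = spec * (1 - prev)    # non-diseased who test negative
    false_neg = (1 - sens) * prev   # diseased who test negative
    return true_neg / (true_neg + false_neg)


# Same test characteristics, different populations.
for prev in (0.01, 0.05, 0.20, 0.50):
    print(f"prevalence={prev:.2f}  NPV={npv(0.85, 0.90, prev):.3f}")
```

          With the test's characteristics unchanged, NPV falls as prevalence rises, which is why an NPV target only makes sense for a stated prevalence.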
