  • Confidence intervals for ICC in variance components models

    I am estimating the ICC for an outcome using several tens of thousands of observations nested within ~50 groups. The outcome of interest is binary, with a prevalence of ~5%. I am using:
    Code:
    melogit y || group:
    estat icc
    From the model, the variance associated with the level-2 unit (-var(_cons)-) is low (<.001), with a 95% lower bound < 0.00001 and an upper bound of 4.61e+11. This does not make sense, but the underlying data included in the model have been double- and triple-checked. Furthermore, the point estimate and SE of the ICC are very low, with a CI ranging from <.00001 to 1. I wonder if anyone can offer an explanation or suggest an alternative strategy. I have tried the -normal- option:
    Code:
    melogit y || group:
    estat icc, normal
    The CI becomes much narrower, but the lower bound is slightly negative, and imposing the -normal- option strikes me as inappropriate because the outcome is binary.
    Last edited by Tarjei W. Havneraas; 07 Oct 2021, 22:54.
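
    For context, a sketch of where the two kinds of interval come from, assuming the usual latent-response formulation of a random-intercept logit model. The symbols \sigma_u^2 (random-intercept variance), \hat\rho (estimated ICC) and z_{0.975} are notation introduced here, not taken from the posts. The default interval appears to be built on the logit scale and back-transformed, which keeps it inside (0, 1) but lets it stretch toward 0 and 1 when the standard error on that scale is very large. A normal-based interval is symmetric around the estimate and can therefore cross zero.

    $$
    \rho = \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3},
    \qquad
    \text{default: } \operatorname{logit}(\hat\rho) \pm z_{0.975}\,\widehat{\mathrm{SE}}\{\operatorname{logit}(\hat\rho)\},
    \qquad
    \text{normal: } \hat\rho \pm z_{0.975}\,\widehat{\mathrm{SE}}(\hat\rho)
    $$

    The simulated intervals reported in reply #2 are consistent with this reading: the normal-based bounds are symmetric about the point estimate, while the default bounds are not.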

  • #2
    It looks like your variance component is essentially zero, and I guess that within the precision limits of the computations used you can occasionally get pathological results. See the last successful replication below (row r5 of the first matrix) for an example. (I used 50 for "~50 groups" and 30000 for "several 10,000s".)

    . version 17.0

    . clear *

    . set seed `=strreverse("1630875")'

    . quietly set obs 50

    . generate byte grp = _n

    . quietly expand `=floor(30000 / 50)'

    . quietly generate byte out = .

    . tempname BCIs NCIs

    . forvalues rep = 1/10 {
      2.         quietly replace out = rbinomial(1, 0.5)
      3.         quietly melogit out || grp: , iterate(15)
      4.         if e(converged) {
      5.                 quietly estat icc
      6.                 matrix define `BCIs' = ( nullmat(`BCIs') \ (r(icc2), r(ci2)) )
      7.                 quietly estat icc, normal
      8.                 matrix define `NCIs' = ( nullmat(`NCIs') \ (r(icc2), r(ci2)) )
      9.         }
     10. }

    . matrix list `BCIs'

    __000000[5,3]
               c1         ll         ul
    r1  .00055524   .0000896   .0034325
    r2  .00065263  .00013039  .00325972
    r3  .00022571  4.501e-06  .01119766
    r4  .00010495  3.639e-08   .2324173
    r5  .00002071  3.021e-22          1

    . matrix list `NCIs'

    __000001[5,3]
                c1          ll          ul
    r1   .00055524  -.00045725   .00156772
    r2   .00065263  -.00039807   .00170333
    r3   .00022571   -.0006578   .00110922
    r4   .00010495  -.00073115   .00094106
    r5   .00002071  -.00078225   .00082368

    . exit

    end of do-file


    I didn't see the -normal- option in the help file or the user's manual entry for the command. Nevertheless, I think the issue you're facing with it is not that the outcome in the fitted model is binary, but rather that the intraclass correlation coefficient is supposed to be bounded by zero and one, and with the -normal- option you can get negative values for the lower confidence bound (see above). You'd probably want to truncate the lower bound at zero if you decide to go that route. If it were me, I'd just say that the ICC is essentially zero and leave it at that (either omit the confidence bounds or note that their computation is degenerate).
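
    A minimal sketch of the truncation mentioned above, assuming the original poster's variable names (y, group). The scalar and matrix names (icc_hat, ci, lb, ub) are made up for illustration, and the upper bound is clamped at one as well for symmetry, which is an addition to what the post suggests. It relies on the same r(icc2) and r(ci2) results used in the simulation, and on the undocumented -normal- option, so treat it as illustrative rather than a recommended procedure.

    ```stata
    * Sketch only: y and group are the original poster's variables
    quietly melogit y || group:
    quietly estat icc, normal            // -normal- is undocumented

    scalar icc_hat = r(icc2)             // level-2 ICC point estimate
    matrix ci = r(ci2)                   // 1 x 2 row vector: lower, upper

    * Clamp the normal-based bounds to the admissible range [0, 1]
    scalar lb = max(0, ci[1,1])
    scalar ub = min(1, ci[1,2])

    display "ICC = " %9.6f icc_hat "   truncated 95% CI: " %9.6f lb " to " %9.6f ub
    ```

    Truncation changes only how the interval is reported, not how it is estimated, so its coverage properties near the boundary remain unclear.
    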



    • #3
      Thanks for your response, Joseph. Reassuring to see that this may be within expected values. I agree that it may be best to just report the ICC as essentially zero. The -normal- option is undocumented and intended for internal testing at StataCorp according to this post: https://www.stata.com/statalist/arch.../msg01195.html.



      • #4
        As another option, you could also bootstrap the ICC to get CIs.
        Best wishes

        (Stata 16.1 MP)
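
        One way the bootstrap suggestion could be sketched, assuming the original poster's variable names (y, group) and a cluster bootstrap that resamples whole groups. The wrapper program icc_boot, the variable bsgroup, the seed, and the choice of 200 replications are all hypothetical. Replicates that do not converge return a missing value, which -bootstrap- counts as failed.

        ```stata
        * Hypothetical wrapper: fit the model and return the level-2 ICC
        capture program drop icc_boot
        program define icc_boot, rclass
            version 16.1
            quietly melogit y || bsgroup: , iterate(15)
            if e(converged) {
                quietly estat icc
                return scalar icc = r(icc2)
            }
            else {
                return scalar icc = .        // treated as a failed replicate
            }
        end

        * bsgroup holds the resampled-cluster identifier in each replicate
        generate long bsgroup = group

        * Resample whole groups; idcluster() relabels repeated groups uniquely
        bootstrap icc = r(icc), reps(200) seed(12345) ///
            cluster(group) idcluster(bsgroup): icc_boot

        * Percentile intervals respect the [0, 1] range of the ICC
        estat bootstrap, percentile
        ```

        With only ~50 groups and a variance component near zero, many replicates may fail or sit on the boundary, so the resulting interval should be read with caution.
        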



        • #5
          Agreed, in principle, but I think that the problem with bootstrapping in this particular case (ICC basically zero) is that too many of Tarjei's bootstrap samples would fail to converge for the effort to be worthwhile. See my results above: convergence was attained within 15 iterations, which ought to have been adequate for a reasonably well-behaved likelihood, only about 50% of the time.

          I think that the best bet when you have an infinitesimally small ICC (variance component) is to just say so and skip the fancy stuff.



          • #6
            "I think that the best bet when you have an infinitesimally small ICC (variance component) is to just say so and skip the fancy stuff."

            Agree, and would add that in those circumstances it is often most sensible to remove that level from the model and re-run without it.
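
            A rough sketch of what dropping the level could look like, assuming the original poster's variable names (y, group). The comparison-test results e(chi2_c) and e(p_c) are, to my understanding, stored by -melogit-; the same LR test against ordinary logistic regression is printed at the foot of the -melogit- output.

            ```stata
            * Two-level model: random intercept for group
            melogit y || group:

            * LR test of the random-intercept model against ordinary logistic regression
            display "LR test vs. logistic model: chi2 = " %8.3f e(chi2_c) ///
                ",  p = " %6.4f e(p_c)

            * Single-level model with the group level removed
            logit y
            ```

            If the test gives no indication of between-group variance, the single-level fit is the simpler model to report.
            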
