
  • Bootstrap collinearity problem (red x)

    I ran a bootstrap for an xtmixed model, but got red x's and the message "collinearity in replicate sample is not the same as the full sample, posting missing values".

    bootstrap, reps(1000) seed(12) : xtmixed y x1 x2 x3 i.a i.b || d:

    I found that some of the dummy variables for b have low frequencies (see below). When I remove i.b (the dummies below), the bootstrap works. However, I need to control for b on theoretical grounds. How can I make the bootstrap work while still accounting for the effects of b? I would greatly appreciate any suggestions. Thanks!

             or |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             10 |          4        0.67        0.67
             13 |         18        3.02        3.69
             14 |          3        0.50        4.19
             15 |          3        0.50        4.69
             20 |          5        0.84        5.53
             21 |          2        0.34        5.86
             23 |          2        0.34        6.20
             24 |          1        0.17        6.37
             25 |          2        0.34        6.70
             26 |          7        1.17        7.87
             27 |          4        0.67        8.54
             28 |         69       11.56       20.10
             29 |          2        0.34       20.44
             30 |          1        0.17       20.60
             31 |          1        0.17       20.77
             34 |          1        0.17       20.94
             35 |         33        5.53       26.47
             36 |         55        9.21       35.68
             37 |          9        1.51       37.19
             38 |         42        7.04       44.22
             39 |          1        0.17       44.39
             42 |          3        0.50       44.89
             45 |          2        0.34       45.23
             46 |          1        0.17       45.39
             47 |          2        0.34       45.73
             48 |         16        2.68       48.41
             49 |          8        1.34       49.75
             50 |          5        0.84       50.59
             51 |          6        1.01       51.59
             52 |          1        0.17       51.76
             53 |          3        0.50       52.26
             54 |          5        0.84       53.10
             55 |          1        0.17       53.27
             56 |          1        0.17       53.43
             57 |          1        0.17       53.60
             58 |          2        0.34       53.94
             59 |         10        1.68       55.61
             60 |        126       21.11       76.72
             61 |          2        0.34       77.05
             62 |          8        1.34       78.39
             63 |          8        1.34       79.73
             65 |          3        0.50       80.23
             67 |         10        1.68       81.91
             70 |          1        0.17       82.08
             72 |          2        0.34       82.41
             73 |         81       13.57       95.98
             75 |          1        0.17       96.15
             78 |          1        0.17       96.31
             79 |          2        0.34       96.65
             80 |          9        1.51       98.16
             87 |          9        1.51       99.66
             99 |          2        0.34      100.00

  • #2
    What version of Stata are you using?
    xtmixed has been replaced by mixed.
    Perhaps try
    Code:
     bootstrap, reps(1000) seed(12) : mixed y x1 x2 x3 i.a i.b || d:,
    If you are working in the multilevel world, you probably want to bootstrap at the cluster level, as shown below:

    Code:
     bootstrap, reps(1000) seed(12) cluster(d) idcluster(i_d) : mixed y x1 x2 x3 i.a i.b || i_d:,
    Some red x's are OK. It does happen that models don't converge when resampled, but it is difficult to say what is happening based on the information you give.
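
    To judge how many replications were affected, one option is to look at the results that bootstrap stores when it finishes. The following is only a sketch: it uses Stata's shipped auto dataset and regress as stand-ins (both are assumptions for illustration, not the poster's data or mixed model). rep78 has a couple of rare levels, so some resamples may miss a level and be flagged in a similar way.

    ```stata
    * Stand-in example: the auto data and regress are assumptions for
    * illustration, not the poster's data or model.
    sysuse auto, clear
    bootstrap, reps(50) seed(12) : regress mpg weight i.rep78

    * Replications that completed vs. those posted as missing (the red x's)
    display "complete replications:   " e(N_reps)
    display "incomplete replications: " e(N_misreps)
    ```

    A small incomplete count relative to reps() is the situation described above as acceptable.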



    • #3
      If the advice in #2 does not solve your problem, and removing i.b from the model is theoretically inadmissible, you could also try combining several of the low-frequency levels of b into a single "Other" category. This should only be done, however, if in the context of your problem the meanings of those categories are such that you would not be creating an absurdly heterogeneous and self-contradictory level.
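
      The pooling could be sketched roughly as follows. The toy data, the cutoff of 10 observations, the code 999, and the names n_b, b_pooled, and b_pooled_lbl are all assumptions made for illustration. Which levels can sensibly share a category is a substantive decision that a frequency cutoff alone does not settle.

      ```stata
      * Toy data standing in for the real dataset (hypothetical values).
      * Note: clear drops whatever data are in memory.
      clear
      set seed 12
      set obs 600
      generate b = ceil(20 * runiform()^3)   // skewed, so some levels are rare

      * Count the observations in each level of b
      bysort b: generate long n_b = _N

      * Copy b, then move every level with fewer than 10 observations
      * (an assumed cutoff) into a single pooled level coded 999
      generate int b_pooled = b
      replace b_pooled = 999 if n_b < 10
      label define b_pooled_lbl 999 "Other"
      label values b_pooled b_pooled_lbl

      * Check the new frequencies
      tabulate b_pooled
      ```

      With real data, the first block would be skipped and the model would use i.b_pooled in place of i.b.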

      And if that approach is not feasible or does not solve your problem, then you need to get more data that includes more of the rarer values of b.

      Finally, let me just emphasize what Jean-Michel Galarneau said: a small number of red x's is not a problem.
