Bootstrap collinearity problem (red x)

Summer Cao

Join Date: Jun 2017

Posts: 1
#1

25 Oct 2024, 09:49

I ran a bootstrap for an xtmixed model, but got red x's and the message "collinearity in replicate sample is not the same as the full sample, posting missing values".

bootstrap, reps(1000) seed(12) : xtmixed y x1 x2 x3 i.a i.b || d:

I found that some levels of b have low frequencies (see below). When I remove i.b (the dummies below), the bootstrap works. However, I need to control for b on theoretical grounds. How can I make the bootstrap work while still accounting for the effects of b? I would greatly appreciate any suggestions. Thanks!

or | Freq. Percent Cum.
------------+-----------------------------------
10 | 4 0.67 0.67
13 | 18 3.02 3.69
14 | 3 0.50 4.19
15 | 3 0.50 4.69
20 | 5 0.84 5.53
21 | 2 0.34 5.86
23 | 2 0.34 6.20
24 | 1 0.17 6.37
25 | 2 0.34 6.70
26 | 7 1.17 7.87
27 | 4 0.67 8.54
28 | 69 11.56 20.10
29 | 2 0.34 20.44
30 | 1 0.17 20.60
31 | 1 0.17 20.77
34 | 1 0.17 20.94
35 | 33 5.53 26.47
36 | 55 9.21 35.68
37 | 9 1.51 37.19
38 | 42 7.04 44.22
39 | 1 0.17 44.39
42 | 3 0.50 44.89
45 | 2 0.34 45.23
46 | 1 0.17 45.39
47 | 2 0.34 45.73
48 | 16 2.68 48.41
49 | 8 1.34 49.75
50 | 5 0.84 50.59
51 | 6 1.01 51.59
52 | 1 0.17 51.76
53 | 3 0.50 52.26
54 | 5 0.84 53.10
55 | 1 0.17 53.27
56 | 1 0.17 53.43
57 | 1 0.17 53.60
58 | 2 0.34 53.94
59 | 10 1.68 55.61
60 | 126 21.11 76.72
61 | 2 0.34 77.05
62 | 8 1.34 78.39
63 | 8 1.34 79.73
65 | 3 0.50 80.23
67 | 10 1.68 81.91
70 | 1 0.17 82.08
72 | 2 0.34 82.41
73 | 81 13.57 95.98
75 | 1 0.17 96.15
78 | 1 0.17 96.31
79 | 2 0.34 96.65
80 | 9 1.51 98.16
87 | 9 1.51 99.66
99 | 2 0.34 100.00
Tags: bootstrap
Jean-Michel Galarneau

Join Date: Aug 2018

Posts: 39
#2

25 Oct 2024, 10:24

What version of Stata are you using?
xtmixed has been replaced by mixed.
Perhaps try:

Code:

bootstrap, reps(1000) seed(12) : mixed y x1 x2 x3 i.a i.b || d:,

If you are working with multilevel data, you probably want to bootstrap at the cluster level, as shown below:

Code:

bootstrap, reps(1000) seed(12) cluster(d) idcluster(i_d) : mixed y x1 x2 x3 i.a i.b || i_d:,

Some red x's are OK: models sometimes fail to converge when resampled. But it is difficult to say what is happening based on the information you give.
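
To judge whether the number of red x's is small, one option is to look at the counts that bootstrap stores after it finishes. The sketch below assumes the cluster-level call above, is meant to be run from a do-file (because of the /// line continuation), and uses bs_reps as a hypothetical file name for the saved replicate estimates.

Code:

* Sketch only: bs_reps is a made-up file name for the replicate estimates
bootstrap, reps(1000) seed(12) cluster(d) idcluster(i_d) saving(bs_reps, replace): ///
    mixed y x1 x2 x3 i.a i.b || i_d:

* e(N_reps) and e(N_misreps) are stored by bootstrap
display "complete replications:   " e(N_reps)
display "incomplete replications: " e(N_misreps)

If the incomplete count is a large share of reps(), the bootstrap results rest on far fewer replicates than requested and probably should not be trusted as they stand.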
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#3

25 Oct 2024, 10:59

If the advice in #2 does not solve your problem, and removing i.b from the model is theoretically inadmissible, you could also try combining several of the low-frequency levels of b into a single "Other" category. This should only be done, however, if, in the context of your problem, the meanings of those categories are such that you would not be creating an absurdly heterogeneous and self-contradictory level.
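
A minimal sketch of the pooling idea follows. It rests on several assumptions that are not in the thread: b is numeric, the value 9999 is not already used as a code of b, and the cutoff of 5 observations is purely illustrative. The names n_b, b_pooled, and b_pooled_lbl are hypothetical. A frequency cutoff is only a mechanical stand-in; which levels can sensibly be merged is a substantive decision.

Code:

* Count observations per level of b (n_b is a hypothetical helper variable)
bysort b: generate long n_b = _N if !missing(b)

* Pool levels with fewer than 5 observations into one arbitrary code, 9999
generate long b_pooled = cond(n_b < 5, 9999, b) if !missing(b)
label define b_pooled_lbl 9999 "Other"
label values b_pooled b_pooled_lbl

* Check the new frequencies before re-running the bootstrap
tabulate b_pooled

bootstrap, reps(1000) seed(12) cluster(d) idcluster(i_d): mixed y x1 x2 x3 i.a i.b_pooled || i_d:

Rare levels that remain after pooling can still drop out of some replicates, so a few red x's may persist.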

And if that approach is not feasible or does not solve your problem, then you need to get more data that includes more of the rarer values of b.

Finally, let me just emphasize what Jean-Michel Galarneau said: a small number of red x's is not a problem.
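
Whether the number of red x's will be small here can be gauged with a rough back-of-envelope calculation. It assumes observation-level resampling as in the command in #1, and that the table in #1 describes the estimation sample: the Freq. column sums to 597, and 13 levels have a single observation. A level that is absent from a replicate has its indicator dropped there, which is the kind of change in collinearity the message reports. The second line treats the 13 levels as independent, which is only an approximation.

Code:

* Chance that one given singleton level is missing from a replicate of size 597
display %6.4f (1 - 1/597)^597

* Rough chance that at least one of the 13 singleton levels is missing
* (independence across levels is an approximation)
display %6.4f 1 - (1 - (1 - 1/597)^597)^13

The first figure is roughly 0.37 and the second is close to 1, which suggests that with the levels of b as tabulated, most observation-level replicates would be affected rather than a small number. Under cluster-level resampling the figures depend on how the rare levels are spread across clusters, so this calculation does not carry over directly.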