Why can't I reproduce results in my fixed effects logit using runiform() when setting the seed?

John Adler

Join Date: Apr 2017
Posts: 173

09 Apr 2020, 13:27

Hi all,

I have panel data on individuals, with a binary unemployment indicator and their weight, across 3 waves. I want to determine whether the relationship between unemployment and weight could simply reflect a secular increase in weight. To check this, I create a binary placebo "treatment" from random noise, reasoning that if I see an effect of this variable on weight, the original unemployment effect on weight is probably not trustworthy.

I use the runiform() function in Stata to produce a random number for each person in each year, provided they have employment data, as I want to compare these results to the results for the people whose employment change I studied previously. If this random number is below 0.5, I set a binary variable to 1; otherwise I set it to 0, like a fake employed/unemployed indicator.

I use the runiform() function because, as I understand it, its sequence is fully determined by the seed, and thus replicable.

Code:

. capture drop draw_fathers

. set seed 9000

. display c(seed)
Xfed3371cc43f462544a474abacbdd93d00044448

. display runiform()
.42625766
 
. gen draw_fathers = cond(runiform() < .5, 1, 0) if X_ADDFAunempusualsitpes_y!=.
(7,241 missing values generated)

. tab draw_fathers

draw_father |
          s |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     13,071       49.96       49.96
          1 |     13,090       50.04      100.00
------------+-----------------------------------
      Total |     26,161      100.00

It's important to me that my results are reproducible, which is why I set the seed.

The first time I ran a regression with this random variable I got the following result:

Code:


. xtreg ba_nogawho  i.draw_fathers i.C_region_y i.year i.C_Simplemotherage_y i.C_Simplemothereduca_y i.C_mothermar_y   if
>  atleast2weightmeasures == 1, cluster (id) fe 

Fixed-effects (within) regression               Number of obs      =     24421
Group variable: id                              Number of groups   =      9159

R-sq:  within  = 0.0494                         Obs per group: min =         1
       between = 0.0000                                        avg =       2.7
       overall = 0.0133                                        max =         3

                                                F(12,9158)         =     90.54
corr(u_i, Xb)  = -0.0254                        Prob > F           =    0.0000

                                                       (Std. Err. adjusted for 9,159 clusters in id)
----------------------------------------------------------------------------------------------------
                                   |               Robust
                        ba_nogawho |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------------------+----------------------------------------------------------------
                    1.draw_fathers |   .0104891   .0120381     0.87   0.384    -.0131081    .0340864
                      1.C_region_y |   .0146547   .0318843     0.46   0.646    -.0478457     .077155
                                   |
                              year |
                                1  |   .1867891   .0131795    14.17   0.000     .1609543    .2126239
                                2  |  -.1578258   .0163961    -9.63   0.000    -.1899658   -.1256857
                                   |
               C_Simplemotherage_y |
                            30-39  |   .0210079   .0291105     0.72   0.471    -.0360551     .078071
                       40 or more  |   .0035692     .04051     0.09   0.930    -.0758395    .0829779
                                   |
             C_Simplemothereduca_y |
Leaving Certificate to Non Degree  |   .1249528   .0631912     1.98   0.048      .001084    .2488215
        Primary Degree or greater  |   .1234926   .0711105     1.74   0.082    -.0158998    .2628851
                                   |
                     C_mothermar_y |
                                2  |   .0916308   .1511258     0.61   0.544    -.2046096    .3878711
                                3  |   .0401262   .1206813     0.33   0.740     -.196436    .2766883
                                4  |  -.0247547   .0388708    -0.64   0.524    -.1009501    .0514408
                                5  |   .3145843   .3789889     0.83   0.407    -.4283185    1.057487
                                   |
                             _cons |   .5884448   .0659928     8.92   0.000     .4590842    .7178054
-----------------------------------+----------------------------------------------------------------
                           sigma_u |  .88704999
                           sigma_e |   .7514361
                               rho |  .58220466   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------------

.

When I repeat the above within the same Stata session, my results are the same. However, if I close Stata and start again, I get completely different results the next time I run the regression, as shown below, even though I use the same .do file to generate everything, including my random number.

Can anyone help me to understand why this is, and how to get the same results every time?

The next time I did this re-running the .do file after closing Stata and opening it again:

Code:


capture drop draw_fathers

set seed 9000

display c(seed)
Xfed3371cc43f462544a474abacbdd93d00044448

display runiform()
.42625766

gen draw_fathers = cond(runiform() < .5, 1, 0) if X_ADDFAunempusualsitpes_y!=.
(7,241 missing values generated)

tab draw_fathers

draw_father |
          s |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     13,071       49.96       49.96
          1 |     13,090       50.04      100.00
------------+-----------------------------------
      Total |     26,161      100.00





. xtreg ba_nogawho  i.draw_fathers i.C_region_y i.year i.C_Simplemotherage_y i.C_Simplemothereduca_y i.C_mothermar_y   if
>  atleast2weightmeasures == 1, cluster (id) fe 

Fixed-effects (within) regression               Number of obs      =     24421
Group variable: id                              Number of groups   =      9159

R-sq:  within  = 0.0495                         Obs per group: min =         1
       between = 0.0000                                        avg =       2.7
       overall = 0.0134                                        max =         3

                                                F(12,9158)         =     90.88
corr(u_i, Xb)  = -0.0249                        Prob > F           =    0.0000

                                                       (Std. Err. adjusted for 9,159 clusters in id)
----------------------------------------------------------------------------------------------------
                                   |               Robust
                        ba_nogawho |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------------------+----------------------------------------------------------------
                    1.draw_fathers |  -.0165708   .0121349    -1.37   0.172     -.040358    .0072164
                      1.C_region_y |   .0142196   .0318937     0.45   0.656    -.0482993    .0767384
                                   |
                              year |
                                1  |   .1869799   .0131823    14.18   0.000     .1611396    .2128202
                                2  |  -.1576787   .0163978    -9.62   0.000    -.1898221   -.1255354
                                   |
               C_Simplemotherage_y |
                            30-39  |   .0210425   .0291073     0.72   0.470    -.0360143    .0780993
                       40 or more  |   .0035197   .0405133     0.09   0.931    -.0758953    .0829347
                                   |
             C_Simplemothereduca_y |
Leaving Certificate to Non Degree  |   .1251361   .0632171     1.98   0.048     .0012165    .2490558
        Primary Degree or greater  |   .1247482   .0711284     1.75   0.079    -.0146793    .2641756
                                   |
                     C_mothermar_y |
                                2  |   .0911564   .1514408     0.60   0.547    -.2057013    .3880141
                                3  |   .0367664   .1207045     0.30   0.761    -.1998413    .2733741
                                4  |  -.0247201    .038866    -0.64   0.525    -.1009061    .0514659
                                5  |   .3015891   .3819644     0.79   0.430    -.4471464    1.050325
                                   |
                             _cons |   .6014585   .0662502     9.08   0.000     .4715932    .7313237
-----------------------------------+----------------------------------------------------------------
                           sigma_u |  .88697445
                           sigma_e |  .75140868
                               rho |  .58218098   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------------

.

Thanks for any help,

John

Tags: fixed effects, panel data, regression, syntax

William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

09 Apr 2020, 19:44

Previously posted, and subsequently responded to, at

https://www.statalist.org/forums/for...tting-the-seed

and subsequently posted, and responded to, at

https://www.statalist.org/forums/for...tting-the-seed

Last edited by William Lisowski; 09 Apr 2020, 19:47.

Joerg Luedicke (StataCorp)

StataCorp Employee

Join Date: Apr 2014
Posts: 113

09 Apr 2020, 20:13

Hi John,

I am unable to reproduce the behavior you describe. For example, if I run the following I get the same results every time in a new Stata instance:

Code:

. webuse nlswork
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtset idcode
       panel variable:  idcode (unbalanced)

. set seed 123

. gen W = runiform() > 0.5

. xtreg ln_w i.W grade age c.age#c.age ttl_exp, fe vce(cluster idcode)
note: grade omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =     28,508
Group variable: idcode                          Number of groups  =      4,708

R-sq:                                           Obs per group:
     within  = 0.1509                                         min =          1
     between = 0.2599                                         avg =        6.1
     overall = 0.1903                                         max =         15

                                                F(4,4707)         =     414.31
corr(u_i, Xb)  = 0.1230                         Prob > F          =     0.0000

                             (Std. Err. adjusted for 4,708 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.W |   .0076297   .0038219     2.00   0.046      .000137    .0151223
       grade |          0  (omitted)
         age |    .045451   .0041274    11.01   0.000     .0373594    .0535426
             |
 c.age#c.age |  -.0008983   .0000713   -12.59   0.000    -.0010382   -.0007584
             |
     ttl_exp |   .0430538   .0018481    23.30   0.000     .0394308    .0466769
       _cons |   .8816534   .0600092    14.69   0.000     .7640073    .9992996
-------------+----------------------------------------------------------------
     sigma_u |   .3726793
     sigma_e |  .29522158
         rho |  .61443283   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Perhaps try to post a reproducible example that others can run and that includes the full sequence of commands, from loading the dataset to fitting the model. With that said, your approach of using a binary variable based on random noise strikes me as rather nonsensical. Since this variable is completely random, its "true" effect is zero. If you use the infamous p<0.05 to decide whether this variable has an effect, then you will find an effect about 5% of the time if you repeat this exercise with a different random draw each time. In other words, all you do to decide whether your unemployment effect on weight is trustworthy is flip a heavily biased coin, such that the chance of you finding the unemployment effect trustworthy is 95%.
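The 5% point can be checked with a small simulation. The following is only a sketch: it reuses the nlswork example above (without the collinear grade term), and the number of repetitions (200) and the local macro names are arbitrary choices for illustration. With a purely random dummy, the share of draws with p < 0.05 should come out near 0.05, though it will vary with the seed and the number of repetitions.

```stata
* Sketch: how often does a pure-noise dummy reach p < 0.05?
webuse nlswork, clear
xtset idcode

set seed 123
local reps 200      // number of placebo draws (arbitrary)
local hits 0        // counts draws with p < 0.05

forvalues r = 1/`reps' {
    capture drop W
    * New random dummy on each pass; the seed is set once, outside the loop
    quietly gen byte W = runiform() > 0.5
    quietly xtreg ln_wage i.W age c.age#c.age ttl_exp, fe vce(cluster idcode)
    * Test of the placebo coefficient; the p-value is returned in r(p)
    quietly test 1.W
    if r(p) < 0.05 local ++hits
}

display "Share of draws with p < 0.05: " %5.3f `hits'/`reps'
```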

Best,
Joerg


John Adler

Join Date: Apr 2017

Posts: 173
#4

10 Apr 2020, 11:48

Dear all,

Thank you for your feedback. I apologise for posting several times, which was not my intent; I did notice some odd behaviour when I hit post (Statalist told me there was an "empty response").

For simplicity's sake I will respond in this thread, and quote the excellent feedback I received at:

https://www.statalist.org/forums/for...tting-the-seed

https://www.statalist.org/forums/for...tting-the-seed

Thank you @Maarten Buis and @William Lisowski for your advice on sortseed, which answers the query I had.
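For readers who have not seen the other threads, the likely mechanism is this: runiform() hands out the same sequence of numbers for a given seed, but it assigns them to observations in their current order. If an earlier sort on a non-unique key breaks ties differently from one session to the next, the same random numbers land on different observations, which fits the identical tabulations but different regression results above. The following is a minimal sketch of one way to guard against this, shown on the nlswork data so that it runs anywhere; for my data the unique key would presumably be id and year (an assumption), and the variable name draw is just for illustration.

```stata
* Sketch: make the observation order reproducible before drawing
webuse nlswork, clear

* Confirm that idcode and year uniquely identify observations and sort
* on them, so the order does not depend on how ties were broken earlier
isid idcode year, sort

set seed 9000
gen byte draw = runiform() < 0.5

* Alternatives when sorting on a non-unique key cannot be avoided:
*   set sortseed 12345     // fixes how sort breaks ties (seed value arbitrary)
*   sort idcode, stable    // keeps the existing order within ties
```

Sorting on a unique key is the most transparent of the three options, since the result then does not depend on any sort-related setting.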

Joerg Luedicke, my data cover a period of secular weight increases in the western world. I felt that if I could repeat my main fixed effects logit analysis with a totally random noise dummy variable instead of the employment dummy, it could alleviate some concern a reader might have that the individuals who experienced job loss and a change in weight would have experienced this weight change regardless of job loss.

But I take on board your feedback that my approach using a binary variable based on random noise may be misguided. Could you suggest a better approach?

Kindest regards,

John
