Hi all,
I have panel data on individuals across 3 waves, with a binary unemployment indicator and their weight. I want to check whether the relationship between employment status and weight could simply reflect a secular increase in weight. To do this, I create a random binary "treatment" generated from noise, reasoning that if this variable appears to affect weight, the original unemployment effect on weight is probably not trustworthy.
I use the runiform() function in Stata to draw a random number for each person in each year for which they have employment data, because I want to compare these results with those for the people whose employment change I studied previously. If the random number is below 0.5, I set a binary variable to 1; otherwise I set it to 0, giving a fake employed/unemployed indicator.
I use the runiform() function because, as I understand it, it is recursive and therefore replicable.
Code:
. capture drop draw_fathers
. set seed 9000
. display c(seed)
Xfed3371cc43f462544a474abacbdd93d00044448
. display runiform()
.42625766
. gen draw_fathers = cond(runiform() < .5, 1, 0) if X_ADDFAunempusualsitpes_y!=.
(7,241 missing values generated)
. tab draw_fathers

draw_father |
          s |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     13,071       49.96       49.96
          1 |     13,090       50.04      100.00
------------+-----------------------------------
      Total |     26,161      100.00
It's important to me that my results are reproducible, which is why I set the seed.
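As a minimal sketch of what setting the seed does guarantee (a toy dataset, not the real data; the variable names u1 and u2 are made up for illustration), resetting the same seed should reproduce the same sequence of runiform() draws, row by row:

```stata
* Toy check (hypothetical data): the same seed gives the same stream of draws
clear
set obs 5

set seed 9000
gen double u1 = runiform()

* Reset the generator to the same starting state and draw again
set seed 9000
gen double u2 = runiform()

* The two columns should match observation by observation
assert u1 == u2
list u1 u2
```

This only shows that the stream of numbers is the same. Which observation receives which number still depends on the order of the rows at the moment of the draw.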
The first time I ran a regression with this random variable I got the following result:
Code:
. xtreg ba_nogawho i.draw_fathers i.C_region_y i.year i.C_Simplemotherage_y i.C_Simplemothereduca_y i.C_mothermar_y if
> atleast2weightmeasures == 1, cluster (id) fe

Fixed-effects (within) regression               Number of obs      =     24421
Group variable: id                              Number of groups   =      9159

R-sq:  within  = 0.0494                         Obs per group: min =         1
       between = 0.0000                                        avg =       2.7
       overall = 0.0133                                        max =         3

                                                F(12,9158)         =     90.54
corr(u_i, Xb)  = -0.0254                        Prob > F           =    0.0000

                                                       (Std. Err. adjusted for 9,159 clusters in id)
----------------------------------------------------------------------------------------------------
                                   |               Robust
                        ba_nogawho |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------------------+----------------------------------------------------------------
                    1.draw_fathers |   .0104891   .0120381     0.87   0.384    -.0131081    .0340864
                      1.C_region_y |   .0146547   .0318843     0.46   0.646    -.0478457     .077155
                                   |
                              year |
                                 1 |   .1867891   .0131795    14.17   0.000     .1609543    .2126239
                                 2 |  -.1578258   .0163961    -9.63   0.000    -.1899658   -.1256857
                                   |
               C_Simplemotherage_y |
                             30-39 |   .0210079   .0291105     0.72   0.471    -.0360551     .078071
                        40 or more |   .0035692     .04051     0.09   0.930    -.0758395    .0829779
                                   |
             C_Simplemothereduca_y |
 Leaving Certificate to Non Degree |   .1249528   .0631912     1.98   0.048      .001084    .2488215
         Primary Degree or greater |   .1234926   .0711105     1.74   0.082    -.0158998    .2628851
                                   |
                     C_mothermar_y |
                                 2 |   .0916308   .1511258     0.61   0.544    -.2046096    .3878711
                                 3 |   .0401262   .1206813     0.33   0.740     -.196436    .2766883
                                 4 |  -.0247547   .0388708    -0.64   0.524    -.1009501    .0514408
                                 5 |   .3145843   .3789889     0.83   0.407    -.4283185    1.057487
                                   |
                             _cons |   .5884448   .0659928     8.92   0.000     .4590842    .7178054
-----------------------------------+----------------------------------------------------------------
                           sigma_u |  .88704999
                           sigma_e |   .7514361
                               rho |  .58220466   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------------
When I repeat the above within the same Stata session, my results are the same. However, if I close Stata (i.e. clear everything and start again), I get completely different results the next time I run the regression, as shown below, even though I use the same .do file to generate everything, including my random number.
Can anyone help me to understand why this is, and how to get the same results every time?
This is what I got the next time, after closing Stata, opening it again and re-running the .do file:
Code:
. capture drop draw_fathers
. set seed 9000
. display c(seed)
Xfed3371cc43f462544a474abacbdd93d00044448
. display runiform()
.42625766
. gen draw_fathers = cond(runiform() < .5, 1, 0) if X_ADDFAunempusualsitpes_y!=.
(7,241 missing values generated)
. tab draw_fathers

draw_father |
          s |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     13,071       49.96       49.96
          1 |     13,090       50.04      100.00
------------+-----------------------------------
      Total |     26,161      100.00

. xtreg ba_nogawho i.draw_fathers i.C_region_y i.year i.C_Simplemotherage_y i.C_Simplemothereduca_y i.C_mothermar_y if
> atleast2weightmeasures == 1, cluster (id) fe

Fixed-effects (within) regression               Number of obs      =     24421
Group variable: id                              Number of groups   =      9159

R-sq:  within  = 0.0495                         Obs per group: min =         1
       between = 0.0000                                        avg =       2.7
       overall = 0.0134                                        max =         3

                                                F(12,9158)         =     90.88
corr(u_i, Xb)  = -0.0249                        Prob > F           =    0.0000

                                                       (Std. Err. adjusted for 9,159 clusters in id)
----------------------------------------------------------------------------------------------------
                                   |               Robust
                        ba_nogawho |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------------------+----------------------------------------------------------------
                    1.draw_fathers |  -.0165708   .0121349    -1.37   0.172     -.040358    .0072164
                      1.C_region_y |   .0142196   .0318937     0.45   0.656    -.0482993    .0767384
                                   |
                              year |
                                 1 |   .1869799   .0131823    14.18   0.000     .1611396    .2128202
                                 2 |  -.1576787   .0163978    -9.62   0.000    -.1898221   -.1255354
                                   |
               C_Simplemotherage_y |
                             30-39 |   .0210425   .0291073     0.72   0.470    -.0360143    .0780993
                        40 or more |   .0035197   .0405133     0.09   0.931    -.0758953    .0829347
                                   |
             C_Simplemothereduca_y |
 Leaving Certificate to Non Degree |   .1251361   .0632171     1.98   0.048     .0012165    .2490558
         Primary Degree or greater |   .1247482   .0711284     1.75   0.079    -.0146793    .2641756
                                   |
                     C_mothermar_y |
                                 2 |   .0911564   .1514408     0.60   0.547    -.2057013    .3880141
                                 3 |   .0367664   .1207045     0.30   0.761    -.1998413    .2733741
                                 4 |  -.0247201    .038866    -0.64   0.525    -.1009061    .0514659
                                 5 |   .3015891   .3819644     0.79   0.430    -.4471464    1.050325
                                   |
                             _cons |   .6014585   .0662502     9.08   0.000     .4715932    .7313237
-----------------------------------+----------------------------------------------------------------
                           sigma_u |  .88697445
                           sigma_e |  .75140868
                               rho |  .58218098   (fraction of variance due to u_i)
----------------------------------------------------------------------------------------------------
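One thing that may be worth checking, offered as an assumption rather than a diagnosis: in both runs the tabulation of draw_fathers is identical (13,071 zeros and 13,090 ones), which would be consistent with the same random numbers being attached to different observations, for example if the row order of the data differs between sessions. The sketch below uses a toy panel (hypothetical data; id and year are assumed to uniquely identify rows, as the xtreg call suggests, and the u_* variable names are made up) to show that the same seed gives different per-observation values when the row order changes, and the same values again once the data are sorted on a unique key before drawing:

```stata
* Toy panel (hypothetical, not the real data): 4 people x 3 waves
clear
set obs 12
gen id   = ceil(_n/3)
gen year = mod(_n - 1, 3)

* Draw 1: rows sorted on the unique key id-year
sort id year
set seed 9000
gen double u_sorted = runiform()

* Draw 2: same seed, but rows in a different order at the time of the draw
gsort -id -year
set seed 9000
gen double u_reordered = runiform()

* Same stream of numbers, attached to different observations
sort id year
list id year u_sorted u_reordered, sepby(id)

* Draw 3: confirm the key is unique, restore the order, then draw again
isid id year
sort id year
set seed 9000
gen double u_again = runiform()

* With a unique, deterministic order the draws line up with Draw 1
assert u_again == u_sorted
```

In the real .do file this would amount to running isid id year and sort id year immediately before set seed 9000, assuming those two variables uniquely identify observations. If an earlier step sorts on a variable with ties, the order within ties is not guaranteed to repeat from one session to the next unless the stable option of sort is used.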
Thanks for any help,
John