Fixed Effects With Cross Sectional Data - Inconsistent Standard Errors

Owen Wallbanks

Join Date: Jan 2022
Posts: 22

07 Feb 2022, 12:56

Dear all

I have a cross-sectional dataset organised in three levels:

Individuals, nested within...
Nuclear families (their parents and siblings), nested within...
Extended families (their uncle/aunt and first cousins)

I'm trying to control for extended-family fixed effects using the within (demeaning) estimator rather than the dummy-variable estimator. Both methods yield the same coefficients, but the standard errors are smaller when using the within method. Are there any built-in commands in Stata that correct for this?

If the explanation is not clear, loading the following sample data:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float chmarried11 byte totchildren11 float(chsex6 cheduc5 efamid nfamid)
1 3 0  .  1  1
0 3 1  .  1  1
0 3 1  .  1  1
1 2 1 18  2  2
1 2 0 18  2  2
0 2 0 12  3  3
1 2 1 13  3  3
0 3 1  .  4  4
1 3 1  .  4  5
1 3 0  .  4  4
1 3 1  .  4  5
1 3 0  .  4  5
1 3 0  .  4  4
1 3 1 13  5  7
1 3 0 14  5  7
1 9 0 16  5  6
1 9 0 16  5  6
1 9 1 16  5  6
1 9 1 16  5  6
1 9 1 16  5  6
1 9 0 16  5  6
1 9 0 13  5  6
1 3 1 16  5  7
1 9 0 16  5  6
1 9 0 16  5  6
1 2 0 16  6  8
0 2 1 16  6  8
1 2 1 16  7  9
1 2 0 12  7  9
1 4 0 12  8 10
1 4 0 12  8 10
1 4 0 12  8 10
0 4 1 12  8 10
1 3 1 16  9 11
0 3 1 16  9 11
1 3 1 16  9 11
0 2 0 16 10 12
0 2 1 35 10 12
0 3 1 13 11 13
1 3 0 14 11 13
0 3 0 16 11 13
1 3 1  . 12 15
1 3 0  . 12 15
0 2 1  . 12 14
0 2 0  . 12 14
1 3 1  . 12 15
0 3 0  . 13 16
1 3 0  . 13 16
0 3 1  . 13 16
1 2 0 16 14 18
end
label values totchildren11 QQ2011NKIDS
label values chsex6 chsex6
label def chsex6 0 "Female", modify
label def chsex6 1 "Male", modify
label values cheduc5 QQHIGHESTGRADE
label def QQHIGHESTGRADE 12 "12 years of school -- high school graduate", modify
label def QQHIGHESTGRADE 13 "1 year of college", modify
label def QQHIGHESTGRADE 14 "2 years of college", modify
label def QQHIGHESTGRADE 16 "4 years of college -- Bachelor's degree", modify
label def QQHIGHESTGRADE 18 "6 years college, 2 years grad or prof school", modify
label def QQHIGHESTGRADE 35 "Attended or attends special school", modify

And running:

Code:

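* Mark the complete-case sample so the group means use the same observations as the regressions
gen byte touse = !missing(chmarried11, totchildren11, chsex6, cheduc5)
* Extended-family means of each variable
bysort efamid: egen meanchmarried11 = mean(chmarried11) if touse
bysort efamid: egen meantotchildren11 = mean(totchildren11) if touse
bysort efamid: egen meanchsex6 = mean(chsex6) if touse
bysort efamid: egen meancheduc5 = mean(cheduc5) if touse
* Within transformation: deviations from the extended-family means
gen marriedd = chmarried11 - meanchmarried11
gen totchildrend = totchildren11 - meantotchildren11
gen chsexd = chsex6 - meanchsex6
gen cheducd = cheduc5 - meancheduc5
* Dummy-variable estimator
reg chmarried11 totchildren11 i.chsex6 cheduc5 i.efamid
* Within estimator on the demeaned data: same slopes, smaller standard errors
reg marriedd totchildrend chsexd cheducd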

Should illustrate the issue.

Many thanks
Owen

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#2

07 Feb 2022, 15:12

Owen:
if you actually have a nested design and a continuous regressand, why not consider -mixed-?

Kind regards,
Carlo
(Stata 19.0)
Owen Wallbanks

Join Date: Jan 2022

Posts: 22
#3

08 Feb 2022, 01:39

Dear Carlo

Thank you for your response. I have been using -melogit-. However, from the posts I have seen, the -mixed- / -melogit- commands don't seem to be easily compatible with IV regression, which is a key part of my methodology. I hope that combining fixed effects, to remove unobserved heterogeneity at the extended-family level, with an IV for family size at the nuclear-family level will improve the internal validity of my results compared to previous literature. I know the dummy-variable approach will work, but it is very computationally intensive given ~10,300 extended families.

I also need to cluster standard errors at the nuclear family level which adds another level of complexity to the standard error problem.
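As a minimal sketch of the clustering part only (not the IV step), and assuming the -dataex- example from post #1 is in memory, -areg- can absorb the extended-family effects without creating the dummies while clustering on the nuclear family. Whether this is the right small-sample treatment for the full model is a separate question.

```stata
* Assumes the -dataex- example from post #1 is loaded.
* Absorb extended-family (efamid) effects instead of estimating ~10,300 dummies,
* and cluster the standard errors on nuclear family (nfamid).
areg chmarried11 totchildren11 i.chsex6 cheduc5, absorb(efamid) vce(cluster nfamid)
```

The slope coefficients should match those from the regression with i.efamid; only the reported standard errors change with the clustering.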

Thanks again
Owen

Last edited by Owen Wallbanks; 08 Feb 2022, 02:20.
Owen Wallbanks

Join Date: Jan 2022

Posts: 22
#4

08 Feb 2022, 06:21

Update: Still not sure whether there is a way within Stata to do this but you can do it by hand as follows:

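reported variance = true variance * (true DoF / reported DoF), so true variance = reported variance * (reported DoF / true DoF), and se = sqrt(true variance). Here the reported DoF comes from the regression on the demeaned data, and the true DoF is N minus the number of slope coefficients minus the number of extended families.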
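The by-hand correction can be sketched as follows, assuming the -dataex- example from post #1 is in memory. The w_* and m_* variables and the scalar names are made up for this illustration, and the -areg- call at the end is only a cross-check, since it counts the absorbed family means in its degrees of freedom.

```stata
* Assumes the -dataex- example from post #1 is loaded.
preserve
* Use one complete-case sample for both estimators.
keep if !missing(chmarried11, totchildren11, chsex6, cheduc5, efamid)

* Within transformation: subtract extended-family means.
foreach v of varlist chmarried11 totchildren11 chsex6 cheduc5 {
    bysort efamid: egen double m_`v' = mean(`v')
    gen double w_`v' = `v' - m_`v'
}

* Regression on demeaned data; its DoF (N - K) ignores the absorbed family means.
regress w_chmarried11 w_totchildren11 w_chsex6 w_cheduc5, noconstant
scalar df_reported = e(df_r)
scalar se_naive = _se[w_totchildren11]

* True DoF = N - K - (number of extended families).
quietly tabulate efamid
scalar df_true = df_reported - r(r)

* Rescale: se_true = se_reported * sqrt(reported DoF / true DoF).
scalar se_corrected = se_naive * sqrt(df_reported / df_true)

* Cross-check against the built-in absorbing estimator.
quietly areg chmarried11 totchildren11 chsex6 cheduc5, absorb(efamid)
scalar se_areg = _se[totchildren11]
display "naive SE:     " se_naive
display "corrected SE: " se_corrected
display "areg SE:      " se_areg
restore
```

If the correction is right, the corrected and -areg- standard errors should agree up to rounding, which suggests -areg- with absorb() as a built-in alternative to doing this by hand.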

Best
Owen