
  • Difference between fixed effects and clustering

    Hello,

    I am currently writing a paper on the effects of the sex ratio on males' education in China.

    Years of education = B0 + B1*gender + B2*sex_ratio + B3*gender*sex_ratio + controls

    I have a question regarding fixed effects and clustering. I am running an OLS regression based on a single cross-sectional survey, where I have data on demographic variables such as education and age, as well as a sex ratio variable for each age group and province (e.g. for 16-year-olds in Beijing). When talking with my tutor, she mentioned that I should include both fixed effects at the province level and cluster my standard errors at the province and age level (I have made a variable that groups province and age). When reading about fixed effects, I am a little confused about whether I really should do that, as fixed effects seem to be mostly used for panel data (over time). What really is the difference between fixed effects at the province level and clustering at the province level?

    I understand that this is not a question directly related to Stata, but I would really appreciate it if any of you could give me some insight on this!

    Best,
    Anna

  • #2
    When talking with my tutor, she mentioned that I should include both fixed effects at the province level and cluster my standard errors at the province and age level (I have made a variable that groups province and age). When reading about fixed effects, I am a little confused about whether I really should do that, as fixed effects seem to be mostly used for panel data (over time).

    You make a good student, being critical of your tutor's sloppiness. You should refer to these as province dummies. See Jeff Wooldridge's post #4 on exactly your point: https://www.statalist.org/forums/for...uated-at-means.

    When talking with my tutor, she mentioned that I should include both fixed effects at the province level and cluster my standard errors at the province and age level (I have made a variable that groups province and age).

    On clustering, your tutor probably wants you to specify two cluster variables. There are some community-contributed commands that allow multiway clustering, e.g.,

    Code:
    ssc install reghdfe, replace
    Then the syntax is:

    Code:
    reghdfe edu_yrs i.gender##c.sex_ratio controls, absorb(province) cluster(province age)
    Note that

    cluster(province age)
    requests two-way clustering (by province and, separately, by age), which is different from one-way clustering on your interacted (grouped) variable

    cluster(provinceage)
    The -absorb(province)- option absorbs the province dummies, and is equivalent to:

    Code:
    reghdfe edu_yrs i.gender##c.sex_ratio i.province controls, noabsorb cluster(province age)
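
    To make the contrast between the two clustering choices concrete, here is a minimal sketch. It assumes the variable names used in this thread (edu_yrs, gender, sex_ratio, province, age) and that provinceage has not been created yet. The local macro controls is a hypothetical placeholder for your actual control variables.

```stata
* Sketch only: variable names follow the thread; adjust to your data.
* Hypothetical placeholder: list your control variables inside the quotes.
local controls ""

* Grouped identifier: one cluster per province-age cell
* (skip this line if you have already built the variable).
egen provinceage = group(province age)

* (a) Province dummies, one-way clustering on the province-age cells
regress edu_yrs i.gender##c.sex_ratio i.province `controls', vce(cluster provinceage)

* (b) Province dummies absorbed, two-way clustering by province and by age
reghdfe edu_yrs i.gender##c.sex_ratio `controls', absorb(province) cluster(province age)
```

    Both commands fit the same model with province dummies, so the point estimates should agree; only the standard errors can differ, because (a) treats each province-age cell as its own cluster while (b) allows correlation within a province and, separately, within an age group.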
    Last edited by Andrew Musau; 14 Apr 2022, 06:16.
