  • Weighted Panel Data Set

    Dear Stata Users,

    I have three waves of a panel dataset that tracks the same 3,969 households from 333 rural areas in a country. The first wave covered only these rural households, whereas the second and third waves also covered an additional 1,486 households from 100 urban areas. To make the sample representative of the national population, a sample weight with post-stratification adjustments was calculated for each household, and this weight variable is included in all the datasets. However, I only want to study the rural areas, so I plan to exclude the urban households added in the later waves.

    I want to run a fixed-effects regression using xtreg, fe, and I want to include time fixed effects:

    xtset household_id_numeric Year

    Panel variable: household_id_numeric (unbalanced)
    Time variable: Year, 2011 to 2015, but with gaps
    Delta: 1 unit

    xtdescribe

    household_id_numeric: 1, 2, ..., 3969 n = 3969
    Year: 2011, 2013, ..., 2015 T = 3
    Delta(Year) = 1 unit
    Span(Year) = 5 periods
    (household_id_numeric*Year uniquely identifies each observation)

    Distribution of T_i:   min      5%     25%     50%     75%     95%     max
                             1       2       3       3       3       3       3

         Freq.  Percent    Cum. |  Pattern*
     ---------------------------+----------
         3639     91.69   91.69 |  111
          137      3.45   95.14 |  11.
          133      3.35   98.49 |  1..
           60      1.51  100.00 |  1.1
     ---------------------------+----------
         3969    100.00         |  XXX
     --------------------------------------
     *Each column represents 2 periods.
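
    To make the setup concrete, the specification I have in mind looks roughly like the sketch below. The names outcome, x1 and x2 are placeholders rather than variables in my data, and clustering the standard errors by household is my own assumption, not something I am sure is right.

    ```stata
    * Placeholder names: outcome, x1, x2 (not actual variables in my data)
    xtset household_id_numeric Year

    * Household fixed effects come from the fe option;
    * time fixed effects come from the wave dummies i.Year
    xtreg outcome x1 x2 i.Year, fe vce(cluster household_id_numeric)

    * Joint test of the wave dummies
    testparm i.Year
    ```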

    I am still learning Stata, so I have a few questions about using it to estimate my model:

    1. Regarding the weights: because the urban areas are included in waves 2 and 3, the weight given to a rural household in those waves differs from the weight given to the same household in wave 1. I understand that in a fixed-effects regression in Stata the weights must be constant within each panel. How should I include the weights? Would it be better to use only the final two waves, whose weights account for the urban areas, even though I only want to focus on the rural households? Alternatively, since the wave 1 weights are based only on the rural areas, would it be appropriate to apply them to the two subsequent waves?

    2. Although my dataset follows the same households across the waves, it is unbalanced, because the survey was unable to accurately cover every household in every wave. Are there any issues I need to consider when using an unbalanced panel dataset?
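
    To illustrate both questions, here is a rough sketch of what I was thinking of trying. It assumes the supplied weight is called hh_weight (a hypothetical name) and that wave 1 is Year == 2011; outcome, x1 and x2 are again placeholders. It implements the second option from question 1 (carrying the wave 1 weight forward), and I do not know whether that is statistically appropriate, which is what I am asking.

    ```stata
    * Hypothetical names: hh_weight (supplied sample weight), outcome, x1, x2

    * Check whether the weight varies within a household: sort by weight
    * within household, then compare the smallest and largest values.
    * Missing weights sort last, so a household with a missing weight
    * in some wave is also flagged.
    bysort household_id_numeric (hh_weight): gen byte wt_varies = hh_weight[1] != hh_weight[_N]
    tabulate wt_varies

    * Carry each household's wave 1 (2011) weight to all of its waves
    bysort household_id_numeric (Year): gen double wt_wave1 = hh_weight[1] if Year[1] == 2011

    * Weighted fixed-effects regression with wave dummies
    xtreg outcome x1 x2 i.Year [pweight = wt_wave1], fe

    * For question 2: the same model on households observed in all three waves
    bysort household_id_numeric: gen byte n_waves = _N
    xtreg outcome x1 x2 i.Year [pweight = wt_wave1] if n_waves == 3, fe
    ```

    Comparing the last two regressions would at least show whether the results are sensitive to dropping the households with gaps.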

    Thanks in advance for your help.
    Last edited by Samuel Welch; 10 Apr 2024, 04:27.