Which techniques are suitable for my data?

Khawar Afreen

Join Date: Feb 2024

Posts: 20
#1

20 Feb 2024, 07:53

I am trying to estimate the impact of education reforms on fertility. I have micro-level DHS survey data for 2017-18. The total number of observations is 12,364: 3,982 respondents aged 15-49 were interviewed in 2017 and 8,382 respondents aged 15-49 were interviewed in 2018.

There are five regions. In two regions the reforms were implemented in 2014, in the third region in 2013, in the fourth region in 2012, and in the fifth region in 2017.

The education variable is categorical (no education, primary, secondary, higher). The fertility variable is the total number of children ever born. Region is categorical (regions one to five), age is a discrete variable, the wealth index is categorical (poorer, middle, richer, richest), and place of residence is categorical (rural, urban). Which technique would be best to apply?
Tags: None
George Ford

Join Date: Aug 2014

Posts: 3152
#2

20 Feb 2024, 08:07

Ideally, you would get data from before the reforms. Otherwise, you're limited to cross-sectional analysis, which in this case I'd expect to give biased results. You might think of a "time since policy change" variable, but I'm not sure that's very interesting.

My advice is to get more data from before the reforms, then look at CSDID or Mundlak regression for a DID estimator.
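
For what it's worth, a "time since policy change" variable could be built roughly as below. This is only a sketch on simulated data: the names region, int_year, reform_year and years_since are made up, and which two regions reformed in 2014 is an assumption (coded here as regions 1 and 2). The reform years themselves are the ones given in #1.

```stata
* Simulated stand-in for the survey; all variable names are hypothetical
clear
set seed 12345
set obs 1000
generate byte region   = ceil(5*runiform())          // regions 1 to 5
generate int  int_year = 2017 + (runiform() < 0.68)  // interview year, 2017 or 2018

* Reform year by region (ASSUMPTION: regions 1 and 2 are the 2014 adopters)
generate int reform_year = .
replace reform_year = 2014 if inlist(region, 1, 2)
replace reform_year = 2013 if region == 3
replace reform_year = 2012 if region == 4
replace reform_year = 2017 if region == 5

* Years elapsed between the reform and the interview
generate int years_since = int_year - reform_year
tabulate region years_since
```

With only the 2017-18 wave, years_since varies almost entirely across regions, which is part of why it is hard to separate from plain regional differences.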
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#3

21 Feb 2024, 07:49

But in this cross-sectional data, we already have very few observations affected by the reforms. Would it be better to add data from the previous wave of the DHS survey, which was not affected by the reforms?
Comment
George Ford

Join Date: Aug 2014

Posts: 3152
#4

21 Feb 2024, 09:10

Absolutely. You must address selection bias, and the "before" data permits that. Look at difference-in-differences. That's what you need to be doing.
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#5

22 Feb 2024, 05:54

My professor told me to focus on a regression discontinuity design, applied separately to each region and using only the 2017-18 data. Would that be a better option?
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#6

22 Feb 2024, 06:09

If I apply difference-in-differences and append the previous wave of the DHS survey, then all observations from the previous wave will be 0, because they were not affected by the reforms. In 2017-18, not all observations will be 1: only the affected observations will be 1, and those that were not affected will be 0. Will that be the ID variable? And how do I handle the time variable relative to reform implementation, given that each region implemented the reform in a different year?
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#7

23 Feb 2024, 02:43

Also, in difference-in-differences, if I run it separately for each region, I think the time variable and the treatment variable will be the same. Kindly correct me if I am wrong.
Comment
George Ford

Join Date: Aug 2014

Posts: 3152
#8

23 Feb 2024, 08:01

With DID, you'll have a treatment dummy (= 1 in all time periods for any unit that receives the treatment) and a treatment-period dummy (often called post), which = 1 during the treatment period and 0 otherwise.

reg y c.treat#c.post treat post

where c.treat#c.post is your DID estimate.

You can also absorb IDs and time:

reghdfe y c.treat#c.post, absorb(id time)
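
A minimal sketch of the first regression on simulated data, in case it helps to see treat and post in use. Everything here (the sample size, the built-in effect of -0.5, the 50/50 splits) is invented for illustration and is not the DHS data.

```stata
* Simulated 2x2 setting; all numbers are illustrative
clear
set seed 12345
set obs 2000
generate byte treat = runiform() < 0.5   // 1 = unit belongs to the treated group
generate byte post  = runiform() < 0.5   // 1 = observed in the treatment period

* Outcome with a built-in DID effect of -0.5
generate y = 3 - 0.2*treat + 0.1*post - 0.5*treat*post + rnormal()

* The coefficient on c.treat#c.post is the DID estimate
regress y c.treat#c.post treat post, vce(robust)
```

The coefficient on c.treat#c.post should come out near the simulated -0.5, while treat and post pick up the group and period differences.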
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#9

24 Feb 2024, 12:35

Would it be better to use the didregress command?
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#10

24 Feb 2024, 12:37

Or xtdidregress?
Comment
George Ford

Join Date: Aug 2014

Posts: 3152
#11

24 Feb 2024, 15:18

didregress or xtdidregress should work.
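
As a rough illustration of the didregress syntax (Stata 17 or later) for a repeated cross-section: the treatment variable you pass to it equals 1 only for treated groups in the post period. The data below are simulated, and the names region, wave and did, as well as the choice of regions 1 and 2 as the treated group, are assumptions made for the example.

```stata
* Simulated repeated cross-section; names and numbers are illustrative
clear
set seed 2024
set obs 3000
generate byte region = ceil(5*runiform())   // group variable, 1 to 5
generate byte wave   = runiform() < 0.5     // 0 = earlier wave, 1 = later wave

* Treatment indicator: 1 only for treated regions in the later wave
* (ASSUMPTION: regions 1 and 2 are the treated group)
generate byte did = inlist(region, 1, 2) & wave == 1

generate y = 3 + 0.1*region + 0.2*wave - 0.5*did + rnormal()

* Group and time effects are added by didregress itself
didregress (y) (did), group(region) time(wave)
```

Standard errors are clustered on the group variable by default, so with only five regions the inference rests on very few clusters and should be read with caution.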
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#12

25 Feb 2024, 06:09

I have appended the two waves of data as you suggested. Now I am confused: is this a repeated cross-section or a panel? The didregress command is for repeated cross-sections and xtdidregress is for panels, so which should I use? I think neither is possible. I am a beginner in Stata, so I am struggling with this.
Comment
George Ford

Join Date: Aug 2014

Posts: 3152
#13

25 Feb 2024, 13:10

I suspect xtdidregress requires you to xtset your data, which may not be possible in a repeated cross-section.

I think you should just use reghdfe and skip the canned programs.
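
Something along these lines, as a sketch only. In a repeated cross-section there is no person ID to absorb, so this version absorbs the group (region) and the survey wave instead. The data are simulated, and region, wave and the treated regions (1 and 2) are assumptions for the example.

```stata
* One-time install from SSC if needed:
*   ssc install ftools
*   ssc install reghdfe

* Simulated repeated cross-section; names and numbers are illustrative
clear
set seed 2024
set obs 3000
generate byte region = ceil(5*runiform())
generate byte wave   = runiform() < 0.5      // 0 = earlier wave, 1 = later wave
generate byte treat  = inlist(region, 1, 2)  // ASSUMPTION: treated regions
generate byte post   = wave == 1

generate y = 3 + 0.1*region + 0.2*wave - 0.5*treat*post + rnormal()

* treat and post are absorbed by the region and wave fixed effects,
* so only the interaction (the DID estimate) is left in the model
reghdfe y c.treat#c.post, absorb(region wave) vce(robust)
```

Covariates such as age or wealth can be added after the interaction term in the usual way.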
Comment
Khawar Afreen

Join Date: Feb 2024

Posts: 20
#14

27 Feb 2024, 08:59

I am not familiar with using reghdfe for difference-in-differences. Kindly explain the process step by step, or please provide the complete details.
Comment
Dirk Enzmann

Join Date: Apr 2014

Posts: 537
#15

27 Feb 2024, 17:23

If you want to know more about reghdfe (from SSC), you will certainly have to invest some time in reading its help file (help reghdfe). It points to many additional sources that will help your understanding, e.g. http://scorreia.com/software/reghdfe/
Comment
