differences in differences estimator with various cross-sectional data

Omar Stabridis

Join Date: Aug 2016

Posts: 8
#1

30 Dec 2024, 13:54

In 2019, Mexico raised its minimum wage substantially, especially in municipalities bordering the United States. Several studies indicate that this raised wages in the lower part of the wage distribution. In Mexico, farmworkers are the lowest-paid wage earners, and they appear to have benefited the most from this measure. My objective is to analyze whether the effect of the wage increase was larger among farmworkers than among other wage earners. To do this, I use a difference-in-differences estimation, as follows:

[equation image not rendered]

I have four cross-sections, one from each wave of Mexico's national income and expenditure survey (2016, 2018, 2020 and 2022).
[image not rendered] = 1 if year = 2020 or 2022, and zero otherwise
[image not rendered] = 1 if a person is a farmworker, and zero otherwise
B3 is the difference-in-differences estimator
Covid_year = 1 if year = 2020 (the year Mexican GDP fell 8.0%)
AX is a vector of characteristics of the person, as well as state-level variables

My question is whether the following regression is a suitable way to estimate this model:

reg y incmwage farmw incmw_fw covid_y age ysch marginal_index, vce(cluster states)

Thanks in advance
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#2

30 Dec 2024, 14:44

I suppose you intended to include some image files containing equations, or the like, in your post. But they did not appear. So your post is left with a regression command whose variables are unexplained in your post, and some unrendered images that ostensibly contain the model equations the regression is supposed to represent.

Please try re-posting. If you cannot get the equations posted as images, then it may be best to just re-type them (as best you can given the limited character set available here) in the editor. And be sure to explain the correspondence between variables in your -reg- command and the ones in your equations.
Omar Stabridis

Join Date: Aug 2016

Posts: 8
#3

31 Dec 2024, 06:22

Thank you, Clyde. I am posting again.

In 2019, Mexico raised its minimum wage substantially, especially in municipalities bordering the United States. Several studies indicate that this raised wages in the lower part of the wage distribution. In Mexico, farmworkers are the lowest-paid wage earners, and they appear to have benefited the most from this measure. My objective is to analyze whether the effect of the wage increase was larger among farmworkers than among other wage earners. To do this, I use a difference-in-differences estimation, as follows:

yi = b0 + b1*year_covid + b2*year_icminwage + b3*farmworker + b4*year_icminwage*farmworker + other variables + u

I have four cross-sections, one from each wave of Mexico's national income and expenditure survey (2016, 2018, 2020 and 2022).
year_icminwage: indicator for the years after the minimum wage increase; it takes the value 1 if year = 2020 or 2022, and zero otherwise
farmworker: indicator that takes the value 1 if a person is a farmworker, and zero otherwise
b4 is the difference-in-differences estimator
year_covid = 1 if year = 2020 (the year Mexican GDP fell 8.0%), and zero otherwise. This variable captures the negative effect of COVID on the Mexican economy
other variables: a vector of characteristics of the person, as well as state-level variables

My question is whether the following regression is a suitable way to estimate this model:

reg y incmwage farmw incmw_fw covid_y age ysch marginal_index, vce(cluster states)

Thanks in advance
Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#4

31 Dec 2024, 08:32

I guess that incmwage in the regression command corresponds to year_icminwage in the equation, farmw corresponds to farmworker, and incmw_fw corresponds to year_icminwage*farmworker, and covid_y corresponds to year_covid. I also imagine that ysch and marginal_index are the "other variables."

I can't infer from their names what ysch and marginal_index actually represent, but caution is always needed when including covariates in this kind of model because, if poorly chosen, they can introduce endogeneity.

Conditional on that caution, your regression command looks OK. But you can do better by using factor-variable notation, which will then enable you to more simply interpret the results by applying the -margins- command afterward:

Code:

regress y i.year_icminwage##i.farmworker i.covid_y ysch marginal_index, vce(cluster states)

The i.year_icminwage##i.farmworker term expands to i.year_icminwage, i.farmworker, and their interaction. This is known as factor-variable notation. If you are not familiar with it, you should read -help fvvarlist-. Only by using factor-variable notation can you then use the -margins- command, which is very helpful for understanding the results of interaction models. The -margins- command is somewhat complicated, but it is very clearly explained in Richard Williams' excellent tutorial at https://www3.nd.edu/~rwilliam/stats/Margins01.pdf, which includes examples that are similar to your regression.
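
As a rough illustration of the -margins- follow-up, here is a minimal sketch. It assumes that year_icminwage, farmworker, and covid_y exist in the data under exactly those names as 0/1 variables; the names are guesses carried over from the command above, so substitute your own.

```stata
* Sketch only: variable names are assumed, not confirmed by the original poster.
regress y i.year_icminwage##i.farmworker i.covid_y ysch marginal_index, vce(cluster states)

* Predicted mean of y in each of the four period-by-group cells
margins year_icminwage#farmworker

* Post-minus-pre change in y, separately for non-farmworkers and farmworkers
margins farmworker, dydx(year_icminwage)
```

In this linear model, the difference between the two rows of the second -margins- output equals the coefficient on the interaction term in the regression table, which is the difference-in-differences estimate.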
Omar Stabridis

Join Date: Aug 2016

Posts: 8
#5

31 Dec 2024, 10:45

Thank you very much. I will apply this advice.

Best