
  • Difference-in-differences estimator with repeated cross-sectional data

    In 2019, Mexico saw a significant increase in the minimum wage, especially in municipalities bordering the United States. Several studies indicate that this raised wages in the lower part of the wage distribution. In Mexico, farmworkers are the lowest-paid wage earners, but they appear to have benefited the most from this measure. My objective is to analyze whether the effect of the wage increase was larger among farmworkers than among other wage earners. To do this, I use a difference-in-differences estimation, as follows:

    [equation image not rendered]

    I have four survey years, one for each wave of Mexico's national income and expenditure survey (2016, 2018, 2020 and 2022).
    [equation image not rendered] = 1 if year = 2020 or 2022, and zero otherwise
    [equation image not rendered] = 1 if the person is a farmworker, and zero otherwise
    B3 is the difference-in-differences estimator
    Covid_year = 1 if year = 2020 (in that year Mexican GDP fell 8.0%)
    AX is a vector of characteristics of the person, as well as variables at the state level

    My question is whether the following command is a suitable way to estimate this model:

    reg y incmwage farmw incmw_fw covid_y age ysch marginal_index, vce(cluster states)



    Thanks in advance

  • #2
    I suppose you intended to include some image files containing equations, or the like, in your post. But they did not appear. So your post is left with a regression command whose variables are unexplained in your post, and some unrendered images that ostensibly contain the model equations the regression is supposed to represent.

    Please try re-posting. If you cannot get the equations posted as images, then it may be best to just re-type them (as best you can given the limited character set available here) in the editor. And be sure to explain the correspondence between variables in your -reg- command and the ones in your equations.



    • #3
      Thank you, Clyde. I am posting again.

      In 2019, Mexico saw a significant increase in the minimum wage, especially in municipalities bordering the United States. Several studies indicate that this raised wages in the lower part of the wage distribution. In Mexico, farmworkers are the lowest-paid wage earners, but they appear to have benefited the most from this measure. My objective is to analyze whether the effect of the wage increase was larger among farmworkers than among other wage earners. To do this, I use a difference-in-differences estimation, as follows:

      y_i = b0 + b1*year_covid + b2*year_icminwage + b3*farmworker + b4*year_icminwage*farmworker + other variables + u_i



      I have four survey years, one for each wave of Mexico's national income and expenditure survey (2016, 2018, 2020 and 2022).
      year_icminwage: indicator for the years after the minimum wage increase; it takes the value 1 if year = 2020 or 2022, and zero otherwise
      farmworker: takes the value 1 if the person is a farmworker, and zero otherwise
      b4 is the difference-in-differences estimator
      year_covid: takes the value 1 if year = 2020 (in that year Mexican GDP fell 8.0%), and zero otherwise. This variable captures the negative effect of COVID on the Mexican economy
      other variables: a vector of characteristics of the person, as well as variables at the state level
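
      For reference, a short sketch of why b4 is the difference-in-differences estimator in the equation above. Here D stands for year_icminwage and F for farmworker (shorthand introduced only for this derivation), and year_covid and the other variables are held fixed so that they cancel within each group:

      ```latex
      \begin{align*}
      E[y \mid F=1, D=1] - E[y \mid F=1, D=0] &= b_2 + b_4 \\
      E[y \mid F=0, D=1] - E[y \mid F=0, D=0] &= b_2 \\
      (b_2 + b_4) - b_2 &= b_4
      \end{align*}
      ```

      So b2 is the before/after change for non-farmworkers, and b4 is the additional change for farmworkers.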

      My question is whether the following command is a suitable way to estimate this model:

      reg y incmwage farmw incmw_fw covid_y age ysch marginal_index, vce(cluster states)



      Thanks in advance



      • #4
        I guess that incmwage in the regression command corresponds to year_icminwage in the equation, farmw corresponds to farmworker, and incmw_fw corresponds to year_icminwage*farmworker, and covid_y corresponds to year_covid. I also imagine that ysch and marginal_index are the "other variables."

        I can't infer from their names what ysch and marginal_index actually represent, but caution is always needed when including covariates in this kind of model because, if poorly chosen, they can introduce endogeneity.

        Conditional on that caution, your regression command looks OK. But you can do better by using factor-variable notation, which will then enable you to more simply interpret the results by applying the -margins- command afterward:

        Code:
        regress y i.incmwage##i.farmw i.covid_y ysch marginal_index, vce(cluster states)
        The i.incmwage##i.farmw term expands to i.incmwage, i.farmw, and their interaction; this is known as factor-variable notation. If you are not familiar with it, you should read -help fvvarlist-. Only by using factor-variable notation can you then use the -margins- command, which is very helpful for understanding the results of interaction models. The -margins- command is somewhat complicated, but it is very clearly explained in Richard Williams' excellent tutorial at https://www3.nd.edu/~rwilliam/stats/Margins01.pdf, which includes examples that are similar to your regression.
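
        As a rough sketch of what that workflow might look like: the code below assumes the variable names from the -reg- command in #3 (incmwage, farmw, covid_y, states) and runs on simulated data that only stands in for the survey. The sample size, coefficients, and the 32 cluster codes are made up for illustration, and the covariates ysch and marginal_index are left out to keep it short.

        ```stata
        * Simulated stand-in data: none of these numbers come from the actual survey
        clear
        set seed 2019
        set obs 8000
        generate int  year     = 2016 + 2*mod(_n, 4)       // 2016, 2018, 2020, 2022
        generate byte incmwage = inlist(year, 2020, 2022)  // post-increase years
        generate byte covid_y  = (year == 2020)            // COVID year indicator
        generate byte farmw    = (runiform() < 0.2)        // hypothetical 20% farmworkers
        generate int  states   = ceil(32*runiform())       // hypothetical cluster id
        generate y = 8 - farmw + 0.2*incmwage + 0.3*incmwage*farmw ///
            - 0.1*covid_y + rnormal()

        * Difference-in-differences model in factor-variable notation
        regress y i.incmwage##i.farmw i.covid_y, vce(cluster states)

        * Predicted mean of y in each of the four period-by-group cells
        margins incmwage#farmw

        * Before/after change in y, separately for non-farmworkers and farmworkers
        margins farmw, dydx(incmwage)

        * The DiD estimate itself: the coefficient on the interaction term
        lincom 1.incmwage#1.farmw
        ```

        In a linear model like this one, the difference between the two marginal effects reported by -margins farmw, dydx(incmwage)- equals the interaction coefficient that -lincom- reports.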



        • #5
          Thank you very much, I will apply this advice.

          Best
