  • Interpretation of DID Coefficient for Gender Gap in Education

    I am analyzing the gender gap in education (measured as girls' years of education minus boys' years of education) as my outcome variable. My independent variable of interest is a policy intervention, and I am using a Difference-in-Differences (DID) model. How should I interpret the DID coefficient if it is positive in this context? Does a positive coefficient imply that the policy is narrowing the education gap between girls and boys, or does it mean that girls are surpassing boys?

  • #2
    Does a positive coefficient imply that the policy is narrowing the education gap between girls and boys, or does it mean that girls are surpassing boys?
    It depends on how you set up the variables and did the analysis. Please post back with an explanation of these variables, specifically, what values they take on for boys and girls, and for those that did and did not undergo the policy intervention. Also then show the DID regression output.
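To make the "it depends" concrete, here is a minimal sketch under one stated assumption: the outcome is girls' years of education minus boys' years, and the DID estimate is the usual two-by-two contrast of group means. All numbers are hypothetical and are not taken from the poster's data. The sketch shows that the same positive estimate can arise both when girls start behind and catch up, and when girls start ahead and pull further away, so the sign alone does not settle the question.

```python
def did(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two DID estimate from four group means of the outcome."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical mean gaps (girls' years minus boys' years); illustration only.
# Scenario A: girls start 2 years behind and end 1 year behind (gap narrows).
narrowing = did(treated_pre=-2.0, treated_post=-1.0,
                control_pre=-2.0, control_post=-2.0)

# Scenario B: girls start 0.5 years ahead and end 1.5 years ahead (gap widens
# in girls' favor).
widening = did(treated_pre=0.5, treated_post=1.5,
               control_pre=0.5, control_post=0.5)

print(narrowing, widening)
```

Under this assumption, a positive coefficient says only that the gap moved in girls' favor in the treated group relative to the control group. Whether that is "narrowing" or "surpassing" depends on the baseline level of the gap, which has to be read from the group means rather than from the coefficient.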


    • #3
      Years of education for both boys and girls range from 0 to 21. The DID variation comes from states that implemented the policy vs. those that did not, and from women young enough (0-25 years) to be affected by the reform. The control group consists of those who were already married in implementing states (age ≥ 25) and all ages in non-implementing states. How should I interpret a positive DID coefficient in this setup?
      [Attached image: result.png]


      • #4
        I'm sorry, but I don't understand what you did here.

        What is the variable oland? What does it mean when it is yes and what does it mean when it is no?

        What is the variable treated? Is it an individual level attribute referring to actually being affected by the intervention, or is it a state-level attribute? And in either case, does 1 mean that the intervention was in place or does it mean that it wasn't?

        I am puzzled by your definition of the control group. Why do you require women to be married in the implementing states but you don't require it in the non-implementing states? This seems likely to invalidate the study altogether as marriage itself may well be associated with education. In fact, I'm pretty sure it is the case that age at marriage is higher in women with greater education presently in the United States.

        Added: I also don't understand the gender gap outcome variable. As your sample size is 4,862, it is clear that your unit of analysis is not the state but something smaller. Perhaps the individual, or perhaps some intermediate units such as a county or municipality. Of course, it cannot actually be the individual because a single individual cannot have a gender gap. So what is your unit of analysis? And how do you calculate the gender gap in that unit of analysis since, by your description, there are no boys in the data?
        Last edited by Clyde Schechter; 29 Sep 2024, 21:42.


        • #5
          So, we have a further complicating factor here. You cannot simply use DID now. There's an assignment rule for the treatment that you must account for. If ONLY women <= 25 are affected, and women > 25 are not, you now have a regression discontinuity setup (assuming I understand the data structure correctly, with individuals nested within states).

          But... this is complicated by Clyde's remarks at the end of #4.


          So, what you need to do is use dataex to show how your data ACTUALLY look. Verbal descriptions will do you little good; we need to see how the data look on your machine, and only with dataex can we see that for ourselves.


          • #6
            I apologise in advance for sharing the data as an image. Here the gender gap is girl_child_edu - boy_child_edu. The variables e10 (most treated), e1115 (partially treated), and e1620 (least treated) are the age cohorts: 0-10, 11-15, and 16-20 years of age at the time of the reform. Reform_state takes the value 1 if the state had the reform and 0 otherwise. Owns land is a variable indicating ownership of land. So I am actually applying a triple difference here.

            Added: The data were collected in 2004-05, and we analyze the effect of the reform on the educational gender gap between daughters and sons. In the analysis, son and daughter refer to children of the household head, and their ages at the time of the survey are 22 years or older, so that they are likely to have completed their education. In post #3, treated refers to individuals exposed to the reform, meaning those who were ≤25 years old in reform states at the time the policy was enacted. A value of 1 means they were eligible to benefit from the reform, while 0 means they were not. For the control group, I separate women by age cohort for the exposure analysis: the control group includes girls who were old enough to be married (as the reform affected unmarried girls) at the time of the reform in the reform states (age ≥25 at the time of the reform), and everyone in the non-reform states (regardless of age). This allows me to capture the effect of the reform on those young enough to be impacted.
            [Attached image: dta.png]

            Last edited by Shreya Jain; 29 Sep 2024, 23:19.


            • #7
              I don't get why you insist on using images when we and the FAQ explicitly beseech you not to do that, but fine. Why did you not mention you were applying a triple difference? Why is there a need for one?

              And... how can rows where both girl and boy are missing also have numeric values for gender gap? Take rows 25 and 26, where gender gap is not missing but girl and boy are missing. Or row 28. I guess I'm just confused by this.
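The inconsistency raised here can be checked mechanically. Below is a minimal sketch that assumes the gap is meant to be computed row by row as girl minus boy, so that it should be missing whenever either input is missing. The field names (`girl`, `boy`, `gap`) and the rows are hypothetical stand-ins, not the poster's data; if the gap is actually computed at a higher level (for example, per household), this check would not apply as written.

```python
def gap_is_consistent(girl, boy, gap, tol=1e-9):
    """Return True if gap equals girl - boy, or is missing when an input is."""
    if girl is None or boy is None:
        # With a missing input, a row-level gap cannot be computed.
        return gap is None
    return gap is not None and abs((girl - boy) - gap) <= tol

# Hypothetical rows; None marks a missing value.
rows = [
    {"girl": 10, "boy": 12, "gap": -2},      # consistent
    {"girl": None, "boy": None, "gap": 3},   # gap present, inputs missing
    {"girl": 8, "boy": None, "gap": None},   # consistently missing
]

flagged = [i for i, r in enumerate(rows)
           if not gap_is_consistent(r["girl"], r["boy"], r["gap"])]
print(flagged)
```

Rows flagged this way are the ones worth explaining before any model is fit, since they indicate that the outcome was built from something other than the two columns shown.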

              And while we're at it, how are we constructing the panel? What is your unit of analysis and what is your time period? Anyway, I still think you're overcomplicating this. If this were my problem, I'd see that we have pre-post policy (I guess survey) data over time where some states are affected and others not. Then I'd say "Okay, well, not all individuals in those states are affected; there's a rule that determines whether they were affected or not", in this case age.

              So boom, we have ourselves a difference-in-discontinuities approach. I would simply compare single ladies under 25 to single ladies over 25, with age as the running variable and the discontinuity at 25 years. We then would have the pre-post angle of things, which is handled by the DID method. And if you like, you can control for owning land (you don't say why this matters, but it seems right for now). Then, assuming there's no confounding at the discontinuity, no manipulation into treatment, and parallel pre-intervention trends relative to single ladies over 25 in the pre-policy period, we have ourselves a locally identified ATT, I believe.

              But all this hinges on why land ownership matters and on other specific details.


              • #8
                I still think your control group is constructed in such a way as to invalidate the analysis, whether you do difference in discontinuity or anything else.

                A control group is supposed to consist of people who could have been affected by the policy but weren't. If only unmarried girls and women could have been affected by the policy, then married girls and women should not be included in the control group. You might be able to get away with violating this principle if education and marital status were actually independent of each other. But, in most societies, more educated women marry later, so the married population is going to differ in women's education levels from the unmarried.


                • #9
                  Sir Clyde Schechter, in the literature on inheritance law reforms, researchers have used such treated and control cohort groups.

                  The treatment group comprises those exposed to the reform based on their age at the time of the reform in reform states.
                  The control group is those who were too old at the time of the reform to be affected, and everyone in non-reform states.
                  And applying a Triple Difference-in-Differences (Triple DiD) approach comes from using these three variations:
                  • The first difference is between reform states and non-reform states.
                  • The second difference is between different age cohorts (as described below).
                    • Treated cohort:
                      • Most Treated (0-10 years old at the time of reform): These are daughters who were 0-10 years old when the reform was enacted in their state. They are expected to benefit the most because their educational decisions were influenced significantly by the reform.
                      • Partially Treated (11-15 years old): Daughters who were 11-15 years old when the reform occurred. They are somewhat treated, as part of their education would have occurred after the reform.
                      • Least Treated (16-20 years old): Daughters aged 16-20 at the time of the reform, who received less exposure to the reform's benefits as much of their education was likely completed before the reform.
                    • Control Group cohort:
                      • ≥21 years old: Daughters who were 21 years or older at the time of the reform. These women are considered the control group as their educational decisions were likely unaffected by the reform, having been completed before it.
                  • The third difference is based on land ownership.
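The three differences listed above can be written out as arithmetic on cell means. The following is a minimal sketch, not the poster's actual specification: it assumes a single young-versus-old cohort split (rather than three treated cohorts), assumes a young cohort can somehow be defined in non-reform states, and uses hypothetical mean gaps. The triple difference is the DID among landowning households minus the DID among non-landowning households.

```python
def diff_in_diff(means):
    """DID from a dict keyed by (reform_state, young_cohort) -> mean outcome."""
    reform_change = means[(1, 1)] - means[(1, 0)]      # young minus old, reform states
    non_reform_change = means[(0, 1)] - means[(0, 0)]  # young minus old, other states
    return reform_change - non_reform_change

def triple_diff(landed, landless):
    """Triple difference: DID for landowners minus DID for non-landowners."""
    return diff_in_diff(landed) - diff_in_diff(landless)

# Hypothetical mean gender gaps (girls' years minus boys' years) by cell.
landed = {(1, 1): -1.0, (1, 0): -2.5, (0, 1): -2.0, (0, 0): -2.5}
landless = {(1, 1): -2.0, (1, 0): -2.5, (0, 1): -2.0, (0, 0): -2.5}

print(diff_in_diff(landed), diff_in_diff(landless), triple_diff(landed, landless))
```

Writing it this way makes explicit that the estimate needs all eight cells, including a young cohort in non-reform states; if one of those cells cannot be defined, the contrast is not available in this form.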


                  • #10
                    ...in the literature on inheritance law reforms, researchers have used such treated and control cohort groups.
                    There are many things one can find in the published literature that are just wrong. This may be one of them.


                    The treatment group comprises those exposed to the reform based on their age at the time of the reform in reform states.
                    The control group is those who were too old at the time of the reform to be affected, and everyone in non-reform states.
                    And applying a Triple Difference-in-Differences (Triple DiD) approach comes from using these three variations:
                    • The first difference is between reform states and non-reform states.
                    • The second difference is between different age cohorts (as described below).
                      • Treated cohort:
                        • Most Treated (0-10 years old at the time of reform): These are daughters who were 0-10 years old when the reform was enacted in their state. They are expected to benefit the most because their educational decisions were influenced significantly by the reform.
                        • Partially Treated (11-15 years old): Daughters who were 11-15 years old when the reform occurred. They are somewhat treated, as part of their education would have occurred after the reform.
                        • Least Treated (16-20 years old): Daughters aged 16-20 at the time of the reform, who received less exposure to the reform's benefits as much of their education was likely completed before the reform.
                      • Control Group cohort:
                        • ≥21 years old: Daughters who were 21 years or older at the time of the reform. These women are considered the control group as their educational decisions were likely unaffected by the reform, having been completed before it.
                    • The third difference is based on land ownership.
                    I can't tell if this describes your own study or if you are referring to a published study from the literature on inheritance law reforms. It does not sound much like your own study as originally described in #3 in that what you say here makes no mention of marital status, and what you said in #3 does not refer to multiple age groups. Suffice it to say that educational decisions are typically associated with both age and marital status. Your proposed triple differences model incorporates age indirectly (through levels of treatment), but given that there isn't even any overlap in age between the treated and control groups, this is not an adequate way to deal with the confounding of age and the reform in your data design. On top of that, your design does not deal with the marital status difference at all. So I remain strongly skeptical of this approach.


                    • #11
                      Clyde Schechter Thank you for your reply. I will look further into this.
                      But I do have a question for you. I’m analyzing the impact of inheritance law reforms that were implemented in different states (5 states) at different times. The non-reform states, however, were never treated. Since my dataset is cross-sectional, I use young cohorts (e.g., 0-10, 11-15, and 16-20 years) and old cohorts (e.g., 21+ years) to define the pre- and post-reform periods for the DID analysis, with reform states as treated and non-reform states as control.

                      The challenge arises in the non-reform states: because there was no reform, we cannot define similar age cohorts as in the reform states based on a reform date. As a result, all age groups in the non-reform states are treated as a control. This means there’s no directly comparable young cohort (e.g., 0-10 years, 11-15 years, 16-20 years) in the non-reform states.

                      Given this setup, can this still be considered a valid DID structure, even without comparable young cohorts in the control (non-reform) states?


                      * I also have a third source of variation, land ownership, but for the time being, for my own understanding, I am not bringing that in here.


                      • #12
                        If all of the reform states implemented their reforms at the same time, then the simple solution to this problem is to create the age groups in the non-reform states based on age as of the implementation date in the reform states.

                        If, as I suspect is far more likely, different reform states implemented their reforms at different times, then the above solution is not possible. However, you can approximate it by pairing each non-reform state with one reform state and using the paired reform state's implementation date as the base date for calculating age groups in that non-reform state. There are a few different ways to pair the non-reform states with reform states. One is based on similarity: look at attributes of the state that are relevant to the outcome(s) you are studying and try to match on those. Another approach is a purely random matching if there are no relevant variables to match on.
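The random-pairing variant described above could be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the state labels (`R1`, `R2`, `N1`, ...) and reform years are hypothetical placeholders, the function names are invented for this example, and the cohort cut-offs follow the 0-10, 11-15, 16-20, and 21+ bands used earlier in the thread. Each non-reform state borrows the implementation year of its paired reform state as the base date for computing age cohorts.

```python
import random

# Hypothetical reform states and implementation years (placeholders only).
REFORM_YEARS = {"R1": 1986, "R2": 1994}

def pair_states(non_reform_states, reform_years, seed=0):
    """Randomly pair each non-reform state with one reform state."""
    rng = random.Random(seed)  # fixed seed so the pairing is reproducible
    reform_states = sorted(reform_years)
    return {state: rng.choice(reform_states) for state in non_reform_states}

def cohort_at_reform(birth_year, base_year):
    """Age band as of the base (reform) year."""
    age = base_year - birth_year
    if age < 0:
        return "born after base year"
    if age <= 10:
        return "0-10"
    if age <= 15:
        return "11-15"
    if age <= 20:
        return "16-20"
    return "21+"

pairs = pair_states(["N1", "N2", "N3"], REFORM_YEARS)
for state, partner in pairs.items():
    base_year = REFORM_YEARS[partner]
    # Cohort of a person born in 1975 in this non-reform state.
    print(state, partner, cohort_at_reform(1975, base_year))
```

For similarity-based pairing, the `rng.choice` step would be replaced by a nearest-match rule on whatever state attributes are judged relevant to the outcome; the cohort assignment step stays the same.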
