  • Help with Fuzzy RDD: Perfect Prediction Issue in First-Stage Regression

    Hi all,

    I'm running a fuzzy regression discontinuity design (RDD) in Stata to examine the effect of exposure to a non-constant fee on household registration in a village farm input-sharing group. My running variable is the age of a child in the family, with a cutoff at 16. The treatment indicator is whether a household is exposed to the non-fixed fee (1 if the child is aged 16 or above, 0 otherwise). My outcome variable is registration in the farming group. The registration rule is that families whose children are all below 16 pay a constant (fixed) membership fee each quarter, while any family with at least one child aged 16 or above pays about 10% more than the constant fee for every child above this age. In my dataset, I don't observe who pays the constant fee and who pays the non-constant fee, so I construct the exposure indicator from child age. Because a family may still pay the constant fee despite having a child just above 16 (due to misreporting), and because not all families exposed to the constant fee are compliers, I treat my setting as a fuzzy RDD.
    registration - Outcome
    age=child_age-16 - running variable
    Z (=1 if child age>=16, 0 otherwise) - exposure indicator
    I run
    Code:
     rdrobust registration age,  fuzzy(Z) covs(X1 X2 X3)  masspoints(adjust) all weights(wt) vce(cluster vilID) bwselect(mserd) c(0) p(1) kernel(triangular)
    In the first-stage regression, I'm encountering a perfect prediction issue, where the treatment indicator is perfectly predicted by the running variable. The coefficient estimates are all 1, and the standard errors are extremely small. Here are the first-stage results:
    Code:
     First-stage estimates. Outcome: Z. Running variable: age.
    
    Method          Coef.          Std. Err.     z                P>z         [95% Conf. Interval]
    
    Conventional    1              7.6e-17       1.3e+16          0.000       1            1
    Bias-corrected  1              7.6e-17       1.3e+16          0.000       1            1
    Robust          1              1.8e-16       5.4e+15          0.000       1            1
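    A minimal sketch of why the first stage comes out as exactly 1, assuming Z is built purely from the side of the cutoff as described above. The data are simulated and the helper variable `above` is hypothetical; only `age` and `Z` follow the names in the post.

    ```stata
    * Simulated stand-in data (hypothetical values, not the survey data)
    clear
    set seed 12345
    set obs 500
    generate child_age = floor(10 + 12*runiform())   // integer ages 10 to 21
    generate age = child_age - 16                    // centered running variable
    generate byte Z = (child_age >= 16)              // exposure built from age alone

    * Indicator for being at or above the cutoff (hypothetical helper)
    generate byte above = (age >= 0) if !missing(age)

    * Every observation should fall on the diagonal of this table
    tabulate above Z, missing

    * Regressing Z on the cutoff indicator reproduces a coefficient of 1
    regress Z above

    * Number of observations where Z disagrees with the side of the cutoff
    count if above != Z & !missing(above, Z)
    ```

    If the count is also zero in the real data, there is no non-compliance for fuzzy() to exploit, and the first-stage jump equals 1 by construction.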
    Is this a reliability issue? How should I address this perfect prediction problem in my fuzzy RDD? Are there alternative approaches or specifications I should consider?

    Thank you for your help!
    Last edited by Tariku Getaneh; 26 May 2024, 10:14.

  • #2
    I'm not exactly clear about what you are up to, but if this is administrative data, you'd be pretty easy to catch if you gave the true age and then tried to avoid the fee by registering inappropriately.

    I think you're looking for liars, and that won't be easy in administrative data (except for a few really dumb people).

    • #3
      Thanks George. Let me clarify. This is household-level survey data. In it, I have family members' ages and their registration status (1 if registered, 0 if not). I know that if a family has a child aged 16 or above, its membership fee is higher than that of families without one. From the data, I constructed a dummy variable (D) that equals 1 for families with at least one child aged 16 or above and 0 for families whose children are all below 16. I want to compare registration outcomes between these groups: reg registration D Xi, clu(village). But then I thought I could use the oldest child's age, with a cutoff at 16, to do a fuzzy RDD, and that produced the first-stage results above. Is this not a plausible way to go? What about an IV: ivregress 2sls registration Xi (D=oldest_child_age), vce(cluster village)
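      Because D is constructed from the oldest child's age, the comparison at the cutoff can be sketched as a local linear regression with separate slopes on each side. This is only an illustration on simulated data: the ages, the outcome model, and the bandwidth `h` are made up, and a fixed-bandwidth regress is swapped in here for rdrobust's data-driven bandwidth and triangular kernel.

      ```stata
      * Simulated stand-in data; all values and the bandwidth h are hypothetical
      clear
      set seed 2024
      set obs 2000
      generate vilID = ceil(_n/40)                             // 50 villages
      generate oldest_child_age = floor(8 + 16*runiform())     // integer ages 8 to 23
      generate age = oldest_child_age - 16                     // centered running variable
      generate byte D = (age >= 0)                             // exposure, fixed by age
      generate byte registration = (runiform() < 0.6 - 0.1*D)  // made-up outcome

      * Local linear fit within the bandwidth, separate slopes on each side,
      * standard errors clustered by village
      local h = 5
      regress registration i.D##c.age if abs(age) <= `h', vce(cluster vilID)
      ```

      The coefficient on 1.D is the estimated jump in registration at the cutoff, read as the effect of exposure to the fee rule rather than of the fee actually paid.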
