  • Can’t estimate AME of collinear variable used in non-collinear interaction terms

    I am analyzing a dataset on removal proceedings adjudicated by the U.S. immigration court system. One of my regressions seeks to analyze the AME of attorney representation by immigrant race and criminal status while controlling for attorney fixed-effects. This regression includes the following variables:
    1. removal_decision_narrow: Binary outcome variable that indicates whether immigrant was removed from the U.S.
    2. atty: Binary variable that indicates if immigrant has attorney representation.
    3. race_mode_n: Four level factor variable that records immigrant race.
    4. any_crim: Binary variable that indicates if immigrant faces criminal charges.
    5. eoirattorneyid_n: Factor variable that uniquely identifies attorneys and also includes a level for proceedings without attorney representation.
    Below, I have included the output for the regression along with a subsequent margins command:
        xtreg removal_decision_narrow i.atty##i.race_mode_n##i.any_crim, ///
            fe i(eoirattorneyid_n )
    note: 1.atty omitted because of collinearity.
    Fixed-effects (within) regression               Number of obs     =  1,893,200
    Group variable: eoirattorn~n                    Number of groups  =     68,845
    R-squared:                                      Obs per group:
         Within  = 0.0141                                         min =          1
         Between = 0.0601                                         avg =       27.5
         Overall = 0.1507                                         max =  1,144,993
                                                    F(14, 1824341)    =    1869.20
    corr(u_i, Xb) = 0.4209                          Prob > F          =     0.0000
        removal_decision_narrow | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                         1.atty |          0  (omitted)
                    race_mode_n |
                      Hispanic  |   .1267481   .0020098    63.06   0.000     .1228089    .1306873
                         Black  |  -.0661567   .0028687   -23.06   0.000    -.0717793    -.060534
                         Asian  |  -.1074136   .0031916   -33.66   0.000    -.1136689   -.1011582
               atty#race_mode_n |
                    1#Hispanic  |  -.0177988   .0027394    -6.50   0.000     -.023168   -.0124296
                       1#Black  |   .0928683   .0036202    25.65   0.000     .0857729    .0999638
                       1#Asian  |   .1451019   .0038774    37.42   0.000     .1375024    .1527014
                       any_crim |
               Criminal Charge  |   .0700234   .0029179    24.00   0.000     .0643044    .0757423
                  atty#any_crim |
             1#Criminal Charge  |  -.0174224   .0041853    -4.16   0.000    -.0256254   -.0092193
           race_mode_n#any_crim |
      Hispanic#Criminal Charge  |  -.0754032   .0029848   -25.26   0.000    -.0812532   -.0695531
         Black#Criminal Charge  |   .0862672    .003962    21.77   0.000     .0785018    .0940327
         Asian#Criminal Charge  |   .1095746   .0043593    25.14   0.000     .1010305    .1181188
      atty#race_mode_n#any_crim |
    1#Hispanic#Criminal Charge  |   .0042629   .0043916     0.97   0.332    -.0043444    .0128702
       1#Black#Criminal Charge  |  -.0792305   .0055923   -14.17   0.000    -.0901912   -.0682697
       1#Asian#Criminal Charge  |  -.0811804    .006133   -13.24   0.000    -.0932008   -.0691599
                          _cons |   .6876161   .0013552   507.41   0.000       .68496    .6902721
                        sigma_u |  .40028643
                        sigma_e |  .29176294
                            rho |  .65305102   (fraction of variance due to u_i)
    F test that all u_i=0: F(68844, 1824341) = 9.87              Prob > F = 0.0000
        margins, dydx(atty) over(any_crim race_mode_n) 
    Average marginal effects                             Number of obs = 1,893,200
    Model VCE: Conventional
    Expression: Linear prediction, predict()
    dy/dx wrt:  1.atty
    Over:       any_crim race_mode_n
                                 |            Delta-method
                                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    0.atty                       |  (base outcome)
    1.atty                       |
            any_crim#race_mode_n |
       No Criminal Charge#White  |          .  (not estimable)
    No Criminal Charge#Hispanic  |          .  (not estimable)
       No Criminal Charge#Black  |          .  (not estimable)
       No Criminal Charge#Asian  |          .  (not estimable)
          Criminal Charge#White  |          .  (not estimable)
       Criminal Charge#Hispanic  |          .  (not estimable)
          Criminal Charge#Black  |          .  (not estimable)
          Criminal Charge#Asian  |          .  (not estimable)
    Note: dy/dx for factor levels is the discrete change from the base level.
    I understand that it is impossible to estimate a coefficient for atty, since this variable is perfectly collinear with the set of fixed effects produced by eoirattorneyid_n. By contrast, I think it is possible to estimate coefficients for interaction terms that include atty so long as there is intra-attorney variation in the other variables. For example, you cannot know the value of the interaction term atty#race_mode_n by knowing the value of eoirattorneyid_n, since the same attorney can represent immigrants of different races.

    Given this understanding, why does the command margins, dydx(atty) over(any_crim race_mode_n) fail to estimate the AME of attorney representation by race and criminal status when the previous regression command successfully estimated coefficients for interaction terms of atty, any_crim, and race_mode_n? How can I calculate the AME of attorney representation by immigrant race and criminal status while controlling for attorney fixed-effects?

  #2
    I'm not even sure of the structure of the data. It seems it should be at the individual level. Do you see each individual over different time periods? Maybe you see each individual just once, and that's the reason you are using attorney fixed effects. From what I can tell, many attorneys in the data set

    If I have the setup correct, then the problem may lie in the fact that atty = 1 always for an observation in your data set. This would be the case if eoirattorneyid_n is missing for individuals without attorney representation.

    Naturally, to estimate the effect of atty, you need some individuals in the estimation sample who do not have representation. If eoirattorneyid_n is missing for those who don't have representation, you need to give it some unique value.


    #3
      The dataset is unique at the proceeding level. The same attorney can represent different immigrants at different proceedings at different points in time. I am trying to control for characteristics that are specific to particular attorneys and which affect proceeding outcomes. There is strong evidence that certain attorneys are more likely to represent immigrants belonging to a particular race or with certain legal backgrounds. These attorneys might be systematically different in their ability to secure positive legal outcomes for immigrants. I want to control for these individual level differences in attorney quality.

      If atty equals 0, the variable eoirattorneyid_n equals the level 'no atty'. If atty equals 1, the variable eoirattorneyid_n equals a level that corresponds to a particular attorney. There are no instances where atty equals 0 and eoirattorneyid_n is missing.


      #4
        Ah, yes. If there were a problem, some interactions would be dropped.

        Try adding "noestimcheck" to the end of the margins command.


        #5
          I ran the margins command with the 'noestimcheck' option and got the following results:
              margins, dydx(atty) over(any_crim race_mode_n) post noestimcheck
          Average marginal effects                             Number of obs = 1,882,209
          Model VCE: Conventional
          Expression: Linear prediction, predict()
          dy/dx wrt:  1.atty
          Over:       any_crim race_mode_n
                                       |            Delta-method
                                       |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
          0.atty                       |  (base outcome)
          1.atty                       |
                  any_crim#race_mode_n |
             No Criminal Charge#White  |          0  (omitted)
          No Criminal Charge#Hispanic  |   .0236401   .0026327     8.98   0.000       .01848    .0288002
             No Criminal Charge#Black  |   .0559942   .0034635    16.17   0.000     .0492058    .0627825
             No Criminal Charge#Asian  |   .0527248     .00372    14.17   0.000     .0454337     .060016
                Criminal Charge#White  |  -.0572188   .0039911   -14.34   0.000    -.0650411   -.0493964
             Criminal Charge#Hispanic  |  -.0323008   .0027941   -11.56   0.000    -.0377772   -.0268244
                Criminal Charge#Black  |  -.0270664   .0036987    -7.32   0.000    -.0343157   -.0198171
                Criminal Charge#Asian  |    .010452   .0041895     2.49   0.013     .0022408    .0186632
          Note: dy/dx for factor levels is the discrete change from the base level.
          Unfortunately, this command omits the AME of attorney representation for White immigrants without criminal charges.


          #6
            If those groups are exhaustive and mutually exclusive, then you lose one category -- for the same reason atty drops out. The other estimates are relative to the No Criminal Charge -- White group. So you can test differences with this base group. But the negative signs seem weird to me.
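
            A sketch of why exactly one cell is lost, in notation that is not from the thread: write \beta_a for the coefficient on 1.atty and \beta_{a \times r}, \beta_{a \times c}, \beta_{a \times r \times c} for the interaction coefficients of race level r and criminal-charge level c, with all base-level terms equal to zero. Assuming the linear model fitted in the first post, the attorney effect in each cell is a sum of these coefficients:

            ```latex
            \begin{align*}
            \mathrm{AME}(r, c) &= \beta_a + \beta_{a \times r} + \beta_{a \times c} + \beta_{a \times r \times c} \\
            \mathrm{AME}(\text{White}, \text{No Charge}) &= \beta_a \\
            \mathrm{AME}(r, c) - \mathrm{AME}(\text{White}, \text{No Charge})
              &= \beta_{a \times r} + \beta_{a \times c} + \beta_{a \times r \times c}
            \end{align*}
            ```

            Because atty equals 1 minus the indicator for the 'no atty' level of eoirattorneyid_n, \beta_a is absorbed by the fixed effects and is not identified. Every cell's AME contains \beta_a, so only the differences in the last line can be estimated. With noestimcheck, margins treats the omitted coefficient as zero, which is why the reported values read as differences from the base cell.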


            #7
              Is there any way to answer my question by using linear combinations? I tried running the code below, but the suest command gives an error that my equations are estimated with a nonstandard vce (delta). I can't combine my estimates to calculate differences.
                  xtreg relief_granted_narrow i.race_mode_n##i.any_crim ///
                      caseload_ind_14d_med charges_in_proc had_hearing i.lang_simp ///
                      i.completion_year muslim_maj ///
                      i.judge_document_number_n i.custody_n, fe i(eoirattorneyid_n)
                  est sto fig11b_removal
                  margins if eoirattorneyid_n == 84636, over(race_mode_n any_crim) post // If no attorney
                  est store no_atty_fig11b_removal
                  est rest fig11b_removal
                  margins if eoirattorneyid_n != 84636, over(race_mode_n any_crim) post // If there is attorney
                  est store atty_fig11b_removal
                  suest no_atty_fig11b_removal atty_fig11b_removal
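
              One possible way to get linear combinations without suest, offered only as a sketch: run lincom directly on the interaction coefficients of the regression from the first post, which includes the atty interactions. The level codes used here (race_mode_n 1 = White as base and 2 = Hispanic, any_crim 1 = Criminal Charge) are assumptions and may not match the data; replaying the model with coeflegend shows the actual coefficient names. Each result is the attorney effect for that cell relative to White immigrants without criminal charges, not an absolute AME.

              ```stata
              * Sketch only: the level codes below are assumed, not taken from the data.
              * Assumed: race_mode_n 1 = White (base), 2 = Hispanic; any_crim coded 0/1.
              xtreg removal_decision_narrow i.atty##i.race_mode_n##i.any_crim, ///
                  fe i(eoirattorneyid_n)

              * Replay with coefficient names to confirm the actual level codes
              xtreg, coeflegend

              * Attorney effect for Hispanic, no charge, relative to White, no charge
              lincom 1.atty#2.race_mode_n

              * Attorney effect for White, criminal charge, relative to the same base
              lincom 1.atty#1.any_crim

              * Attorney effect for Hispanic, criminal charge, relative to the same base
              lincom 1.atty#2.race_mode_n + 1.atty#1.any_crim ///
                  + 1.atty#2.race_mode_n#1.any_crim
              ```

              Working from the regression coefficients keeps everything inside a single set of estimates, so there is no need to combine two posted margins results.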

