Can’t estimate AME of colinear variable used in non-colinear interaction terms

Weston Ley

Join Date: Sep 2022
Posts: 19

11 Dec 2024, 11:15

I am analyzing a dataset on removal proceedings adjudicated by the U.S. immigration court system. One of my regressions seeks to analyze the AME of attorney representation by immigrant race and criminal status while controlling for attorney fixed-effects. This regression includes the following variables:

1. removal_decision_narrow: Binary outcome variable that indicates whether immigrant was removed from the U.S.
2. atty: Binary variable that indicates if immigrant has attorney representation.
3. race_mode_n: Four level factor variable that records immigrant race.
4. any_crim: Binary variable that indicates if immigrant faces criminal charges.
5. eoirattorneyid_n: Factor variable that uniquely identifies attorneys and also includes a level for proceedings without attorney representation.

Below, I have included the output for the regression along with a subsequent margins command:

Code:

    xtreg removal_decision_narrow i.atty##i.race_mode_n##i.any_crim, ///
        fe i(eoirattorneyid_n )

note: 1.atty omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =  1,893,200
Group variable: eoirattorn~n                    Number of groups  =     68,845

R-squared:                                      Obs per group:
     Within  = 0.0141                                         min =          1
     Between = 0.0601                                         avg =       27.5
     Overall = 0.1507                                         max =  1,144,993

                                                F(14, 1824341)    =    1869.20
corr(u_i, Xb) = 0.4209                          Prob > F          =     0.0000

---------------------------------------------------------------------------------------------
    removal_decision_narrow | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
----------------------------+----------------------------------------------------------------
                     1.atty |          0  (omitted)
                            |
                race_mode_n |
                  Hispanic  |   .1267481   .0020098    63.06   0.000     .1228089    .1306873
                     Black  |  -.0661567   .0028687   -23.06   0.000    -.0717793    -.060534
                     Asian  |  -.1074136   .0031916   -33.66   0.000    -.1136689   -.1011582
                            |
           atty#race_mode_n |
                1#Hispanic  |  -.0177988   .0027394    -6.50   0.000     -.023168   -.0124296
                   1#Black  |   .0928683   .0036202    25.65   0.000     .0857729    .0999638
                   1#Asian  |   .1451019   .0038774    37.42   0.000     .1375024    .1527014
                            |
                   any_crim |
           Criminal Charge  |   .0700234   .0029179    24.00   0.000     .0643044    .0757423
                            |
              atty#any_crim |
         1#Criminal Charge  |  -.0174224   .0041853    -4.16   0.000    -.0256254   -.0092193
                            |
       race_mode_n#any_crim |
  Hispanic#Criminal Charge  |  -.0754032   .0029848   -25.26   0.000    -.0812532   -.0695531
     Black#Criminal Charge  |   .0862672    .003962    21.77   0.000     .0785018    .0940327
     Asian#Criminal Charge  |   .1095746   .0043593    25.14   0.000     .1010305    .1181188
                            |
  atty#race_mode_n#any_crim |
1#Hispanic#Criminal Charge  |   .0042629   .0043916     0.97   0.332    -.0043444    .0128702
   1#Black#Criminal Charge  |  -.0792305   .0055923   -14.17   0.000    -.0901912   -.0682697
   1#Asian#Criminal Charge  |  -.0811804    .006133   -13.24   0.000    -.0932008   -.0691599
                            |
                      _cons |   .6876161   .0013552   507.41   0.000       .68496    .6902721
----------------------------+----------------------------------------------------------------
                    sigma_u |  .40028643
                    sigma_e |  .29176294
                        rho |  .65305102   (fraction of variance due to u_i)
---------------------------------------------------------------------------------------------
F test that all u_i=0: F(68844, 1824341) = 9.87              Prob > F = 0.0000


    margins, dydx(atty) over(any_crim race_mode_n) 

Average marginal effects                             Number of obs = 1,893,200
Model VCE: Conventional

Expression: Linear prediction, predict()
dy/dx wrt:  1.atty
Over:       any_crim race_mode_n

----------------------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-----------------------------+----------------------------------------------------------------
0.atty                       |  (base outcome)
-----------------------------+----------------------------------------------------------------
1.atty                       |
        any_crim#race_mode_n |
   No Criminal Charge#White  |          .  (not estimable)
No Criminal Charge#Hispanic  |          .  (not estimable)
   No Criminal Charge#Black  |          .  (not estimable)
   No Criminal Charge#Asian  |          .  (not estimable)
      Criminal Charge#White  |          .  (not estimable)
   Criminal Charge#Hispanic  |          .  (not estimable)
      Criminal Charge#Black  |          .  (not estimable)
      Criminal Charge#Asian  |          .  (not estimable)
----------------------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

I understand that it is impossible to estimate a coefficient for atty since this variable is perfectly colinear with the set of fixed effects produced by eoirattorneyid_n. By contrast, I think it is possible to estimate coefficients for interaction terms that include atty so long as there is intra-attorney variation in the other variables. For example, you cannot know the value of the interaction term atty#race_mode_n by knowing the value of eoirattorneyid_n since the same attorney can represent immigrants of different races.

Given this understanding, why does the command margins, dydx(atty) over(any_crim race_mode_n) fail to estimate the AME of attorney representation by race and criminal status when the previous regression command successfully estimated coefficients for interaction terms of atty, any_crim, and race_mode_n? How can I calculate the AME of attorney representation by immigrant race and criminal status while controlling for attorney fixed-effects?


Jeff Wooldridge

Join Date: Apr 2014

Posts: 2078
#2

11 Dec 2024, 13:01

I'm not even sure of the structure of the data. It seems it should be at the individual level. Do you see each individual over different time periods? Maybe you see each individual just once, and that's the reason you are using attorney fixed effects. From what I can tell, many attorneys in the data set

If I have the setup correct, then the problem may lie in the fact that atty = 1 for every observation in your data set. This would be the case if eoirattorneyid_n is missing for individuals without attorney representation.

Naturally, to estimate the effect of atty, you need some individuals in the estimation sample who do not have representation. If eoirattorneyid_n is missing for those who don't have representation, you need to give it some unique value.
Weston Ley

Join Date: Sep 2022

Posts: 19
#3

11 Dec 2024, 13:42

Originally posted by Jeff Wooldridge:

I'm not even sure of the structure of the data. It seems it should be at the individual level. Do you see each individual over different time periods? Maybe you see each individual just once, and that's the reason you are using attorney fixed effects. From what I can tell, many attorneys in the data set

If I have the setup correct, then the problem may lie in the fact that atty = 1 for every observation in your data set. This would be the case if eoirattorneyid_n is missing for individuals without attorney representation.

Naturally, to estimate the effect of atty, you need some individuals in the estimation sample who do not have representation. If eoirattorneyid_n is missing for those who don't have representation, you need to give it some unique value.

The dataset is unique at the proceeding level. The same attorney can represent different immigrants at different proceedings at different points in time. I am trying to control for characteristics that are specific to particular attorneys and which affect proceeding outcomes. There is strong evidence that certain attorneys are more likely to represent immigrants belonging to a particular race or with certain legal backgrounds. These attorneys might be systematically different in their ability to secure positive legal outcomes for immigrants. I want to control for these individual level differences in attorney quality.

If atty equals 0, the variable eoirattorneyid_n equals the level 'no atty'. If atty equals 1, the variable eoirattorneyid_n equals a level that corresponds to a particular attorney. There are no instances where atty equals 0 and eoirattorneyid_n is missing.
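
The coding described above can be verified with a few assertions. This is a minimal sketch, assuming that 84636 is the numeric code of the 'no atty' level (the value used in the margins code later in this thread); substitute the actual code if it differs.

```stata
* Sketch only: 84636 is assumed to be the 'no atty' level of eoirattorneyid_n
assert !missing(eoirattorneyid_n)                 // no proceeding lacks an attorney ID
assert atty == 0 if eoirattorneyid_n == 84636     // the 'no atty' level is always unrepresented
assert atty == 1 if eoirattorneyid_n != 84636     // every other level is a represented proceeding
tabulate atty, missing                            // both values of atty should appear
```

If all three assertions pass, atty is an exact function of eoirattorneyid_n, which is why the main effect 1.atty is absorbed by the fixed effects.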
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2078
#4

11 Dec 2024, 14:00

Ah, yes. If there were a problem, some interactions would be dropped.

Try adding "noestimcheck" to the end of the margins command.

Weston Ley

Join Date: Sep 2022
Posts: 19

11 Dec 2024, 15:41

Originally posted by Jeff Wooldridge:

Ah, yes. If there were a problem, some interactions would be dropped.

Try adding "noestimcheck" to the end of the margins command.

I ran the margins command with the 'noestimcheck' option and got the following results:

Code:

    margins, dydx(atty) over(any_crim race_mode_n) post noestimcheck

Average marginal effects                             Number of obs = 1,882,209
Model VCE: Conventional

Expression: Linear prediction, predict()
dy/dx wrt:  1.atty
Over:       any_crim race_mode_n

----------------------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-----------------------------+----------------------------------------------------------------
0.atty                       |  (base outcome)
-----------------------------+----------------------------------------------------------------
1.atty                       |
        any_crim#race_mode_n |
   No Criminal Charge#White  |          0  (omitted)
No Criminal Charge#Hispanic  |   .0236401   .0026327     8.98   0.000       .01848    .0288002
   No Criminal Charge#Black  |   .0559942   .0034635    16.17   0.000     .0492058    .0627825
   No Criminal Charge#Asian  |   .0527248     .00372    14.17   0.000     .0454337     .060016
      Criminal Charge#White  |  -.0572188   .0039911   -14.34   0.000    -.0650411   -.0493964
   Criminal Charge#Hispanic  |  -.0323008   .0027941   -11.56   0.000    -.0377772   -.0268244
      Criminal Charge#Black  |  -.0270664   .0036987    -7.32   0.000    -.0343157   -.0198171
      Criminal Charge#Asian  |    .010452   .0041895     2.49   0.013     .0022408    .0186632
----------------------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

Unfortunately, this command omits the AME of attorney representation for White immigrants without criminal charges.


Jeff Wooldridge

Join Date: Apr 2014

Posts: 2078
#6

11 Dec 2024, 16:27

If those groups are exhaustive and mutually exclusive, then you lose one category -- for the same reason atty drops out. The other estimates are relative to the No Criminal Charge -- White group. So you can test differences with this base group. But the negative signs seem weird to me.
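
The argument can be sketched algebraically, assuming the model is exactly the three-way interaction specification from the first post. The symbols below are notation introduced here for illustration, not Stata output: $A$ is atty, $R_r$ are the race dummies with White ($W$) as the base, $C$ is any_crim, and $\alpha_a$ is the attorney fixed effect.

```latex
\begin{aligned}
y_{ia} &= \alpha_a + \beta_A A_{ia} + \sum_{r} \gamma_r R_{r,ia} + \delta C_{ia}
        + \sum_{r} \theta_r A_{ia} R_{r,ia} + \lambda A_{ia} C_{ia} \\
       &\quad + \sum_{r} \kappa_r R_{r,ia} C_{ia}
        + \sum_{r} \pi_r A_{ia} R_{r,ia} C_{ia} + \varepsilon_{ia}, \\[4pt]
\Delta(r,c) &= \beta_A + \theta_r + \lambda c + \pi_r c,
        \qquad \theta_W = \pi_W = 0, \\
\Delta(r,c) - \Delta(W,0) &= \theta_r + \lambda c + \pi_r c .
\end{aligned}
```

Because $A$ is constant within each level of eoirattorneyid_n, $\beta_A$ cannot be separated from $\alpha_a$. Every $\Delta(r,c)$ contains $\beta_A$, so margins flags all eight cells as not estimable. With noestimcheck, the omitted coefficient is treated as zero, and each reported value is the difference from $\Delta(W,0)$. As an arithmetic illustration using the coefficients in the first regression table, the Hispanic, Criminal Charge cell would be $-.0177988 - .0174224 + .0042629 = -.0309583$. The noestimcheck table above reports 1,882,209 observations rather than 1,893,200, so it appears to come from a different sample and its values need not match this figure.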

Weston Ley

Join Date: Sep 2022
Posts: 19

12 Dec 2024, 07:47

Is there any way to answer my question by using linear combinations? I tried running the code below, but the suest command gives an error saying that my equations were estimated with a nonstandard vce (delta). As a result, I can't combine my estimates to calculate differences.

Code:

    xtreg relief_granted_narrow i.race_mode_n##i.any_crim ///
        caseload_ind_14d_med charges_in_proc had_hearing i.lang_simp ///
        i.completion_year muslim_maj ///
        i.judge_document_number_n i.custody_n, fe i(eoirattorneyid_n)
    estimates store fig11b_removal

    margins if eoirattorneyid_n == 84636, over(race_mode_n any_crim) post // If no attorney
    estimates store no_atty_fig11b_removal

    estimates restore fig11b_removal
    margins if eoirattorneyid_n != 84636, over(race_mode_n any_crim) post // If there is attorney
    estimates store atty_fig11b_removal

    suest no_atty_fig11b_removal atty_fig11b_removal
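
One possible way around the suest error is to request every cell in a single margins call, so that all estimates and their covariances end up in one posted result that lincom can use. The following is a sketch only. It assumes that fig11b_removal was stored as in the code above, that atty is in memory, and that atty == 0 is equivalent to eoirattorneyid_n == 84636. The level codes in the lincom line (race_mode_n == 1, any_crim == 0) are hypothetical.

```stata
* Sketch only: see the assumptions stated in the paragraph above
estimates restore fig11b_removal

* One margins call puts all cells and their covariances in one e(b)/e(V)
margins, over(atty race_mode_n any_crim) post

* Inspect the column names before writing any lincom expression
matrix list e(b)

* Hypothetical cell (race_mode_n == 1, any_crim == 0); replace the names
* with the ones that matrix list actually shows
lincom _b[1.atty#1.race_mode_n#0.any_crim] - _b[0.atty#1.race_mode_n#0.any_crim]
```

This addresses only the mechanics of combining the two sets of margins. The default prediction after xtreg, fe is the linear prediction without the fixed effects, and atty is not a regressor in this model, so whether the resulting difference is the quantity of interest is a separate question.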
