How to solve inconsistent estimation sample levels for categorical variables?

Jose Williams

Join Date: Dec 2021
Posts: 49

07 Mar 2022, 04:08

Hi,

I am running a logit regression and then computing margins from it with the noestimcheck option. However, I keep encountering the error: inconsistent estimation sample levels 3 and 1 of factor new_age

new_age is a categorical variable with 4 levels, one for each age category. When I do not introduce separate dummies for this variable, I get the same kind of error for the state variable: inconsistent estimation sample levels 1 and 2 of factor new_state.

The code that I am using:

Code:

logit burden ib(7).ind_new ib(3).new_age i.new_state , vce (robust)
margins, dydx(*) noestimcheck

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte burden long ind_new byte new_age long new_state
0 1 2 16
0 1 2 16
0 8 4 16
0 4 1 16
0 6 1 16
0 8 4 19
0 7 4  6
0 3 3 17
0 1 3 15
0 1 4  3
0 4 4 10
0 4 4 22
0 6 2  1
0 3 4 23
0 1 4 15
0 7 4 16
0 2 3  3
0 8 2 13
0 3 3 15
0 8 3 13
0 3 3 13
0 3 4 19
0 2 3 23
0 2 2  5
0 2 2  5
0 7 4  8
0 6 1  8
0 8 2 22
0 7 4  5
0 8 3 21
0 8 4 22
0 1 4 19
0 8 1 19
0 1 3 20
0 8 4 10
0 8 2 10
0 . 4 23
0 4 4 21
0 1 3 21
0 4 1 23
0 4 1 23
0 2 2 25
0 2 2 25
0 4 4  9
0 8 2 23
0 2 1 23
0 2 1 23
0 1 3 19
0 8 4 16
0 8 4 23
1 3 1  6
0 8 4  9
0 1 4 20
0 1 1 20
0 1 4 21
0 8 1 21
0 6 2  3
0 3 2 13
0 3 2 13
0 8 2  3
0 1 3  5
0 3 3 23
0 3 4 23
0 8 3 23
0 4 2 13
0 8 4 20
0 6 1  7
0 4 3 22
0 4 4 23
0 2 1 23
0 2 1 23
0 1 4  3
0 7 4 15
0 2 3 13
0 2 2  3
0 7 4 17
0 7 3 17
0 2 4 25
0 4 2  2
0 6 3  7
0 8 3 23
0 3 4 20
0 8 2 23
0 2 2  5
0 8 4 25
0 2 3 15
0 6 3 22
0 1 3 22
0 6 4 24
0 7 4  4
0 8 2 16
0 6 4 23
0 6 1 23
0 6 1 23
0 8 4  1
0 8 4 23
0 8 2 23
0 7 3  3
0 2 3 22
0 3 2 22
end
label values ind_new ind_new
label def ind_new 1 "Agriculture", modify
label def ind_new 2 "Construction", modify
label def ind_new 3 "Education and Health Care", modify
label def ind_new 4 "Manufacturing", modify
label def ind_new 6 "Personal Non-Professional Services", modify
label def ind_new 7 "Services - Modern", modify
label def ind_new 8 "Trade, Hotels, Restaurants, Communication", modify
label values new_state new_state
label def new_state 1 "Andhra Pradesh", modify
label def new_state 2 "Assam", modify
label def new_state 3 "Bihar", modify
label def new_state 4 "Chandigarh", modify
label def new_state 5 "Chhattisgarh", modify
label def new_state 6 "Delhi", modify
label def new_state 7 "Goa", modify
label def new_state 8 "Gujarat", modify
label def new_state 9 "Haryana", modify
label def new_state 10 "Himachal Pradesh", modify
label def new_state 13 "Karnataka", modify
label def new_state 15 "Madhya Pradesh", modify
label def new_state 16 "Maharashtra", modify
label def new_state 17 "Odisha", modify
label def new_state 19 "Punjab", modify
label def new_state 20 "Rajasthan", modify
label def new_state 21 "Tamil Nadu", modify
label def new_state 22 "Telangana", modify
label def new_state 23 "Uttar Pradesh", modify
label def new_state 24 "Uttarakhand", modify
label def new_state 25 "West Bengal", modify


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17606
#2

07 Mar 2022, 05:22

Jose:
I was not able to replicate the error you're complaining about.
With a bit of guesswork, it may be a matter of leading and/or trailing blanks.
That said (but what follows is probably related to your data excerpt and does not affect the original dataset):

Code:

. logit burden ib(7).ind_new ib(3).new_age i.new_state , vce (robust)

note: 1.ind_new != 0 predicts failure perfectly;
      1.ind_new omitted and 15 obs not used.

note: 2.ind_new != 0 predicts failure perfectly;
      2.ind_new omitted and 16 obs not used.

note: 3.ind_new != 1 predicts failure perfectly;
      3.ind_new omitted and 56 obs not used.

outcome = new_age > 0 predicts data perfectly
r(2000);

Therefore, no coefficient is returned.
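
One possible way to see where the perfect prediction comes from is to cross-tabulate the outcome against each factor variable. This is only a sketch, assuming the variable names from the -dataex- excerpt above; in that excerpt burden equals 1 in a single observation, so most factor levels show no variation in the outcome:

Code:

* distribution of the outcome in the excerpt
tabulate burden

* outcome by each factor variable; levels with an empty burden==1 column predict failure perfectly
tabulate ind_new burden, missing
tabulate new_age burden, missing
tabulate new_state burden, missing

* observations with a missing value in any model variable are dropped from the estimation sample
count if missing(burden, ind_new, new_age, new_state)

Running the same tabulations on the full dataset may show whether the levels named in the margins error are empty or perfectly predicted there as well.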

Kind regards,
Carlo
(StataNow 18.5)