  • How to solve inconsistent estimation sample levels for categorical variables?

    Hi,

    I am running a logit regression and then computing marginal effects with margins, using the noestimcheck option. However, I keep encountering the error: inconsistent estimation sample levels 3 and 1 of factor new_age

    new_age is a categorical variable with 4 levels, one per age category. When I do not introduce separate dummies for this variable, I get the same error for the state variable instead: inconsistent estimation sample levels 1 and 2 of factor new_state.

    The code I am using:
    Code:
    logit burden ib(7).ind_new ib(3).new_age i.new_state , vce (robust)
    margins, dydx(*) noestimcheck

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte burden long ind_new byte new_age long new_state
    0 1 2 16
    0 1 2 16
    0 8 4 16
    0 4 1 16
    0 6 1 16
    0 8 4 19
    0 7 4  6
    0 3 3 17
    0 1 3 15
    0 1 4  3
    0 4 4 10
    0 4 4 22
    0 6 2  1
    0 3 4 23
    0 1 4 15
    0 7 4 16
    0 2 3  3
    0 8 2 13
    0 3 3 15
    0 8 3 13
    0 3 3 13
    0 3 4 19
    0 2 3 23
    0 2 2  5
    0 2 2  5
    0 7 4  8
    0 6 1  8
    0 8 2 22
    0 7 4  5
    0 8 3 21
    0 8 4 22
    0 1 4 19
    0 8 1 19
    0 1 3 20
    0 8 4 10
    0 8 2 10
    0 . 4 23
    0 4 4 21
    0 1 3 21
    0 4 1 23
    0 4 1 23
    0 2 2 25
    0 2 2 25
    0 4 4  9
    0 8 2 23
    0 2 1 23
    0 2 1 23
    0 1 3 19
    0 8 4 16
    0 8 4 23
    1 3 1  6
    0 8 4  9
    0 1 4 20
    0 1 1 20
    0 1 4 21
    0 8 1 21
    0 6 2  3
    0 3 2 13
    0 3 2 13
    0 8 2  3
    0 1 3  5
    0 3 3 23
    0 3 4 23
    0 8 3 23
    0 4 2 13
    0 8 4 20
    0 6 1  7
    0 4 3 22
    0 4 4 23
    0 2 1 23
    0 2 1 23
    0 1 4  3
    0 7 4 15
    0 2 3 13
    0 2 2  3
    0 7 4 17
    0 7 3 17
    0 2 4 25
    0 4 2  2
    0 6 3  7
    0 8 3 23
    0 3 4 20
    0 8 2 23
    0 2 2  5
    0 8 4 25
    0 2 3 15
    0 6 3 22
    0 1 3 22
    0 6 4 24
    0 7 4  4
    0 8 2 16
    0 6 4 23
    0 6 1 23
    0 6 1 23
    0 8 4  1
    0 8 4 23
    0 8 2 23
    0 7 3  3
    0 2 3 22
    0 3 2 22
    end
    label values ind_new ind_new
    label def ind_new 1 "Agriculture", modify
    label def ind_new 2 "Construction", modify
    label def ind_new 3 "Education and Health Care", modify
    label def ind_new 4 "Manufacturing", modify
    label def ind_new 6 "Personal Non-Professional Services", modify
    label def ind_new 7 "Services - Modern", modify
    label def ind_new 8 "Trade, Hotels, Restaurants, Communication", modify
    label values new_state new_state
    label def new_state 1 "Andhra Pradesh", modify
    label def new_state 2 "Assam", modify
    label def new_state 3 "Bihar", modify
    label def new_state 4 "Chandigarh", modify
    label def new_state 5 "Chhattisgarh", modify
    label def new_state 6 "Delhi", modify
    label def new_state 7 "Goa", modify
    label def new_state 8 "Gujarat", modify
    label def new_state 9 "Haryana", modify
    label def new_state 10 "Himachal Pradesh", modify
    label def new_state 13 "Karnataka", modify
    label def new_state 15 "Madhya Pradesh", modify
    label def new_state 16 "Maharashtra", modify
    label def new_state 17 "Odisha", modify
    label def new_state 19 "Punjab", modify
    label def new_state 20 "Rajasthan", modify
    label def new_state 21 "Tamil Nadu", modify
    label def new_state 22 "Telangana", modify
    label def new_state 23 "Uttar Pradesh", modify
    label def new_state 24 "Uttarakhand", modify
    label def new_state 25 "West Bengal", modify

  • #2
    Jose:
    I was not able to replicate the error you're complaining about.
    With a bit of guesswork, it may be a matter of leading and/or trailing blanks.
    That said (but what follows is probably related to your data excerpt and does not affect the original dataset):
    Code:
    . logit burden ib(7).ind_new ib(3).new_age i.new_state , vce (robust)
    
    note: 1.ind_new != 0 predicts failure perfectly;
          1.ind_new omitted and 15 obs not used.
    
    note: 2.ind_new != 0 predicts failure perfectly;
          2.ind_new omitted and 16 obs not used.
    
    note: 3.ind_new != 1 predicts failure perfectly;
          3.ind_new omitted and 56 obs not used.
    
    outcome = new_age > 0 predicts data perfectly
    r(2000);
    Therefore, no coefficients are returned.
    Kind regards,
    Carlo
    (StataNow 18.5)
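
    A minimal sketch of how the perfect-prediction notes above could be checked, assuming the variable names from the posted excerpt. In that excerpt burden equals 1 in only one of the 100 observations, so most factor levels show no variation in the outcome. The last three lines assume the logit converges, as it presumably does on the full dataset; they are a suggestion for comparing factor levels within the estimation sample, not a confirmed fix for the margins error.
    Code:
    * Cross-tabulate the outcome against each factor to spot levels
    * in which burden never varies (perfect prediction)
    tabulate ind_new burden, missing
    tabulate new_age burden, missing
    tabulate new_state burden, missing

    * After a successful fit, list the levels actually used in estimation
    logit burden ib(7).ind_new ib(3).new_age i.new_state, vce(robust)
    tabulate new_age if e(sample)
    tabulate new_state if e(sample)

    Levels that appear in the full data but not in the e(sample) tabulations were dropped during estimation.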
