Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • ML command could not find feasible values

    Dear STATA experts:

    I’m learning STATA's maximize likelihood syntax by building my own ordered probit model using ml command.

    My outcome variable is an ordinal variable k, ranges from 1- 10, which measures a latent construct (such as attitude toward a particular social issue).

    My predictors are 10 dummy variables (denote as s), ranging from 1 – 10, suggesting this latent construct are surveyed for 10 years.

    Assume individuals' latent attitudes are mean yearly attitude + standard normal error, I simulate my data as follows:

    Code:
    clear
    set seed 24601
    
    *1. simulate yearly  mean attitude
    
    mat mu=J(10, 1, -99)
    mat mu0=J(10, 1, -99)
    
        forval i =1/10 {
          mat mu[`i', 1] = sin(`i')
          mat mu0[`i', 1]=mu[`i'-.84147098, 1]  //  1st yr set to be zero
        }
        
        mat mu0[1, 1] = 0
    
        svmat mu0, names(mu)
        rename mu1 mu
        gen s=_n if !missing(mu)
     
    *2.  simulate individuals'  latent attitude  
    
     expand  100000
      g y=.
      forvalues i = 1/10 {
      replace mu= mu0[`i', 1]
      replace y = mu + rnormal(0,1)
      
      }
      
      histogram y, by(s)
      
     
        
     *3. Simulate easurement: create 1 - 10 scale  survey responses by setting-up cut-offs
      sum y , detail
      mat tau3 = ( `r(p1)'  \ `r(p5)' \ `r(p10)'  \   `r(p25)'   \  `r(p50)'  \  `r(p75)' \  `r(p90)'  \  `r(p95)' \  `r(p99)' )
     
      g k =.
      replace k= 1 if y< tau3[1,1]                    
      replace k= 2 if inrange(y,tau3[1,1], tau3[2,1])
      replace k= 3 if inrange(y,tau3[2,1], tau3[3,1])
      replace k= 4 if inrange(y,tau3[3,1], tau3[4,1])
      replace k= 5 if inrange(y,tau3[4,1], tau3[5,1])
      replace k= 6 if inrange(y,tau3[5,1], tau3[6,1])
      replace k= 7 if inrange(y,tau3[6,1], tau3[7,1])
      replace k= 8 if inrange(y,tau3[7,1], tau3[8,1])
      replace k= 9 if inrange(y,tau3[8,1], tau3[9,1])
      replace k =10 if y>tau3[9,1]    
    
      histogram k , by(s)
    The STATA built-in command oprobit is fairly straightforward :

    Code:
     oprobit k i.s
    The model yields relatively comparable cut-off thresholds compared with the actual thresholds.


    However, I ran into problems when I was trying to write my own likelihood program.
    Code:
     capture program drop obit
    
      program define obit
       version 1.0
        args lnf mu tau1 tau2  tau3 tau4 tau5  tau6 tau7 tau8 tau9   // cut off point
      
        tempvar p1 p2 p3 p4 p5 p6 p7 p8 p9 p10
    
      quietly {
        
        gen double `p1'=0
        gen double `p2'=0
        gen double `p3'=0
        gen double `p4'=0
        gen double `p5'=0
        gen double `p6'=0
        gen double `p7'=0
        gen double `p8'=0
        gen double `p9'=0
        gen double `p10'=0
    
        replace `p1' = normal( `tau1' -`mu')                          
        replace `p2' = normal(`tau2' -`mu')  - normal( `tau1' -`mu')
        replace `p3' = normal(`tau3' -`mu')  - normal( `tau2' -`mu')  
        replace `p4' = normal(`tau4' -`mu')  - normal( `tau3' -`mu')  
        replace `p5' = normal(`tau5' -`mu')  - normal( `tau4' -`mu')  
        replace `p6' = normal(`tau6' -`mu')  - normal( `tau5' -`mu')  
        replace `p7' = normal(`tau7' -`mu')  - normal( `tau6' -`mu')  
        replace `p8' = normal(`tau8' -`mu')  - normal( `tau7' -`mu')  
        replace `p9' = normal(`tau9' -`mu')  - normal( `tau8' -`mu')
        replace `p10' = 1- normal( `tau9' -`mu')                      
        
        #delimit ;
        replace `lnf' = ($ML_y1 ==1) *ln(`p1') +
                        ($ML_y1 ==2) *ln(`p2') +
                        ($ML_y1 ==3) *ln(`p3') +
                        ($ML_y1 ==4) *ln(`p4') +
                        ($ML_y1 ==5) *ln(`p5') +
                        ($ML_y1 ==6) *ln(`p6') +
                        ($ML_y1 ==7) *ln(`p7') +
                        ($ML_y1 ==8) *ln(`p8') +
                        ($ML_y1 ==9) *ln(`p9') +
                        ($ML_y1 ==10) *ln(`p10')
        
        ;
       #delimit cr
        
      }
     end    
        
            
           ml model lf obit (k= i.s, noconstant)  ///
                         (tau1: )  (tau2: )  (tau3: ) ///
                         (tau4: )  (tau5: )  (tau6: ) ///
                         (tau7: )  (tau8: ) (tau9: )  
                
       ml trace on                      
       ml check    
       ml search    
       ml maximize , difficult
       mat c=r(table)
    The program passed the Test 1 and Test 2 for ml check. However, ml search result then shows error message " Could not find feasible values".

    I did some experiments, realizing that this problem only appears when I set k to have more than 7 categories. For example, when I set the k to have 5 categories ( aka having 4 thresholds). The model runs (albeit takes quite long ) and get the exact same results as oprobit .

    I spent hours on this but still could not figure out what went wrong. Can anyone help me find out what went wrong? Or is there a better way to programming ordered probit model with many thresholds using ml?
    Last edited by Donghui Wang; 10 Mar 2020, 14:42.

  • #2
    Hi Donghui
    I have experienced the problem you describe quite often myself. It usually can be solved, as you may suspect, by choosing "good" initial values. That is the challenge.
    The way you are setting your model the initial values for all the Tau's are zero. This probably makes the Loglikelihood unfeasible at the beginning. It doesnt help that you are also trying to estimate the model for a very large sample (1000000 observation).
    So, i would do something like, providing Tau's that are increasing and different from each other, and re-estimate the model with only a fraction of the sample.
    Once the "subsample" has been estimated, you can go ahead and reestimate the full sample model.
    HTH
    Fernando

    Comment


    • #3
      Thanks a lot !

      Comment


      • #4
        Just a quick update, here's the reivsed program following Fernado's suggestion. The estimated parameters are the same copmared with stata's built-in oprobit k i.s results


        Code:
            ml model lf obit (k: k= i.s, noconstant)  ///
                             (tau1:)  (tau2:)  (tau3:) ///
                             (tau4:) (tau5:) (tau6:) (tau7:) (tau8:) (tau9:)
        
        
                             
           #delimit ;  
           ml init 
              tau1:_cons = -1.5
              tau2:_cons = -1.3
              tau3:_cons = -1.0 
              tau4:_cons = -0.5
              tau5:_cons =  0.5
              tau6:_cons = 1.0
              tau7:_cons = 1.5
              tau8:_cons = 2.0
              tau9:_cons = 2.5
           ;
           #delimit cr
        
           ml maximize, difficult

        Comment

        Working...
        X