Dear Stata experts:
I'm learning Stata's maximum likelihood syntax by building my own ordered probit model with the ml command.
My outcome variable is an ordinal variable k, ranging from 1 to 10, which measures a latent construct (such as attitude toward a particular social issue).
My predictors are 10 year dummies (a categorical variable s ranging from 1 to 10), reflecting that the latent construct was surveyed over 10 years.
Assuming individuals' latent attitudes equal the yearly mean attitude plus a standard normal error, I simulate my data as follows:
Code:
clear
set seed 24601

* 1. Simulate yearly mean attitude
mat mu  = J(10, 1, -99)
mat mu0 = J(10, 1, -99)
forval i = 1/10 {
    mat mu[`i', 1]  = sin(`i')
    mat mu0[`i', 1] = mu[`i', 1] - .84147098   // subtract sin(1) so the 1st year is zero
}
mat mu0[1, 1] = 0
svmat mu0, names(mu)
rename mu1 mu
gen s = _n if !missing(mu)

* 2. Simulate individuals' latent attitudes
expand 100000
g y = .
forvalues i = 1/10 {
    replace mu = mu0[`i', 1]       if s == `i'
    replace y  = mu + rnormal(0,1) if s == `i'
}
histogram y, by(s)

* 3. Simulate measurement: create 1-10 scale survey responses by setting cut-offs
sum y, detail
mat tau3 = ( `r(p1)' \ `r(p5)' \ `r(p10)' \ `r(p25)' \ `r(p50)' \ ///
             `r(p75)' \ `r(p90)' \ `r(p95)' \ `r(p99)' )
g k = .
replace k = 1  if y < tau3[1,1]
replace k = 2  if inrange(y, tau3[1,1], tau3[2,1])
replace k = 3  if inrange(y, tau3[2,1], tau3[3,1])
replace k = 4  if inrange(y, tau3[3,1], tau3[4,1])
replace k = 5  if inrange(y, tau3[4,1], tau3[5,1])
replace k = 6  if inrange(y, tau3[5,1], tau3[6,1])
replace k = 7  if inrange(y, tau3[6,1], tau3[7,1])
replace k = 8  if inrange(y, tau3[7,1], tau3[8,1])
replace k = 9  if inrange(y, tau3[8,1], tau3[9,1])
replace k = 10 if y > tau3[9,1]
histogram k, by(s)
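As a quick sanity check on the simulated data (just to convince myself that the yearly means of y come out near sin(i) - sin(1) and that all 10 response categories are populated), I run something like:

Code:
tabstat y, by(s) statistics(mean sd n)   // yearly means of the latent attitude
tab k s, column nofreq                   // distribution of responses by year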
Stata's built-in oprobit command is fairly straightforward:

Code:
oprobit k i.s
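To gauge how close the estimates are, I simply list the estimated coefficients and cut points next to the tau3 matrix from the simulation (just an eyeball comparison, nothing formal):

Code:
* after -oprobit k i.s-
matrix list e(b)      // estimated year effects and cut points
matrix list tau3      // thresholds used to generate k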
The cut points it reports are reasonably close to the thresholds used in the simulation.

However, I ran into problems when I tried to write my own likelihood program.
Code:
capture program drop obit
program define obit
        version 1.0
        args lnf mu tau1 tau2 tau3 tau4 tau5 tau6 tau7 tau8 tau9   // mu = xb, tau1-tau9 = cut-off points
        tempvar p1 p2 p3 p4 p5 p6 p7 p8 p9 p10
        quietly {
                // category probabilities
                gen double `p1'  = 0
                gen double `p2'  = 0
                gen double `p3'  = 0
                gen double `p4'  = 0
                gen double `p5'  = 0
                gen double `p6'  = 0
                gen double `p7'  = 0
                gen double `p8'  = 0
                gen double `p9'  = 0
                gen double `p10' = 0
                replace `p1'  = normal(`tau1' - `mu')
                replace `p2'  = normal(`tau2' - `mu') - normal(`tau1' - `mu')
                replace `p3'  = normal(`tau3' - `mu') - normal(`tau2' - `mu')
                replace `p4'  = normal(`tau4' - `mu') - normal(`tau3' - `mu')
                replace `p5'  = normal(`tau5' - `mu') - normal(`tau4' - `mu')
                replace `p6'  = normal(`tau6' - `mu') - normal(`tau5' - `mu')
                replace `p7'  = normal(`tau7' - `mu') - normal(`tau6' - `mu')
                replace `p8'  = normal(`tau8' - `mu') - normal(`tau7' - `mu')
                replace `p9'  = normal(`tau9' - `mu') - normal(`tau8' - `mu')
                replace `p10' = 1 - normal(`tau9' - `mu')
                // log likelihood: the log probability of the observed category
                #delimit ;
                replace `lnf' =
                        ($ML_y1 == 1)  * ln(`p1')  +
                        ($ML_y1 == 2)  * ln(`p2')  +
                        ($ML_y1 == 3)  * ln(`p3')  +
                        ($ML_y1 == 4)  * ln(`p4')  +
                        ($ML_y1 == 5)  * ln(`p5')  +
                        ($ML_y1 == 6)  * ln(`p6')  +
                        ($ML_y1 == 7)  * ln(`p7')  +
                        ($ML_y1 == 8)  * ln(`p8')  +
                        ($ML_y1 == 9)  * ln(`p9')  +
                        ($ML_y1 == 10) * ln(`p10') ;
                #delimit cr
        }
end

ml model lf obit (k = i.s, noconstant) ///
        (tau1: ) (tau2: ) (tau3: )     ///
        (tau4: ) (tau5: ) (tau6: )     ///
        (tau7: ) (tau8: ) (tau9: )
ml trace on
ml check
ml search
ml maximize, difficult
mat c = r(table)
The program passes Test 1 and Test 2 of ml check. However, ml search then exits with the error message "could not find feasible values".

After some experimenting, I realized that the problem only appears when k has more than 7 categories. For example, when I set k to have 5 categories (i.e., 4 thresholds), the model runs (albeit quite slowly) and reproduces exactly the same results as oprobit.
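In those experiments I collapse the 10-point scale into fewer categories, e.g. something along these lines (the k5 variable and the ceil(k/2) grouping are just for illustration), together with a matching version of obit that stops at tau4:

Code:
* collapse the 10-point scale into 5 categories for testing
gen k5 = ceil(k/2)     // 1-2 -> 1, 3-4 -> 2, ..., 9-10 -> 5
tab k k5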
I have spent hours on this but still cannot figure out what went wrong. Can anyone help me find the problem? Or is there a better way to program an ordered probit model with many thresholds using ml?