Dear Statalisters
I'm trying to implement the model of Hausman et al. (1998) for parametric estimation of a binary-choice model with misclassification in the dependent variable. The model estimates two parameters (alpha0 and alpha1) that give the misclassification probabilities. To make sure the parameters are identified, Hausman et al. assume that alpha0 + alpha1 < 1.
Alpha0: probability that a zero is misclassified as a one;
Alpha1: probability that a one is misclassified as a zero.
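To be explicit, the observed-outcome probability that the likelihood below is built on is (as I understand the paper)
Pr(y_observed = 1 | x) = alpha0 + (1 - alpha0 - alpha1)*F(x'b),
where F is the logistic (or normal) CDF.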
My ml implementation works on simulated data if fed initial values for the alphas (see below). However, on data that I collected using a machine-learning algorithm, alpha0 is estimated to be negative (though not significantly different from zero). With the module MRPROBIT (http://ideas.repec.org/c/boc/bocode/s457657.html) I find misclassification probabilities of alpha0 = alpha1 = 8%.
Now I want to make sure the alphas stay between 0 and 0.5. I do this by transforming the alphas inside the likelihood function with "invlogit(alpha)/2" (thanks to the author of MRPROBIT for this idea). I then try to transform the alphas back using "ml display, diparm". However, I only get standard errors and no coefficients. I'm not sure where my mistake lies (a miscalculation in the transform?). Is this step even necessary?
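For reference, my calculus for the back-transformation: if a is the raw ml parameter, then alpha = invlogit(a)/2, so the inverse is a = logit(2*alpha) and the delta-method derivative is d(alpha)/d(a) = invlogit(a)*(1 - invlogit(a))/2. Corrections welcome if I've slipped up here.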
Thank you in advance for any help or hints you can offer.
I'm pretty new to Stata ml programming, so please don't hesitate to point out stupid mistakes.
Here's my code for Stata 12:
Code:
* Simulate logit with misclassification of the dependent variable
clear
// set random-number seed
set seed 10
set obs 10000

* some explanatory variables
gen x1 = rnormal()
gen x2 = rnormal()

* linear combination
gen z = 1 + 5*x1 + 8*x2

* Logit or probit
* logit
gen pr = exp(z)/(1+exp(z))
* or probit (used for testing the module mrprobit)
*gen pr = normal(z)

* Bernoulli response
gen y_ideal = rbinomial(1, pr)

* Misclassification
* =================
* Code pieces from
* http://www.econometricsbysimulation.com/2012/06/linear-probability-model-lpm-under.html
* some % of our observations have misclassified y values
gen misclassified = rbinomial(1, .15)

* The observed y equals the ideal y when no misclassification is present
gen y_observed = y_ideal if misclassified==0
* Otherwise replace it with the misclassified value
replace y_observed = mod(y_ideal+1, 2) if misclassified==1

set more off

* Coefficients under misclassification
logit y_observed x*

* Hausman et al. (1998) parametric estimation under misclassification
capture program drop hausmanlogit
program define hausmanlogit
    args lnf xb alpha0 alpha1
    local y "$ML_y1"
    tempvar p
    * Equivalent to: quietly gen double `p' = 1/(1+exp(-`xb'))
    quietly gen double `p' = invlogit(`xb')
    * Returns reversed signs if not fed good initial values, but works:
    *quietly replace `lnf' = ln(`alpha0' + (1-`alpha0'-`alpha1')*`p') if `y'==1
    *quietly replace `lnf' = ln(1-`alpha0' - (1-`alpha0'-`alpha1')*`p') if `y'==0
    * Make sure the alphas stay between 0 and 0.5
    quietly replace `lnf' = ln(invlogit(`alpha0')/2 + (1-invlogit(`alpha0')/2-invlogit(`alpha1')/2)*`p') if `y'==1
    quietly replace `lnf' = ln(1-invlogit(`alpha0')/2 - (1-invlogit(`alpha0')/2-invlogit(`alpha1')/2)*`p') if `y'==0
end

* If executed without ml init, signs on the coefficients are reversed
ml model lf hausmanlogit (y_observed = x1 x2) /alpha0 /alpha1
ml check
ml init /alpha0=0 /alpha1=0
ml maximize, difficult

* Transform the alphas back to the probability scale.
* diparm() needs the forward transform alpha = invlogit(a)/2 and its
* delta-method derivative invlogit(a)*(1-invlogit(a))/2, not the inverse.
ml display, diparm(alpha0, function(invlogit(@)/2) derivative(invlogit(@)*(1-invlogit(@))/2) prob) ///
            diparm(alpha1, function(invlogit(@)/2) derivative(invlogit(@)*(1-invlogit(@))/2) prob)
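If it helps anyone checking my work: the same back-transformation should also be obtainable with nlcom after ml maximize (untested sketch; equation names as set up by the ml model line above):
Code:
* delta-method transform of the raw parameters back to probabilities
nlcom (alpha0: invlogit(_b[alpha0:_cons])/2) ///
      (alpha1: invlogit(_b[alpha1:_cons])/2)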