Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • bugfixing why initial values are not feasible for optimize()

    Hi,
    I am using Stata 14.1 MP on mac and unix, and the ado file I painstakingly tested on mock data fails upon its first encounter with enemy fire (IRL data). As you might be aware, Mata can be cryptic with its error messages, and currently I am looking for strategies how to understand what's wrong with the code (or the data). Below is the Mata code, and I printed the parameter values calculated before optimize would (presumably) evaluate the objective function for the first time but (presumably) fails. Those values are:

    muhat0: 0.5536
    C: 887.95
    p: 2975
    varmu: 0.034
    gammahat0: 6.27

    These are similar to those in my simulation, muhat0: .501235, C: 2.866523, p: 316, varmu: .0051211, gammahat0: 47.8172. So I am not sure what is "infeasible" about these starting values.

    FYI, the code is part of an adaptation of Semiparametric SURE shrinkage of binomial means, cf. "Optimal Shrinkage Estimation of Mean Parameters in Family of Distributions with Quadratic Variance" by Xie, Kou and Brown.


    Thanks for any thoughts,

    Laszlo

    Code:
    mata:
    
    void function minsureNEFQVF2(    string scalar rname,
                                    string scalar Yname,
                                    string scalar Wname,
                                    string scalar gammahatname,
                                    string scalar muhatname,
                                    string scalar tousename)
    {
    real vector r
    st_view(r,.,rname,tousename)
    real rowvector nu
    nu = (0, 1, - 1)
    real vector Y
    st_view(Y,.,Yname,tousename)
    real vector W
    if (Wname == "") W = J(length(r),1,1)
    else  st_view(W,.,Wname,tousename)
    
    real scalar nu0, nu1, nu2
    nu0 = nu[1]
    nu1 = nu[2]
    nu2 = nu[3]
    
    real vector mu2hat
    mu2hat = (r :* Y :^ 2 :- nu1 :* Y :- nu0) :/ (r :+ nu2)
    
    real scalar muhat0, muhat, C, p, varmu, gammahat0, gammahat
    muhat0 = mean(Y)
    C = quadsum(1 :/ r)
    p = length(r)
    varmu = ((quadsum(Y :^ 2) :- (nu0 :+ nu1 :* muhat0) :* C :- muhat0:^2 :* (p :+ nu2 :* C))) :/ (p :+ nu2 :* C)
    gammahat0 = max( (epsilon(1), (nu0 :+ nu1 :* muhat0 :+ nu2 :* muhat0 :^2) :/ varmu :+ nu2) )
    
    printf("muhat0 is %9.0g, C is %9.0g, p is %9.0g, varmu is %9.0g, gammahat0 is %9.0g.",muhat0,C,p,varmu,gammahat0)
    
    transmorphic matrix S
    S = optimize_init()
    optimize_init_which(S ,"min")
    optimize_init_evaluator(S, &surerisk()  )
    optimize_init_evaluatortype(S, "gf0")
    optimize_init_params(S, gammahat0)
    optimize_init_argument(S,1,r)
    optimize_init_argument(S,2,Y)
    optimize_init_argument(S,3,W)
    optimize_init_argument(S,4,mu2hat)
    optimize_init_singularHmethod(S,"hybrid")
    optimize_init_technique(S, "nr dfp bfgs bhhh")
    optimize(S)
    gammahat = optimize_result_params(S)
    
    muhat = quadsum((gammahat :/ (r :+ gammahat)) :^ 2 :* Y :* W) / quadsum((gammahat :/ (r :+ gammahat)) :^ 2 :* W) 
    
    st_numscalar(muhatname,muhat)
    st_numscalar(gammahatname,gammahat)
    }
    
    void function surerisk(    todo,
                            real scalar r0,
                            real vector r,
                            real vector Y,
                            real vector W,
                            real vector mu2hat,
                            real vector res,
                            g,
                            H)
    {
            real scalar mu0
            if (r0 < 0) res = J(rows(r),1,-r0)
            else
            {
               mu0 = quadsum((r0 :/ (r :+ r0)) :^ 2 :* Y :* W) / quadsum((r0 :/ (r :+ r0)) :^ 2 :* W)
               res = (((Y :* r :+ r0 :* mu0) :/ (r :+ r0)) :^ 2 - 2 :* r0 :* mu0 :/ (r :+ r0) :* Y - (r :- r0) :/ (r :+ r0) :* mu2hat) :* W
            }
    }
    
    end

  • #2
    If it infeasible then the likelihood returns missing whatever values you give. So I would first check the surerisk function for itself first and why res gives missing. It could be a problem with the inputs.

    Comment


    • #3
      Thanks, Christophe. So for any r0, surerisk returns .? That's what I thought, but I see no reason for it in the formula. Unless the data vectors r, Y and W can have missing values, but I thought using touse in st_view takes care of that.
      Last edited by László Sándor; 15 Mar 2016, 05:48.

      Comment


      • #4
        touse will help to pick up observations where touse is different from 0 (including missing). So make sure it is either 0 or 1. But even in this case it could be that some observations have missing values for r, Y and W (you can check that with missing()). It depends on how you have constructed your touse variable. Finally it can be that res results with missing if you divide by 0 for example.
        Last edited by Christophe Kolodziejczyk; 15 Mar 2016, 08:16.

        Comment


        • #5
          Thanks, I was not dividing by zero, but I did have missing values in some of the matrices — I forgot (or never knew) that -marksample- does not mark out observations for which variables mentioned/specified in *options* are missing, and that's where I was passing on Y and r to my ado file!

          Comment

          Working...
          X