Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating Robust Estimators using Mata

    Howdy,

    I'm looking for suggestions on how to compute the Andrews (1991) robust estimator using Mata that I get in Stata by appending vce(hc3) to a regression. Any guidance is greatly appreciated.

    As an example, I have computed my White estimator [vce(robust)] using this snippet:

    x_vars = st_data(.,("education","experience","exp2"))
    cons = J(rows(x_vars),1,1)
    X = (x_vars,cons)
    y= st_data(.,("wage"))
    B_hat=(invsym(X'*X))*(X'*y)
    e_hat=y-X*B_hat
    sandwich_mid = J(cols(X), cols(X), 0)
    n=rows(X)
    for (i=1; i<=n; i++) {
    sandwich_mid =sandwich_mid+(e_hat[i,1]*X[i,.])'*(e_hat[i,1]*X[i,.])
    }
    V_robust=(n/(n-cols(X)))*invsym(X'*X)*sandwich_mid*invsym(X'*X)
    se_robust=sqrt(diagonal(V_robust))
    B_hat
    se_robust

    Cheers,
    Nik

  • #2
    My suggestion is to use the cross() and quadcross() functions. There are designed to compute cross-products (as transposing is a costly operation) and you avoid the loop when computing the sandwich matrix. See my modifications of your code below
    Christophe


    Code:
    sysuse nlsw88
    
    global x  age ttl_exp // ("education","experience","exp2")
    
    mata:
    x_vars = st_data(.,"$x") // <-new
    cons = J(rows(x_vars),1,1)
    X = (x_vars,cons)
    y= st_data(.,("wage"))
    B_hat=(invsym(X'*X))*(X'*y)
    B_hat
    B_hat = qrsolve(quadcross(x_vars,1,x_vars,1),quadcross(x_vars,1,y,0)) // <-new
    B_hat
    e_hat=y-X*B_hat
    
    sandwich_mid = J(cols(X), cols(X), 0)
    n=rows(X)
    for (i=1; i<=n; i++) {
    sandwich_mid =sandwich_mid+(e_hat[i,1]*X[i,.])'*(e_hat[i,1]*X[i,.])
    }
    // V_robust=(n/(n-cols(X)))*invsym(X'*X)*sandwich_mid*invsym(X'*X)
    XpX = quadcross(x_vars,1,x_vars,1) // <- new
    V_robust = (n/(n-cols(X)))*invsym(XpX)*sandwich_mid*invsym(XpX) // <-new
    se_robust=sqrt(diagonal(V_robust))
    B_hat
    se_robust
    
    // alternative computation of the covariance matrix
    V_robust
    k = cols(x_vars)+1
    XpX = quadcross(x_vars,1,x_vars,1)
    V_robustCross = n/(n-k)*invsym(XpX)*quadcross(x_vars,1,e_hat:^2,x_vars,1)*invsym(XpX)
    V_robustCross
    
    // there are differences but they are minimal
    V_robust-V_robustCross
    quadcross(x_vars,1,e_hat:^2,x_vars,1)-sandwich_mid
    end

    Comment


    • #3
      Ah, great. I'm new to Stata - so many functions to learn. Now I just need to figure out how to compute the prediction errors and standardized residuals. Thanks for the input!

      Comment


      • #4
        For whatever it's worth, cross() seems to be considerably slower than the naive approach or using a colon operator when taking the sum of squares of a vector, which does not preclude it being useful, but which did surprise me:
        Code:
        mata:
        reps = 10000
        d = runiform(2000,1)
        timer_clear(1)
        timer_on(1)
        for ( i = 1; i <= reps; i++) {
               x = d' * d
        }
        timer_off(1)
        naive = timer_value(1)[1,1]
        //
        timer_clear(1)
        timer_on(1)
        for ( i = 1; i <= reps; i++) {
               x = cross(d,d)
        }
        timer_off(1)
        cross = timer_value(1)[1,1]
        //
        timer_off(1)
        //
        timer_clear(1)
        timer_on(1)
        for ( i = 1; i <= reps; i++) {
               x = (d :* d)
        }
        timer_off(1)
        timer_value(1)
        colon = timer_value(1)[1,1]
        //
        "naive, cross, colon: " + strofreal(naive) + ", " + strofreal(cross) + ", " + strofreal(colon)
        
        end
        //
        Regards, Mike

        Comment


        • #5
          Well yes, that's surprising, but good to know. I shall remember that :-). But shouldn't it be

          Code:
          x = sum(d:*d)
          
          // or 
          
          x = sum(d:^2)
          ?

          Comment


          • #6
            Fortunately cross is faster when using matrices. I guess it is also faster when making computations like X'WX where W is an NxN diagonal matrix. See help mata cross for further details.

            Code:
            clear mata
            mata:
            
            time = J(1,2,.)
            X = runiform(10000,50)
            reps = 1000
            
            timer_on(1)
            for (i=1;i<=reps;i++)
            {
                    cross = X'*X
                    
            }
            timer_off(1)
            time[1,1]=timer_value(1)[1,1]
            timer_clear(1)
            
            timer_on(1)
            for (i=1;i<=reps;i++)
            {
                    cross = cross(X,X)
            }
            
            timer_off(1)
            time[1,2]=timer_value(1)[1,1]
            timer_clear(1)
            
            st_matrix("time",time)
            end
            
            mat coln time = XpX Cross
            mat rown time = time
            matlist time
            Code:
                         |       XpX      Cross 
            -------------+----------------------
                    time |    33.831     12.323

            Comment

            Working...
            X