  • Behavior of tempvar within ML program

    Suppose I know the mean of a normal variable and I want to estimate its standard deviation using MLE in Stata. My data looks similar to the following simulated data (it is messy panel data with missing values, etc.):

    Code:
    clear all
    
    *** Generate panel data
    set seed 1339
    set obs 100
    g id =_n
    expand 20
    bys id: g t = _n 
    xtset id t 
    
    g eps = rnormal(0,1)
    g x = rnormal(0,1)
    replace x = . if x<0 
    g y = 5 + 3*L1.x + eps
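
For reference, a quick way to see the missingness pattern this creates (just a check; the counts depend on the seed):

Code:
count if missing(x)                 // x set to missing where the draw was negative
count if missing(y)                 // y missing at t==1 and wherever L1.x is missing
count if missing(x) & !missing(y)   // obs where y is observed but x is not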
    I then tried to estimate the standard deviation using:

    Code:
capture program drop myMLprog
program myMLprog
  version 14
  args lnf sig                    // lnf: log likelihood; sig: ln of the sd
  local y "$ML_y1"

  tempvar mu
  qui gen `mu' = 5 + 3*L1.x       // the known conditional mean
  qui replace `lnf' = ln(normalden(`y',`mu',sqrt(exp(2*`sig'))))
end

ml model lf myMLprog (sig: y=)
ml maximize
display exp([sig]_cons)           // back out the standard deviation
which fails with the error that the log likelihood could not be evaluated. I then moved the calculation of the mean outside of the program, and it worked (estimated sd of .98790158):


    Code:
g mu = 5 + 3*L1.x   // mean computed outside the program, on the full dataset
    
    capture program drop myMLprog2
    program myMLprog2
      version 14
      args lnf sig
      local y "$ML_y1"
    
      //tempvar mu
      //qui gen `mu' = 5 +3*L1.x
      qui replace `lnf' = ln(normalden(`y',mu,sqrt(exp(2*`sig')))) 
    end 
    
    ml model lf myMLprog2 (sig: y=) 
    ml maximize
    display exp([sig]_cons)
This is a simplified version of the actual model I am trying to estimate; to tackle the more complicated problem, I think I need to understand why the solution with the tempvar does not work here. I suspect it has to do with some of the missing values in the data, or am I missing something obvious?



  • #2
I am not exactly sure of the internals, but it seems that ml marks out the estimation sample using only the variables named in ml model (here just y), and that when an lf evaluator is called, only that sample is left in memory. A lag taken inside the program is then computed on the reduced panel, where the t-1 row may have been dropped, so L1.x (and with it `mu' and `lnf') can be missing even for in-sample observations, and ml exits with the error. If anybody encounters a similar problem, moving the lag operator out of the program provides a fix:


    Code:
g Lx = L1.x   // lag computed on the full panel, before ml restricts the sample
    
    capture program drop myMLprog
    program myMLprog
      version 14
      args lnf sig
      local y "$ML_y1"
    
      tempvar mu
      qui gen `mu' = 5 +3*Lx
      qui replace `lnf' = ln(normalden(`y',`mu',sqrt(exp(2*`sig')))) 
    end 
    
    ml model lf myMLprog (sig: y=) 
    ml maximize
    display exp([sig]_cons)
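
To convince yourself of this, here is a diagnostic sketch (myMLdiag and the display lines are purely illustrative, not part of the fix): if the explanation above is right, _N inside the evaluator should be smaller than the 2,000 observations in the full dataset, and `mu' should be missing for some of the remaining observations.

Code:
capture program drop myMLdiag
program myMLdiag
  version 14
  args lnf sig
  local y "$ML_y1"

  tempvar mu
  qui gen `mu' = 5 + 3*L1.x
  di as text "obs in memory: " _N       // fewer than the 2,000 in the full dataset
  qui count if missing(`mu')
  di as text "missing mu:    " r(N)     // lags broken by the dropped rows
  qui replace `lnf' = ln(normalden(`y',`mu',sqrt(exp(2*`sig'))))
end

ml model lf myMLdiag (sig: y=)
ml maximize    // still fails, but the displayed counts show why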

The problem is not entirely due to the lag operator, though: the following (yes, with the wrong conditional mean) still gives an error. I guess this ultimately comes down to which x and y observations are missing in the data.

    Code:
    capture program drop myMLprog
    program myMLprog
      version 14
      args lnf sig
      local y "$ML_y1"
    
      tempvar mu
      qui gen `mu' = 5 +3*x
      qui replace `lnf' = ln(normalden(`y',`mu',sqrt(exp(2*`sig')))) 
    end 
    
    ml model lf myMLprog (sig: y=) 
    ml maximize
    display exp([sig]_cons)
If you restrict the sample to x!=. & y!=., it works, for example:
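
Code:
ml model lf myMLprog (sig: y=) if x!=. & y!=.
ml maximize
display exp([sig]_cons)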
