
  • Standard deviation of regression residual using historical data

    Dear all,

    I'm trying to calculate the standard deviation of the regression residual over a rolling window of historical data. I require at least 3 firm-years (year t, year t-1, and year t-2) and use at most 5 (year t back to year t-4). I use the following code in Stata 16.1:

    Code:
    gen sdresidualstore=.
    gen count=.
    sort firm fyear
    by firm: gen nr=_n
    egen firmid=group(firm)
    sum firmid
    local m=r(max)
    forvalues i=1(1)`m'{
        qui sum nr if firmid==`i'
        local n=r(min)+2
        local p=r(max)
        if maxnr>2 {
            forvalues j=`n'(1)`p'{
                qui gen hulpvar=1 if firmid==`i' & nr<=`j' & nr>`j'-5
                qui sum residual if hulpvar==1
                qui replace sdresidualstore=r(sd) if firmid==`i' & nr==`j'
                qui replace count=r(N) if firmid==`i' & nr==`j'
                qui drop hulpvar
                di `i' " / " `j' " / " `m'
            }
        }
    }

    To elucidate, the variable "sdresidualstore" should store the 3-5 firm-years standard deviation of the residual that I want to calculate; and the variable "count" should store the number of observations used to compute the standard deviation (i.e., minimum of 3, and maximum of 5).

    Stata does not recognize the line "if maxnr>2 {" (marked in red). I've also tried:

    Code:
    max(nr)>2 {
    and,

    Code:
    nr>2 {
    Neither version seems to work. Does anyone know what I did wrong here?

    Any help is much appreciated!

    Kind regards,

    Dennis








  • #2
    You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Being able to replicate or run your code is often essential to helping you.

    I strongly suspect you don't need the loop. I would look into egen functions – you can create the lagged variables and then do the standard deviations using egen with rowsd. Alternatively, rangestat (user-written) may do this for you as well.
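
    As a rough illustration of the rangestat route, here is a minimal sketch. It assumes rangestat has been installed from SSC and that the variables firm, fyear, and residual exist as in the original post. The names obsno, sd_resid, and n_resid are hypothetical, chosen only to avoid clashing with variables already created.

```stata
* One-time install of the user-written command (assumption: SSC is reachable)
* ssc install rangestat

* Within-firm observation counter, analogous to nr in the original post
sort firm fyear
by firm: gen obsno = _n

* Rolling SD and count over the current and up to 4 previous observations
rangestat (sd) sd_resid = residual (count) n_resid = residual, ///
    interval(obsno -4 0) by(firm)

* Keep the SD only when at least 3 non-missing residuals were available
replace sd_resid = . if n_resid < 3
```

    Using interval(fyear -4 0) instead would define the window in calendar years rather than in observations; the two differ when a firm has gaps in fyear, so which one is appropriate depends on the research design.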

    The syntax problem is that maxnr must be a macro (it might work with a scalar). A logical condition that drives the flow of your program can't be a variable, and you don't seem to have a macro named maxnr. Anyway, if maxnr is a local, then you need

    if `maxnr' > 2 {

    To set a local to the maximum value of x, you need something like:
    gsort -x
    local a=x[1]

    or
    su x
    local q=r(max)
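
    Applied to the loop in the original post, the idea might look like the sketch below. It assumes the setup lines from that post (through local m=r(max)) have already been run, so that firmid, nr, residual, sdresidualstore, and count exist. The local name maxnr is taken from the failing line; replacing the temporary helper variable with inrange() is my own simplification, not something from the original code.

```stata
forvalues i = 1/`m' {
    * Observation range for this firm; r(max) is stored in a local macro
    quietly summarize nr if firmid == `i'
    local n     = r(min) + 2
    local maxnr = r(max)

    * The condition now compares a macro's value, not a variable
    if `maxnr' > 2 {
        forvalues j = `n'/`maxnr' {
            * Window: observation j and up to 4 observations before it
            quietly summarize residual if firmid == `i' & inrange(nr, `j' - 4, `j')
            quietly replace sdresidualstore = r(sd) if firmid == `i' & nr == `j'
            quietly replace count = r(N) if firmid == `i' & nr == `j'
        }
    }
}
```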
