  • #31
    But it is surprising that the code in #18, the 5-year rolling window, also produced fitted values and residuals starting from the third year in the panel! Something needs to be clarified or fixed here.


    • #32
      The -rangestat- command for the rolling regression specifies interval(time -4 0). This means it runs each regression on whatever observations fall in the interval from year-4 to year (inclusive). Even in your first and second years there are, presumably, some such observations, so no years are excluded from the calculations altogether. I think the non-missing output only starts a bit later because the number of observations in those first couple of years is too small to produce results given the number of predictors in your regression command. You can't regress on three predictors, for example, with only 2 observations. So I think the non-missing output starts in the first year where the number of observations is large enough to get a non-singular X'X matrix with the number of predictors you are using.
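
      A minimal sketch of the singularity point, using made-up numbers rather than anything from the thread's data: with 2 observations and 3 predictors plus a constant, X'X has 4 columns but its rank cannot exceed the number of observations.

      Code:
      mata:
      // 2 observations, 3 predictors plus a constant (made-up numbers)
      X  = (1, 2, 3, 1 \ 4, 5, 7, 1)
      XX = quadcross(X, X)
      // number of coefficients to estimate
      cols(X)
      // rank of X'X; it cannot exceed rows(X), so X'X is singular here
      rank(XX)
      end

      Whenever rank(XX) is smaller than cols(X), the coefficients are not uniquely determined for that window.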


      • #33
        In response to post #32:
        If this is the case, then the rolling regression code in post #18 is not using a full 5-year window to produce fitted values and residuals for observations in the last year of the regression window!
        In other words, there should not be any fitted values or residuals for observations in the first 4 years.

        Do I simply fix this by running:
        replace fitted=. if yr==1991|yr==1992|yr==1993|yr==1994
        replace residual=. if yr==1991|yr==1992|yr==1993|yr==1994

        Is this the solution for both the rolling regression in #18 and the recursive regression in #19?


        • #34
          Again, read the description of the interval() option in the help file. If the value of year for the current observation is 1996, then
          Code:
          interval(year -4 0)
          will pick up observations (from the same company) that have values for the key variable year that are inrange(year, 1992, 1996). I count 5 possible values that fall within the interval: 1992, 1993, 1994, 1995, 1996. If nobs < 5, then you have missing values, either because your data does not extend that far back, your data is not balanced, or you have missing values in your dependent or independent variables.
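
          One way to see how many usable observations each window actually contains is sketched below. It assumes the variable names given later in #35 (annual_ret, A, B, C, D, E, yr, permno); complete and nobs are illustrative names, not part of the earlier code.

          Code:
          * flag observations with no missing values in the regression variables
          gen byte complete = !missing(annual_ret, A, B, C, D, E)
          * count complete observations in each company's window from yr-4 to yr
          rangestat (sum) nobs = complete, interval(yr -4 0) by(permno)
          * distribution of window sizes by year
          tabulate yr nobs

          Windows with nobs below 5 are those that are cut short by the start of the data, by gaps in the panel, or by missing values.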

          You do not show an example of your data, so I can't help beyond that.

          With respect to #19, I would not describe it as running a cross-sectional regression. The regressions are calculated, for the current observation, using observations from the same company. It makes absolutely no sense to do a regression on just one observation (even though you can calculate coefficients). It's above my pay grade to determine how many observations are needed before this exercise makes sense.


          • #35
            An example of data is attached. Hope this will help!

            My dependent variable is "annual_ret"; my independent variables are "A", "B", "C", "D", and "E".
            There are missing values in the independent variables.
            The panel is unbalanced.


            Can the "same company" restriction be relaxed? That is, simply run the regression over all observations in the window, then multiply the estimated coefficients by the actual values of each observation in the last year of the window.

            Is -rangestat- able to handle this? Or perhaps you can suggest another way to solve this issue?


            • #36
              I guess I never picked up that you wanted cross-sectional regressions. You can do it with rangestat, but it's probably simpler just to loop over years. Here's my guess as to what you want, using both methods:

              Code:
              * define a linear regression in Mata using quadcross() - help mata cross(), example 2
              mata:
              mata clear
              mata set matastrict on
              real rowvector myreg(real matrix Xall)
              {
                  real colvector y, b, Xy
                  real matrix X, XX
              
                  y    = Xall[.,1]
                  X     = Xall[.,2::cols(Xall)]
                  
                  XX = quadcross(X, X)
                  Xy = quadcross(X, y)
                  b  = invsym(XX) * Xy
              
                  return(b',rows(X))
              }
              end
              
              * -------------------- using rangestat ---------------------------
              use "forstatalist.dta", clear
              isid yr permno, sort
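              * tag one observation per year; only it gets a real window [yr-4, yr],
              * every other row gets the empty window [0, 0], so each regression runs once per year
              by yr: gen yr1 = _n == 1
              gen low = cond(yr1 & yr >= 1995, yr-4, 0)
              gen high = cond(yr1 & yr >= 1995, yr, 0)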
              list yr low high if yr1
              
              gen double constant = 1
              rangestat (myreg) annual_ret A B C D E constant, interval(yr low high) casewise
              
              gen double fitted = 0
              local i 0
              foreach v in A B C D E constant {
                  by yr: replace fitted = fitted + `v' * myreg`++i'[1]
              }
              gen double resid = annual_ret - fitted
              
              
              * -------------------- looping over years ------------------------
              gen double resid_loop = .
              forvalues y = 1995/2002 {
                  local low = `y' - 4
                  regress annual_ret A B C D E if inrange(yr, `low', `y')
                  predict e if yr == `y', resid
                  replace resid_loop = e if yr == `y'
                  drop e
              }
              
              
              gen xdif = abs(resid - resid_loop)
              sum xdif


              • #37
                And here's the version that includes all years up to the current year:

                Code:
                * define a linear regression in Mata using quadcross() - help mata cross(), example 2
                mata:
                mata clear
                mata set matastrict on
                real rowvector myreg(real matrix Xall)
                {
                    real colvector y, b, Xy
                    real matrix X, XX
                
                    y    = Xall[.,1]
                    X     = Xall[.,2::cols(Xall)]
                    
                    XX = quadcross(X, X)
                    Xy = quadcross(X, y)
                    b  = invsym(XX) * Xy
                
                    return(b',rows(X))
                }
                end
                
                * -------------------- using rangestat ---------------------------
                use "forstatalist.dta", clear
                isid yr permno, sort
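                * tag one observation per year; only it gets a real window [1991, yr],
                * every other row gets the empty window [0, 0], so each regression runs once per year
                by yr: gen yr1 = _n == 1
                gen low = cond(yr1, 1991, 0)
                gen high = cond(yr1, yr, 0)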
                list yr low high if yr1
                
                gen double constant = 1
                rangestat (myreg) annual_ret A B C D E constant, interval(yr low high) casewise
                
                gen double fitted = 0
                local i 0
                foreach v in A B C D E constant {
                    by yr: replace fitted = fitted + `v' * myreg`++i'[1]
                }
                gen double resid = annual_ret - fitted
                
                
                * -------------------- looping over years ------------------------
                gen double resid_loop = .
                forvalues y = 1993/2002 {
                    regress annual_ret A B C D E if inrange(yr, 1991, `y')
                    predict e if yr == `y', resid
                    replace resid_loop = e if yr == `y'
                    drop e
                }
                
                
                gen xdif = abs(resid - resid_loop)
                sum xdif


                • #38
                  To sort things out:
                  Posts #18 and #19 run a "time series regression" for each observation, estimate the fitted value and residual for that observation, and then move forward in time.
                  Posts #36 and #37 run a "cross section regression", estimate the fitted values and residuals for the observations in the last year of the regression, and then move forward in time.

                  Are these notes correct?


                  • #39
                    Mike, you are responsible for studying the examples and determining if the code does what you intend. I can't help with the nomenclature, I just do data management.


                    • #40
                      Thanks, Robert and all participants.
                      Regardless of terminology, my limited understanding of the code in #18 and #19 is that it regresses the dependent variable "annual_ret" of a given company in a given year on its independent variables and estimates the fitted value and residual for this company in this year. This is done for each company. Then the process is repeated by either dropping the first year and adding an additional year (in post #18) or adding an additional year without dropping any earlier year (in post #19).

                      Is this right?


                      • #41
                        Mike: The strategy here is backwards. Once you have code that appears to do a lot of analyses repeatedly, the test of your understanding is that you can replicate sample results for a few cases by (in this instance) independent regressions on the data concerned.
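
                        A minimal sketch of such a spot check for the by-company rolling window, under stated assumptions: the permno 10001 and the year 1998 are hypothetical placeholders, residual is the variable name used in #33, and e_check is an illustrative name.

                        Code:
                        * hypothetical case: replace 10001 and 1998 with a permno and year in your data
                        local id 10001
                        local y 1998
                        local low = `y' - 4
                        * one independent regression on that company's window
                        regress annual_ret A B C D E if permno == `id' & inrange(yr, `low', `y')
                        * residual for the last year of the window, from this regression only
                        predict double e_check if e(sample) & yr == `y', resid
                        * compare with the residual stored by the rolling code
                        list permno yr residual e_check if permno == `id' & yr == `y'
                        drop e_check

                        If the two columns agree for several hand-picked cases, that is evidence the rolling code does what you think it does.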


                        • #42
                          Thanks.
                          I think Robert's post #25 is actually checking his posts #18 and #19. It seems to be doing what I explained in #40.

                          However, I believe that confirmation of my statement in #40 would be very helpful, so that I feel comfortable using the code in the paper and acknowledging his main contribution (and that of others).

                          Perhaps Robert or another participant can confirm this. It would be very much appreciated!


                          • #43
                            Can anyone help with my last post, please?
                            Thank you.


                            • #44
                              Mike: Please note suggested practice on bumping threads at http://www.statalist.org/forums/help#adviceextras
