  • #16
    Responding to post #13, here's something that may help. We'll extract the variable number from the end of the depvar string and convert it to a number, so that it sorts in numeric order rather than string order. Add these two commands after the reshape.
    Code:
    generate varnum = real(substr(depvar,5,.))
    sort varnum
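
    As a quick check of why the conversion matters, here is a minimal sketch on made-up names. It assumes, as in the commands above, a four-character prefix ("fund") so that the number starts at position 5; the three values are hypothetical.

    ```stata
    clear
    input str8 depvar
    "fund10"
    "fund2"
    "fund1"
    end

    * string sort puts fund10 before fund2
    sort depvar
    list, noobs

    * numeric sort on the extracted suffix gives fund1, fund2, fund10
    generate varnum = real(substr(depvar, 5, .))
    sort varnum
    list, noobs
    ```

    If the prefix has a different length, the starting position in substr() has to change accordingly.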



    • #17
      Thank you very much William!



      • #18
        Can anyone explain to me an easier way to sort the funds (depvar) in ascending order, as I described in #13?
        Code:
        order fund*, sequential



        • #19
          Inigo, I'm not sure you appreciate the advantage of what I'm proposing. If you run the example in #14 on 120 funds, you will see that it takes less than 1 second to finish and the data are exactly like you requested, no reshape, nothing else to do. Are you sure you want to continue with rolling?

          Code:
          * set up fake data with 120 fund variables
          clear all
          set seed 32154231
          set maxvar 10000
          set obs 168
          gen ym = ym(1999,12) + _n
          format %tm ym
          
          gen market = 100 + runiform() * _n
          gen small = runiform()
          gen high = runiform() + small
          gen momentum = runiform()
          
          forvalues i = 1/120 {
              local base = 100 * runiform()
              gen fund`i' = `base' + runiform() * _n
          }
          
          * define a linear regression using quadcross() - help mata cross(), example 2
          mata:
          mata set matastrict on
          
          real rowvector myreg(real matrix Xall)
          {
              real colvector y, b, Xy
              real matrix X, XX
          
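              y = Xall[.,1]                // dependent variable is the first column of Xall
              X = Xall[.,2::cols(Xall)]    // remaining columns are the regressors (constant is last)
              
              XX = quadcross(X, X)
              Xy = quadcross(X, y)
              b  = invsym(XX) * Xy
          
              return(rows(X), b')          // number of observations, then the coefficients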
          
          }
          
          end
          
          * add a constant
          gen double one = 1
          
          forvalues i = 1/120 {
              rangestat (myreg) fund`i' market small high momentum one, interval(ym -35 0) casewise
              * myreg1 is the number of observations; myreg2-myreg6 are the coefficients, constant (alpha) last
              rename myreg6 alpha`i'
              qui replace alpha`i' = . if myreg1 < 36
              drop myreg*
          }
          
          * spot check a few cases
          regress fund1 market small high momentum if inrange(ym,ym[50]-35,ym[50])
          list alpha1 in 50
          
          regress fund2 market small high momentum if inrange(ym,ym[55]-35,ym[55])
          list alpha2 in 55
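
          The myreg1 < 36 filter in the loop is needed because the window defined by interval(ym -35 0) holds fewer than 36 months for the first 35 observations. A minimal sketch, using rangestat's built-in count statistic on a hypothetical variable x with 48 monthly observations, makes the window sizes visible. It assumes rangestat is installed from SSC.

          ```stata
          clear
          set obs 48
          gen ym = ym(1999,12) + _n
          format %tm ym
          gen x = _n

          * count the observations that fall in the current month and the 35 before it
          rangestat (count) x, interval(ym -35 0)

          * the window only reaches its full 36 months at observation 36
          list ym x_count if inlist(_n, 1, 35, 36, 48)
          ```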



          • #20
            @Robert Picard, can rangestat be used to make the Fama and MacBeth regression faster? The user-written command xtfmb (SSC) estimates the Fama and MacBeth regression, but in a rolling window it is too slow. I have over 3 million observations, and applying xtfmb in a rolling window of 120 months takes more than 12 hours. Can you suggest a way to use rangestat with xtfmb?



            • #21
              Saeed Sardar: To apply rangestat (SSC), you would have to write your own Mata function to do what xtfmb does.
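
              For orientation, here is a minimal sketch of the mechanics only, not an implementation of xtfmb. rangestat hands the observations in each window to the Mata function as a real matrix and stores the returned row vector in variables named after the function. The function name mymean and the three-year window are made up for illustration.

              ```stata
              mata:
              real rowvector mymean(real matrix X)
              {
                  // X holds one row per observation in the current window
                  return(rows(X), mean(X[.,1]))
              }
              end

              webuse grunfeld, clear

              * mymean1 = number of observations in the window, mymean2 = mean of invest
              rangestat (mymean) invest, interval(year -2 0) by(company)

              * the built-in mean over the same window, for comparison
              rangestat (mean) invest, interval(year -2 0) by(company)
              list company year invest mymean1 mymean2 invest_mean in 1/5
              ```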



              • #22
                Nick Cox, I do not know how to do that. Since Robert already showed how to do it for a simple regression, I hope he can do it for xtfmb as well.



                • #23
                  Well this is well outside my comfort zone but a quick look at "xtfmb.ado" suggests that, at least for simple cases (no weights, no lags), you can perform the first step of the Fama and MacBeth (1973) procedure using rangestat and implement the second step from the results generated by rangestat.

                  The first step of Fama and MacBeth (1973) is to perform cross-sectional regressions, one per time period. In the example below, the interval bounds are defined such that only one observation per time period will trigger a regression. The other observations have interval bounds that will never match any observation and therefore never trigger a regression for that observation. Since the year variable in the grunfeld dataset is of type float, I use c(minfloat) for both lower and upper interval bounds for observations that should not trigger a regression.

                  The myreg() Mata function below performs a linear regression with intercept. When a Mata function is called by rangestat, the results are stored in variables named after the function name and a sequence number. In the example below, the first variable returned is called myreg1 and stores the number of observations for the regression. The next variable is myreg2 and contains the r-squared value of each regression. The myreg3 myreg4 myreg5 variables contain the regression coefficients (intercept is last).

                  There's nothing special about the Mata code below; just copy the complete example to a do-file and run it as is. The last two lines replicate the results using xtfmb.

                  Code:
                  * define a linear regression with intercept in Mata
                  mata:
                  mata clear
                  mata set matastrict on
                  real rowvector myreg(real matrix Xall)
                  {
                      real colvector y, b, Xy
                      real matrix X, XX, R
                      real scalar ymean, tss, mss, r2
                  
                      y = Xall[.,1]                // dependent var is first column of Xall
                      X = Xall[.,2::cols(Xall)]    // the remaining cols are the independent variables
                      X = X,J(rows(X),1,1)        // add a constant
                      
                      XX = quadcross(X, X)        // linear regression, see help mata cross(), example 2
                      Xy = quadcross(X, y)
                      b  = invsym(XX) * Xy
                      
                      ymean = mean(y)
                      tss = sum((y :- ymean) :^ 2)        // total sum of squares
                      mss = sum( (X * b :- ymean)  :^ 2)  // model sum of squares    
                      r2 = mss / tss
                  
                      return(rows(X), r2, b')
                  }
                  end
                  
                  webuse grunfeld, clear
                  
                  * ---- first step, cross-sectional regression, only do one per time period ---
                  bysort year (company): gen year1 = _n == 1
                  gen low = cond(year1, year, c(minfloat))
                  gen high = low
                  rangestat (myreg) invest mvalue kstock, interval(year low high) casewise
                  * -myreg- returns, in order, N r2 coefficients, the intercept is last
                  rename (myreg1 myreg2 myreg3 myreg4 myreg5) (nobs r2 b_mvalue b_kstock b_constant)
                  keep if year1
                  
                  * ---- second step, estimate final coefficients ---
                  gen one = 1
                  mvreg b_mvalue b_kstock b_constant = one, noconstant
                  
                  qui sum r2, meanonly
                  dis "avg. R-squared = " r(mean)
                  
                  
                  * replicate using xtfmb (from SSC)
                  webuse grunfeld, clear
                  xtfmb invest mvalue kstock
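
                  As a cross-check on the second step, the same numbers can be obtained without rangestat. This is a sketch that assumes the plain Fama and MacBeth (1973) setup (no weights, no lags), where each final coefficient is the mean of the yearly coefficients and its standard error is the standard deviation of those coefficients divided by the square root of the number of years. It uses official Stata's statsby for the yearly regressions, so it is much slower than rangestat and is meant only for checking results on small data.

                  ```stata
                  webuse grunfeld, clear

                  * first step: one cross-sectional regression per year
                  * the data in memory are replaced by year, _b_mvalue, _b_kstock and _b_cons
                  statsby _b, by(year) clear: regress invest mvalue kstock

                  * second step: mean and standard error of the yearly coefficients
                  foreach v of varlist _b_mvalue _b_kstock _b_cons {
                      quietly summarize `v'
                      display "`v'" _col(14) "coef = " %9.0g r(mean) "   se = " %9.0g r(sd)/sqrt(r(N))
                  }
                  ```

                  The displayed coefficients and standard errors should agree with the mvreg output from the second step above.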
                  Last edited by Robert Picard; 16 Jun 2016, 15:52.



                  • #24
                    Robert Picard, thanks for spending time on this and writing such wonderful code. Since your contribution will be known and acknowledged for a long time, and many researchers will use it, you could develop the code further and perhaps turn it into a program. That would increase its usage and help many researchers benefit from it. You might also consider adding a lags option.

