  • More efficient way to run vecrank on firm-level rolling windows

    Hi,

    I'm working with panel data and want to run vecrank at the firm level over rolling windows, storing some postestimation results as new variables. My data has the following format:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(firmid timeid logMV logVF)
    1 218 4.1860685   4.30219
    1 219   4.19654 4.5676622
    1 220  4.522541 4.5608683
    1 221 4.4920044   4.56502
    1 222 4.4122176   4.57979
    1 223 4.3952684 4.5846977
    1 224 4.3426247  4.598754
    1 225  4.258428 4.6033964
    1 226  4.378387 4.6064277
    1 227  4.351954  4.611788
    end
    I'm currently using the following code:
    Code:
    bysort firmid (timeid): gen N=_n
    
    gen rank5=.
    gen obs_johns=.
    
    levelsof firmid, local (firmids)
    foreach i in `firmids' {
        sum N if firmid==`i'
        local n=r(min)+119
        local p=r(max)
        forvalues j=`n'(6)`p'{
            vecrank logMV logVF if firmid==`i' & N<=`j' & N>`j'-120, lags(1)
            replace rank5= e(k_ce95) if firmid==`i' & N==`j'
            replace obs_johns=e(N) if firmid==`i' & N==`j'
        }
    }
    The code works perfectly, but it takes a very long time to run. I'm aware that foreach and forvalues loops tend to be slow. I came across rangestat, which is much faster, but it does not support the vecrank command, so I would need to write the command myself in Mata, and I have no experience with Mata.

    Do you have any advice or suggestions?

    Thanks,
    Noor




    Last edited by Noor Alshamma; 23 Feb 2022, 04:01.

  • #2
    Perhaps the community-contributed rangerun command, written by Robert Picard and Nick Cox, two of the authors of rangestat, and also available from SSC, will accomplish what you want.


    • #3
      Thanks for the mention in #2 but my guess is that almost all problems here arise from using vecrank and using it many times. But just possibly rangerun or rolling would reduce the bookkeeping a little.

      Since vecrank results are being displayed one by one, there should be some handle on how fast they are appearing.


      • #4
        Thanks very much for suggesting rangerun. It is orders of magnitude faster than using loops.

        A follow-up question: is it possible to use a step of 6 instead of 1 between intervals? I could not find guidance on this in the rangerun help file.

        This is the code I'm currently using. It does everything the loop in #1 does, except that it moves in steps of 1 between intervals.

        Code:
        program define jhnsn_notrend
            if _N<120 exit
            tsset firmid timeid
            vecrank logMV logVF, lags(1)
            gen rank5_rr= e(k_ce95)
            gen obs_jhnsn_rr=e(N)
        end
        
        rangerun jhnsn_notrend, by(firmid) interval(timeid -120 0) use(logMV logVF firmid timeid) verbose
        Last edited by Noor Alshamma; 24 Feb 2022, 04:36.


        • #5
          There is no dedicated option on rangerun to step through your observations in a different increment.

          Instead, the more general principle for this is to use variables for your interval low and high values, and ensure that there are no observations in the interval they define except for those observations for which you want to run the command.

          Here's a simple example that I hope will start you on a useful path with your actual code. I have data from 1951 to 2022 and want to use 1956 (the first year with 5 lagged observations), 1962, ... .
          Code:
          // create example data starting in 1951
          local firstyear 1951
          set obs 72
          generate timeid = `firstyear'-1+_n
          generate value = _n
          
          // program to find mean of value within range
          capture program drop demo
          program define demo
              egen big = mean(value)
          end
          
          generate low  = cond(mod(timeid-`firstyear',6)==5,timeid-5,-1)
          generate high = cond(mod(timeid-`firstyear',6)==5,timeid,  -1)
          rangerun demo, interval(timeid low high) use(value) verbose
          list in 1/12, clean
          list if big!=., clean
          Code:
          . list in 1/12, clean
          
                 timeid   value    low   high   big  
            1.     1951       1     -1     -1     .  
            2.     1952       2     -1     -1     .  
            3.     1953       3     -1     -1     .  
            4.     1954       4     -1     -1     .  
            5.     1955       5     -1     -1     .  
            6.     1956       6   1951   1956   3.5  
            7.     1957       7     -1     -1     .  
            8.     1958       8     -1     -1     .  
            9.     1959       9     -1     -1     .  
           10.     1960      10     -1     -1     .  
           11.     1961      11     -1     -1     .  
           12.     1962      12   1957   1962   9.5  
          
          . list if big!=., clean
          
                 timeid   value    low   high    big  
            6.     1956       6   1951   1956    3.5  
           12.     1962      12   1957   1962    9.5  
           18.     1968      18   1963   1968   15.5  
           24.     1974      24   1969   1974   21.5  
           30.     1980      30   1975   1980   27.5  
           36.     1986      36   1981   1986   33.5  
           42.     1992      42   1987   1992   39.5  
           48.     1998      48   1993   1998   45.5  
           54.     2004      54   1999   2004   51.5  
           60.     2010      60   2005   2010   57.5  
           66.     2016      66   2011   2016   63.5  
           72.     2022      72   2017   2022   69.5  
          
          .
          Added in edit: I forgot to mention that I think the code in post #4 is using a 121-observation range, while the code in post #1 is using a 120-observation range.
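
          Applied to the panel in #1, the same idea might look like the sketch below. It is untested and rests on assumptions: rangerun is installed from SSC, timeid has no gaps within a firm, and the names wanted and jhnsn_step6 are made up for illustration. The window is keyed on the within-firm counter N so that it covers 120 observations ending at N = 120, 126, 132, ..., as in the loop in #1.

          ```stata
          * Sketch only: assumes rangerun (SSC) is installed and timeid has no gaps within firm
          bysort firmid (timeid): generate N = _n

          capture program drop jhnsn_step6
          program define jhnsn_step6
              if _N < 120 exit
              tsset firmid timeid
              vecrank logMV logVF, lags(1)
              generate rank5_rr = e(k_ce95)
              generate obs_jhnsn_rr = e(N)
          end

          * Rows where a window should end: N = 120, 126, 132, ...
          generate byte wanted = N >= 120 & mod(N - 120, 6) == 0

          * 120-observation window for wanted rows; -1 to -1 matches no
          * observation, so the program exits without results elsewhere
          generate low  = cond(wanted, N - 119, -1)
          generate high = cond(wanted, N, -1)

          rangerun jhnsn_step6, by(firmid) interval(N low high) use(logMV logVF firmid timeid)
          ```

          Keying on N rather than timeid mirrors the N<=`j' & N>`j'-120 condition in #1; if timeid is complete within each firm, keying on timeid with bounds timeid-119 and timeid should select the same observations.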


          Last edited by William Lisowski; 24 Feb 2022, 07:47.


          • #6
            Dear Mr. Nick Cox,

            I want to run the Johansen cointegration test for rolling-estimation windows. I have two variables: GDP and oil. I have tried the following command, but it gave me nothing.
            Code:
            rolling m =r(max statistic) s=r(critical value), window(100) clear : vecrank GDP oil, lags(5)
            Could you please help with this?


            • #7
              Sarah Magd I don't know much more about vecrank than I wrote in #3, and indeed I have never used it. But I glanced at the help, which underlines that it is an e-class command with documented e-class results.

              I don't know where the code in #6 comes from, unless it is an optimistic guess. As a detail, note that r-class results would never have names that include spaces.

              #4 is the pattern to follow. Pick up e-class results each time vecrank is executed.
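
              If rolling is preferred to rangerun, the call in #6 could perhaps be rewritten to collect e-class results instead. The line below is an untested sketch: it assumes the data have already been tsset, takes e(k_ce95) and e(N) from the code in #4, and uses rank5 and nobs as made-up names for the saved results.

              ```stata
              * Sketch only: assumes the data are tsset; rank5 and nobs are illustrative names.
              * The clear option replaces the data in memory with the rolling results.
              rolling rank5 = e(k_ce95) nobs = e(N), window(100) clear: vecrank GDP oil, lags(5)
              ```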


              • #8
                Dear Mr. Nick Cox
                Thanks a lot for your quick reply.

                The code in #4 is for a panel dataset. Since I have a time series dataset, I tried to change the code as below, but it gives me errors.

                Code:
                Gen t = _n
                Gen N = _n
                program define jhnsn_notrend
                if _N<120 exit
                tsset Date
                vecrank gdp income, lags(1)  
                gen rank5_rr= e(k_ce95)  
                gen obs_jhnsn_rr=e(N)
                end
                rangerun jhnsn_notrend, interval(t -120 0) use( gdp income t) verbose
                This gives me an error:
                time variable not set, use tsset varname ...


                Could you please help me with this?


                • #9
                  Pinging me because I commented earlier is only going to disappoint. Sorry, but as already said I don't use vecrank and a quick glance at this doesn't give me any ideas on what is wrong.
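
                  One possible cause of the error in #8, offered as a guess rather than a confirmed diagnosis: the program calls tsset Date, but Date is not in the use() list, so it may not be in memory when the program runs and vecrank then finds no time variable. Two further details: Stata commands are case-sensitive, so Gen would need to be gen or generate, and end has to sit on its own line. An untested sketch that uses the counter t as the time variable, assuming the data are sorted in time order and rangerun is installed (jhnsn_ts is a made-up program name):

                  ```stata
                  * Sketch only: assumes the data are sorted in time order with no missing periods
                  generate t = _n

                  capture program drop jhnsn_ts
                  program define jhnsn_ts
                      if _N < 120 exit
                      * t is listed in use(), so it is available to tsset here
                      tsset t
                      vecrank gdp income, lags(1)
                      generate rank5_rr = e(k_ce95)
                      generate obs_jhnsn_rr = e(N)
                  end

                  * -119 to 0 gives a 120-observation window, per the note in #5
                  rangerun jhnsn_ts, interval(t -119 0) use(gdp income t) verbose
                  ```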
