More efficient way to run vecrank on firm-level rolling windows

Noor Alshamma

Join Date: Feb 2022

Posts: 2
#1


23 Feb 2022, 03:59

Hi,

I'm working with panel data and want to run vecrank at the firm level over rolling windows, storing some postestimation results as new variables. My data have the following format:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(firmid timeid logMV logVF)
1 218 4.1860685   4.30219
1 219   4.19654 4.5676622
1 220  4.522541 4.5608683
1 221 4.4920044   4.56502
1 222 4.4122176   4.57979
1 223 4.3952684 4.5846977
1 224 4.3426247  4.598754
1 225  4.258428 4.6033964
1 226  4.378387 4.6064277
1 227  4.351954  4.611788
end

I'm currently using the following code:

Code:

bysort firmid: gen N=_n
gen rank5=.
gen obs_johns=.
levelsof firmid, local(firmids)
foreach i in `firmids' {
    sum N if firmid==`i'
    local n=r(min)+119
    local p=r(max)
    forvalues j=`n'(6)`p' {
        vecrank logMV logVF if firmid==`i' & N<=`j' & N>`j'-120, lags(1)
        replace rank5=e(k_ce95) if firmid==`i' & N==`j'
        replace obs_johns=e(N) if firmid==`i' & N==`j'
    }
}

The code works, but it takes a very long time to run. I'm aware that foreach and forvalues loops tend to be slow. I came across rangestat, which is much faster, but it does not support the vecrank command, so I would need to write the command myself in Mata, and I have zero experience with Mata.

Do you have any advice or suggestions?

Thanks,
Noor

Last edited by Noor Alshamma; 23 Feb 2022, 04:01.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

23 Feb 2022, 10:30

Perhaps the community-contributed rangerun command, written by Robert Picard and Nick Cox, two of the authors of rangestat, and also available from SSC, will accomplish what you want.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

23 Feb 2022, 11:04

Thanks for the mention in #2 but my guess is that almost all problems here arise from using vecrank and using it many times. But just possibly rangerun or rolling would reduce the bookkeeping a little.

Since vecrank results are being displayed one by one, there should be some handle on how fast they are appearing.
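One way to get that handle is to time a single call and scale up by the number of windows. This is a minimal sketch using Stata's timer commands; it assumes the counter N from #1 already exists and that firm 1 has at least 120 observations.

```stata
* Sketch only: time one vecrank call on a single 120-observation window.
* Assumes N was created as in #1 and that firmid 1 has at least 120 rows.
timer clear 1
timer on 1
quietly vecrank logMV logVF if firmid==1 & N<=120, lags(1)
timer off 1
timer list 1
* r(t1) holds the elapsed seconds for timer 1
display "seconds per vecrank call: " r(t1)
```

Multiplying that figure by the number of firm-window combinations gives a rough lower bound on the total run time, whatever looping machinery is used.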
Noor Alshamma

Join Date: Feb 2022

Posts: 2
#4

24 Feb 2022, 04:05

Thanks very much for suggesting rangerun. It is orders of magnitude faster than using loops.

A follow-up question: is it possible to choose a step of 6 instead of 1 between intervals? I could not find guidance on this in the rangerun help file.

This is the code I'm currently using, which does everything the loop does, except that it steps by 1 between intervals.

Code:

program define jhnsn_notrend
    if _N<120 exit
    tsset firmid timeid
    vecrank logMV logVF, lags(1)
    gen rank5_rr=e(k_ce95)
    gen obs_jhnsn_rr=e(N)
end

rangerun jhnsn_notrend, by(firmid) interval(timeid -120 0) use(logMV logVF firmid timeid) verbose

Last edited by Noor Alshamma; 24 Feb 2022, 04:36.

William Lisowski

Join Date: Dec 2014
Posts: 10150
#5

24 Feb 2022, 07:28

There is no dedicated option on rangerun to step through your observations in a different increment.

Instead, the more general principle for this is to use variables for your interval low and high values, and ensure that there are no observations in the interval they define except for those observations for which you want to run the command.

Here's a simple example that I hope will start you on a useful path with your actual code. I have data from 1951 to 2022 and want to use 1956 (the first year with 5 lagged observations), 1962, ... .

Code:

// create example data starting in 1951
local firstyear 1951
set obs 72
generate timeid = `firstyear'-1+_n
generate value = _n

// program to find mean of value within range
capture program drop demo
program define demo
    egen big = mean(value)
end

generate low  = cond(mod(timeid-`firstyear',6)==5,timeid-5,-1)
generate high = cond(mod(timeid-`firstyear',6)==5,timeid,  -1)
rangerun demo, interval(timeid low high) use(value) verbose
list in 1/12, clean
list if big!=., clean

Code:

. list in 1/12, clean

       timeid   value    low   high   big  
  1.     1951       1     -1     -1     .  
  2.     1952       2     -1     -1     .  
  3.     1953       3     -1     -1     .  
  4.     1954       4     -1     -1     .  
  5.     1955       5     -1     -1     .  
  6.     1956       6   1951   1956   3.5  
  7.     1957       7     -1     -1     .  
  8.     1958       8     -1     -1     .  
  9.     1959       9     -1     -1     .  
 10.     1960      10     -1     -1     .  
 11.     1961      11     -1     -1     .  
 12.     1962      12   1957   1962   9.5  

. list if big!=., clean

       timeid   value    low   high    big  
  6.     1956       6   1951   1956    3.5  
 12.     1962      12   1957   1962    9.5  
 18.     1968      18   1963   1968   15.5  
 24.     1974      24   1969   1974   21.5  
 30.     1980      30   1975   1980   27.5  
 36.     1986      36   1981   1986   33.5  
 42.     1992      42   1987   1992   39.5  
 48.     1998      48   1993   1998   45.5  
 54.     2004      54   1999   2004   51.5  
 60.     2010      60   2005   2010   57.5  
 66.     2016      66   2011   2016   63.5  
 72.     2022      72   2017   2022   69.5  

.

Added in edit: I forgot to mention that I think the code in post #4 uses a 121-observation range, while the code in post #1 uses a 120-observation range.
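Putting the two points together for the firm-level problem might look like the sketch below. It is untested against the real data and rests on assumptions: timeid is consecutive within each firm (no gaps), timeid never takes the value -1, and the flag variable runhere is a name made up for illustration. The interval timeid-119 through timeid is inclusive at both ends, so it covers 120 periods, matching #1.

```stata
* Sketch only: 120-period windows evaluated every 6th observation per firm.
capture program drop jhnsn_notrend
program define jhnsn_notrend
    if _N<120 exit
    tsset firmid timeid
    vecrank logMV logVF, lags(1)
    generate rank5_rr=e(k_ce95)
    generate obs_jhnsn_rr=e(N)
end

* position of each observation within its firm, in time order
bysort firmid (timeid): generate N=_n

* run only at N = 120, 126, 132, ... as in the loop of #1
generate byte runhere = N>=120 & mod(N-120,6)==0

* elsewhere point the interval at -1, where no observations exist
generate low  = cond(runhere, timeid-119, -1)
generate high = cond(runhere, timeid,     -1)

rangerun jhnsn_notrend, by(firmid) interval(timeid low high) use(logMV logVF firmid timeid)
```

Observations where runhere is 0 end up with missing values in rank5_rr and obs_jhnsn_rr, which mirrors what the original loop leaves behind.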

Last edited by William Lisowski; 24 Feb 2022, 07:47.


Sarah Magd

Join Date: Feb 2022

Posts: 60
#6

21 Jan 2023, 10:27

Dear Mr. Nick Cox,

I want to run the Johansen cointegration test for rolling-estimation windows. I have two variables: GDP and oil. I have tried the following command, but it gave me nothing.

Code:

rolling m =r(max statistic) s=r(critical value), window(100) clear : vecrank GDP oil, lags(5)

Could you please help with this?
Nick Cox

Join Date: Mar 2014

Posts: 35698
#7

22 Jan 2023, 05:41

Sarah Magd, I don't know much more about vecrank than I wrote in #3, and indeed I have never used it. But I glanced at the help, which underlines that it is an e-class command with documented e-class results.

I don't know where the code in #6 comes from, unless it was an optimistic guess. As a detail, note that r-class results never have names that include spaces.

#4 is the pattern to follow. Pick up e-class results each time vecrank is executed.
Sarah Magd

Join Date: Feb 2022

Posts: 60
#8

22 Jan 2023, 06:27

Dear Mr. Nick Cox
Thanks a lot for your quick reply.

The code in #4 is for a panel dataset. Since I have a time series dataset, I tried to change the code as below, but it gives me errors.

Code:

Gen t = _n
Gen N = _n

program define jhnsn_notrend
    if _N<120 exit
    tsset Date
    vecrank gdp income, lags(1)
    gen rank5_rr=e(k_ce95)
    gen obs_jhnsn_rr=e(N)
end

rangerun jhnsn_notrend, interval(t -120 0) use( gdp income t) verbose

This gives me an error:
time variable not set, use tsset varname ...

Could you please help me with this?
Nick Cox

Join Date: Mar 2014

Posts: 35698
#9

23 Jan 2023, 03:11

Pinging me because I commented earlier is only going to disappoint. Sorry, but as already said I don't use vecrank and a quick glance at this doesn't give me any ideas on what is wrong.
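One possible reading of the error in #8, offered as a guess rather than a diagnosis: rangerun hands the program only the variables listed in use(), so tsset Date cannot find Date, and vecrank then sees data with no time variable set. A minimal sketch under that assumption follows. It uses the counter t as the time variable, which presumes the data are already sorted in time order with no gaps; the program name jhnsn_ts is made up for illustration.

```stata
* Sketch only: single time series, 120-observation rolling windows.
* Assumes gdp and income are numeric and the data are in time order.
* Skip the next line if t already exists.
generate t = _n

capture program drop jhnsn_ts
program define jhnsn_ts
    if _N<120 exit
    * t is listed in use(), so it is available inside the program
    tsset t
    vecrank gdp income, lags(1)
    generate rank5_rr=e(k_ce95)
    generate obs_jhnsn_rr=e(N)
end

* interval(t -119 0) spans t-119 through t inclusive: 120 observations
rangerun jhnsn_ts, interval(t -119 0) use(gdp income t) verbose
```

Note also that Stata commands are case-sensitive, so generate (or gen) must be typed in lowercase.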
