Measuring the average of variables in subsequent periods

Jesse Tielens

#1

14 Jul 2018, 11:10

Dear reader,

I have a large panel dataset on a few thousand firms' stock returns and share issuance. I'd like to compare the stock returns of firms that issue a lot of shares with those of firms that don't.
In order to do so, I've sorted my data by my variable 'netissue' and created decile rankings. For each year between 1980 and 2018, my variable netissue_decile thus gives a rank from 1 to 10 denoting whether that specific firm issued few or many shares in the relevant year.

I want to compare the average return over the next 5 periods of a portfolio consisting of stocks where netissue_decile == 1 with that of a portfolio consisting only of firms in the tenth decile.
A major complication is that a firm can be in the first decile in, e.g., 1990 and then again in 2010. So when I ask what the mean return is three years after a firm was in the first decile of share issuers, both the 1993 and the 2013 returns need to be included.

So far I've tried bysort and loops, but to no avail. My best attempt was to create a variable called years_since that measures the number of years since a firm was in the first or tenth decile. My plan was then to take the mean of all returns for each value of years_since from 1 to 5. I tried this:

Code:

generate years_since = .
local i = 1
local N = _N
while `i' <`N' {
    if netissue_annual_decile == 1 | netissue_annual_decile == 10 {
        replace years_since = 1
        replace years_since = 2 in `i'+1
        replace years_since = 3 in `i'+2
        replace years_since = 4 in `i'+3
        replace years_since = 5 in `i'+4
    }
    `i' = `i' + 1
}

But unfortunately, STATA throws the error "'1+1' invalid observation number", so it doesn't recognize `i'+1 as a valid observation number.
Secondly, I think the approach with a loop will ignore that the [i + 4] observation might actually belong to a different ID within the panel.

Does anyone know of a good way to tackle this problem? A code example would of course be nice, but is not necessary at all; I'm mainly looking for a good strategy to use.

Last edited by Jesse Tielens; 14 Jul 2018, 11:12.
William Lisowski

#2

14 Jul 2018, 13:55

Welcome to Statalist.

It seems to me that the approach you need to take is to calculate, for each firm and each year, the average return over the next 5 years, and then select the combinations of firm and year that you want to build your portfolios from. You talk about a firm being in the first decile in 1990, then falling out of it before returning in 2010. What about a firm that is in the tenth decile for several years in a row?
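As a rough illustration of the "for each firm and each year" idea using only built-in commands, the sketch below relies on Stata's lead operator after declaring the panel. The names firm_id (a numeric panel identifier), year, and ret (the annual return) are assumptions and do not appear in the original post; netissue_annual_decile is the decile variable from the posted code. Treat it as one possible starting point, not a finished analysis.

```stata
* Assumed names: firm_id (numeric panel id), year, ret (annual return).
* netissue_annual_decile is the decile variable from the original post.
xtset firm_id year

* F#. is the lead operator: F3.ret is the same firm's return 3 years ahead.
* It is missing when that firm-year is not in the data, so it never
* reaches across into another firm's observations.
forvalues k = 1/5 {
    generate ret_f`k' = F`k'.ret
}

* Mean return 1 to 5 years after the ranking year, for the extreme deciles
tabstat ret_f1-ret_f5 if inlist(netissue_annual_decile, 1, 10), ///
    by(netissue_annual_decile) statistics(mean)
```

Because every firm-year in decile 1 contributes its own leads, a firm that was in the first decile in both 1990 and 2010 contributes both its 1993 and its 2013 return to the three-year-ahead mean.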

The rangestat command, a user-written command available from SSC, will assist you in your calculations. In the output of the command search rangestat you'll see rangestat listed; click on the link to open a description, which includes a link to install it.
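A minimal sketch of how rangestat might be applied here follows. The variable names firm_id, year, and ret are assumptions, since no data example was posted, and the names of the generated variables (ret_mean, ret_count) reflect rangestat's default naming as I understand it; check help rangestat after installing.

```stata
* One-time installation from SSC
ssc install rangestat

* Assumed names: firm_id, year, ret (annual return).
* For each firm-year, average ret over the same firm's observations whose
* year lies in [year + 1, year + 5]; also count how many were found.
rangestat (mean) ret (count) ret, interval(year 1 5) by(firm_id)

* Compare the forward-looking averages across the two extreme deciles
tabstat ret_mean if inlist(netissue_annual_decile, 1, 10), ///
    by(netissue_annual_decile) statistics(mean n)
```

The count is worth keeping because firm-years near the end of the sample have fewer than five future observations, and you may want to exclude or treat those separately.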

For the reason you discuss, the approach you show in your post will not work. But to add to your Stata knowledge, the problem that Stata considered an error is due to the fact that after the value of `i' is substituted into the commands, your commands read like

Code:

replace years_since = 2 in 1+1

and "1+1" is not a number; in this context, it is an expression that needs to be evaluated.

Code:

replace years_since = 2 in `=`i'+1'

would evaluate the expression. But again, don't go down that path.
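For anyone who wants to see the difference between substituting a macro and evaluating an expression, here is a small self-contained illustration. It is unrelated to the panel problem itself and only demonstrates the macro behaviour described above.

```stata
local i = 1

display "`i'+1"       // the macro is substituted as text: shows 1+1
display `=`i'+1'      // `=exp' is evaluated before the command runs: shows 2

* Incrementing a local macro requires the local command
local i = `i' + 1     // equivalently: local ++i
display `i'           // shows 2
```

The last part also explains why a bare line such as `i' = `i' + 1 fails: after substitution it reads 1 = 1 + 1, which is not a Stata command.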

If you need help using rangestat, post again, but this time do follow the guidance in the Statalist FAQ to provide a useful sample of the relevant data using the dataex command.
Jesse Tielens

#3

14 Jul 2018, 14:35

Thanks for your extensive answer, William. I've noticed that it's mostly you and a few other seasoned members who answer a lot of the questions in this forum; the help is definitely appreciated!

I will look into rangestat and try to implement it in my do-file. As far as I can see from the help file, it does exactly what I'm looking for.

"What about a firm that is in the tenth decile for several years in a row?"

I was wondering about this issue myself. There are definitely occurrences of firms being in the first or tenth decile several times within five years. But since the choice of whether and how to include these years is more about research methodology than about Stata, I chose not to include this issue in my post.

"If you need help using rangestat, post again, but this time do follow the guidance in the Statalist FAQ to provide a useful sample of the relevant data using the dataex command."

I will do so next time, and I'll call it Stata, not 'STATA'.