Dear reader,
I have a large panel dataset on a few thousand firms' stock returns and share issuance. I'd like to compare the stock returns of firms that issue a lot of shares with those of firms that don't.
To do so, I've sorted my data by my variable 'netissue' and created decile rankings. For each year between 1980 and 2018, my variable netissue_decile thus gives a rank from 1 to 10 denoting whether that specific firm issued few or many shares in the relevant year.
I want to compare the average return over the next 5 periods of a portfolio consisting of stocks where netissue_decile == 1 with that of a portfolio consisting only of firms in the tenth decile.
A major complication is that a firm can be in the first decile in e.g. 1990 and then again in 2010. So when I ask what the mean return is three years after a firm was in the first decile of share issuers, the returns from both 1993 and 2013 need to be included.
What I've tried so far is to use bysort and loops, but to no avail. My best attempt yet was to create a variable called years_since that measures the number of years since a firm was in the first or tenth decile. My plan was then to take the mean of all returns for years_since 1 through 5. I tried to do this:
Code:
    generate years_since = .
    local i = 1
    local N = _N
    while `i' < `N' {
        if netissue_annual_decile == 1 | netissue_annual_decile == 10 {
            replace years_since = 1
            replace years_since = 2 in `i'+1
            replace years_since = 3 in `i'+2
            replace years_since = 4 in `i'+3
            replace years_since = 5 in `i'+4
        }
        `i' = `i' + 1
    }
Unfortunately, Stata throws the error "'1+1' invalid observation number", so it doesn't recognize `i'+1 as a valid observation number.
Secondly, I think the loop approach ignores that the [i + 4] observation might actually belong to a different ID within the panel.
Does anyone know of a good way to tackle this problem? A code example would of course be nice but is not necessary; a good strategy to follow would already help.
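One possible strategy, sketched here only as an illustration, is to avoid observation numbers altogether and use Stata's time-series lag operator after declaring the panel with xtset. The names id (a numeric firm identifier), year and ret (the annual return) are assumptions and do not come from the data described above; netissue_annual_decile is taken from the attempt above. The sketch also assumes one observation per firm-year.

```stata
* Assumed names (hypothetical): id = numeric firm identifier,
* year = calendar year, ret = annual stock return.
* Declaring the panel lets L#. operators work within each firm.
xtset id year

* L`k'.x is the value of x for the same firm k years earlier.
* It is missing when that firm-year does not exist, so a lag never
* reaches into another firm's observations.
forvalues k = 1/5 {
    * Mean return k years after a first-decile year
    summarize ret if L`k'.netissue_annual_decile == 1, meanonly
    local m1 = r(mean)

    * Mean return k years after a tenth-decile year
    summarize ret if L`k'.netissue_annual_decile == 10, meanonly
    local m10 = r(mean)

    display "k = `k': decile 1 = " %9.4f `m1' "   decile 10 = " %9.4f `m10'
}
```

Because the condition is evaluated for every firm-year, a firm that was in the first decile in both 1990 and 2010 contributes its 1993 and 2013 returns when k is 3. Note that this gives a pooled, equal-weighted mean over firm-years, which may or may not be the portfolio return that is ultimately wanted.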