Dear Stata users,
I am trying to run a cross-sectional regression on firms in the bottom 30% and top 30% of the distribution of book-to-market value in panel data.
I tried to rank firms every year, but I can't identify the top/bottom 30% of them, because this is an unbalanced panel and the total number of firms differs from year to year.
I would be grateful if someone could help me to identify these firms each year.
Here's the code I use
Here's part of my data
Thank you for your help in advance!
Code:
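sort gvkey year
local i = 1964    // the time period is 1964-2014
while `i' <= 2014 {
    // btm is the book-to-market value; I have to find the firms in the top/bottom 30% of its distribution
    // restrict the percentile to the current year, otherwise it is computed over the whole sample
    quietly egen per70`i' = pctile(btm) if year == `i', p(70)
    // compare against the percentile variable (per70`i'), and only within the current year
    quietly drop if btm < per70`i' & year == `i'
    quietly drop per70`i'
    local i = `i' + 1
}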
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long gvkey double year float btm
 2403 1964 .16921677
 9103 1964 .16258383
 1608 1964 .18209743
10060 1964 .18040167
 1481 1964 .14549348
 4780 1964  .1590813
 3874 1964  .1805083
 3235 1964  .1769771
11535 1964 .17150614
 4475 1964  .1712105
 4453 1964 .12189302
 4021 1964 .18900825
11264 1964 .17115825
10878 1964 .19931643
 6502 1964 .18697643
 8645 1964 .16715898
 6113 1964 .16654503
 3489 1964  .1921034
11280 1964 .16220094
 9616 1964 .18040165
end
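One possible way to flag these firms without a loop and without dropping observations is sketched below. It assumes the cutoffs should be the 30th and 70th percentiles of btm computed separately within each year, so a different number of firms per year is not a problem. The variable names p30, p70, bottom30 and top30 are illustrative only, and the small dataset is made up purely so that the block runs on its own.

```stata
* Toy unbalanced panel (made-up values): 10 firms in 1964, 5 firms in 1965
clear
input long gvkey int year float btm
 1 1964  .1
 2 1964  .2
 3 1964  .3
 4 1964  .4
 5 1964  .5
 6 1964  .6
 7 1964  .7
 8 1964  .8
 9 1964  .9
10 1964 1.0
 1 1965  .2
 2 1965  .4
 3 1965  .6
 4 1965  .8
 5 1965 1.0
end

* Year-specific 30th and 70th percentiles of btm (p30 and p70 are hypothetical names)
bysort year: egen p30 = pctile(btm), p(30)
bysort year: egen p70 = pctile(btm), p(70)

* Indicator variables; the -if- qualifier leaves them missing when btm is missing,
* because Stata treats missing values as larger than any number
generate byte bottom30 = btm <= p30 if !missing(btm)
generate byte top30    = btm >= p70 if !missing(btm)

* Check how many firms fall in each group per year
tabulate year bottom30
tabulate year top30
```

A cross-sectional regression can then be restricted with a qualifier such as `if top30 == 1 & year == 1964`, keeping the full panel in memory instead of dropping firms year by year.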