Hi everybody,
My aim is to sort stocks (identified by permno) into 10 portfolios based on the volatility of their residuals (monthidio). I have to repeat the sorting at the end of every month. I call the variable "monthidio" because I have already reduced the complete dataset to a smaller one that contains only the end-of-month dates (as you can see). To sort the stocks into 10 portfolios based on monthidio on every date, I run:
egen voladecile= xtile(monthidio), by(date) nq(10)
which gives me the error message "too many values". This is very strange to me, because the same command worked on a similar dataset that contained even more values of both date and monthidio. Do you have any idea why this happens and how to solve the issue?
I have already read https://www.stata.com/statalist/arch.../msg00365.html and https://www.statalist.org/forums/for...y-values-error but they don't seem to fit my case exactly: I would like to avoid loops, and I have missing values in my variable monthidio, so my situation differs from those two cases.
Thank you in advance
This is the dataset that I have:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long permno float date double Return long numboftrades float(MeR RF MEt Exret) double firstbeta float(monthidio dummyendofmonth)
10000  9527 -.014085 .  .0092 .00025     16100 -.014335   2.5143529763022494    .06413759 1
10000  9555        0 .  .0019 .00028     11960  -.00028    .9500515549985641    .03131281 1
10000  9586  .007092 .  .0006  .0003     16330  .006792     .975191955026165    .04484615 1
10000  9616 -.015385 .  -.019 .00024     15172 -.015625    .5980838126958284   .013047387 1
10000  9646  .015306 . -.0013 .00023 11793.878  .015076    .2896041894082146    .03917417 1
10000  9677  .010204 .  .0052 .00025 11734.594  .009954   .38846733224656743   .019607043 1
10000  9708  .096386 . -.0012 .00024 10786.344  .096146    .5768399748387934    .04623256 1
10000  9737        0 .  .0004 .00022 4148.5938  -.00022    .3954379265009734      .096399 1
10000  9769  .015385 .  .0065 .00021  3911.531  .015175    .5341150474454677    .04805862 1
10000  9800        0 .  .0005  .0002  3002.344   -.0002    .6142194073333234    .04291996 1
10000  9828        0 .  .0019 .00021  3182.504  -.00021    .5121709319162139    .03591325 1
10000  9861        0 . -.0037 .00022  1981.566  -.00022    .6205842421862775    .04912256 1
10000  9891        0 . -.0002  .0002 1581.5313   -.0002    .4051215011366245    .03017019 1
10000  9919 -.071429 .  .0037 .00023 1581.5313 -.071659    .3677733912958665   .025082354 1
10000  9951 -.111111 .  .0069 .00021    973.25 -.111321   .01944200731215832    .05910159 1
10000  9981  .153846 .  .0112 .00021  912.4413  .153636   .10909586043253999    .04275477 1
10000 10010        0 . -.0004 .00019  851.5938  -.00019  -.20707865128488628    .01545156 1
10000 10024        . .  .0086 .00022         .        .                    .  .0011192125 1
10001  9527  .010309 .  .0092 .00025  6033.125  .010059     .807071924428063   .011264178 1
10001  9555 -.019608 .  .0019 .00028   6156.25 -.019888     .272703714704386    .00933914 1
end
format %td date
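For reference, the per-date decile sort described above can also be approximated without xtile(), using only the built-in egen functions rank() and count(). This is a minimal sketch, not a verified fix for the "too many values" error. The names n_valid, rnk and voladecile_alt are hypothetical, ties are broken arbitrarily, and the group boundaries need not match those of xtile() exactly. No loop is involved, and observations with missing monthidio simply end up with a missing group.

Code:
```stata
* Sketch only: n_valid, rnk and voladecile_alt are illustrative names, not from the original post

* Number of non-missing monthidio values on each date
egen n_valid = count(monthidio), by(date)

* Within-date rank of monthidio (1 = lowest); -unique- breaks ties arbitrarily
egen rnk = rank(monthidio), by(date) unique

* Map ranks 1..n_valid to groups 1..10; missing monthidio yields a missing group
gen byte voladecile_alt = ceil(10 * rnk / n_valid)
```

On a date with fewer than 10 non-missing values (as in the short example above), some of the 10 groups will necessarily be empty.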