Hi everyone
Below is an example of my dataset:
My goal is to keep ONE wd observation per date in my dataset, based on the following filters:
(1) highest oi, and to make sure that I end up with only 1 wd observation on each date I add filter (2)
(2) keep observation whose mon is closest to 0.95
My code to apply (1) and (2) are as follows:
I have a variable T that actually seperates my data into different sub-samples. As you will note there are four sub-samples, i.e. M1-M4. I also have an additional sub-sample which is Drop60.
My question is how do I iterate over ther different sub-samples to end up with one single observation of wd, after applying the filters in (1) and (2).
The end goal would be columns that record the wd values in the different m groups (M1_wd, M2_wd..., M4_wd) and an additional column for Drop60_wd, after applying the above filters.
I would highly appreciate your help with the above.
I am worried if I duplicate my files to re-work with different m sub-samples I might end up with making mistakes in my coding. A loop would help me a lot minimise the risk of making mistakes.
Thanks you in advance.
Regards
Parvesh
Below is an example of my dataset:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input float date long(vol oi) float(wd mon) long T float Drop60 13179 0 2023 .293226 .872614 1 1 13179 0 1091 .267362 .8883368 1 1 13179 201 3432 .250134 .8961982 1 1 13179 158 1594 .254443 .9040596 1 1 13179 75 7687 .236026 .911921 1 1 13179 619 5676 .225129 .9197823 1 1 13179 7112 5645 .215459 .9276438 1 1 13179 3545 5579 .200987 .9355052 1 1 13179 4485 3735 .191604 .9433665 1 1 13179 40 7685 .24853 .8254457 1 1 13179 0 1292 .225927 .8333071 1 1 13179 0 66 .215733 .8411685 1 1 13179 500 9800 .205591 .8490299 1 1 13179 150 1826 .199856 .8647527 1 1 13179 0 1554 .185173 .872614 1 1 13179 0 5542 .182162 .8804754 1 1 13179 800 5379 .182713 .8883368 1 1 13179 527 794 .182141 .8961982 1 1 13179 340 9595 .18201 .9040596 1 1 13179 372 6825 .175178 .911921 1 1 13179 21 2388 .16498 .9197823 1 1 13179 603 527 .158701 .9276438 1 1 13179 258 5211 .152936 .9355052 1 1 13179 991 9803 .147164 .9433665 1 1 13179 17 684 .167134 .8647527 2 . 13179 1249 2535 .148034 .9040596 2 . 13179 15 9344 .154372 .911921 2 . 13179 146 8999 .143395 .9276438 2 . 13179 80 3754 .135121 .9433665 2 . 13179 0 4270 .171407 .8254457 2 . 13179 17 2678 .154186 .8647527 2 . 13179 0 7022 .152285 .8804754 2 . 13179 0 8496 .145116 .8883368 2 . 13179 100 5630 .142764 .8961982 2 . 13179 25 6954 .138713 .9040596 2 . 13179 13 1072 .135937 .911921 2 . 13179 12 3845 .134019 .9197823 2 . 13179 57 8082 .1312 .9276438 2 . 13179 515 1967 .127598 .9355052 2 . 13179 838 7632 .128123 .9433665 2 . 13179 0 931 .162546 .8175843 3 . 13179 5 1651 .160583 .8254457 3 . 13179 0 584 .14889 .8647527 3 . 13179 0 2328 .145014 .8804754 3 . 13179 5 3479 .138267 .9040596 3 . 13179 1 1944 .136023 .911921 3 . 13179 0 1856 .133794 .9197823 3 . 13179 1107 4192 .132004 .9276438 3 . 13179 300 335 .129636 .9355052 3 . 13179 1224 9701 .126738 .9433665 3 . 13179 36 8304 .140659 .9040596 3 . 13179 237 5980 .133874 .9433665 3 . 13179 0 1547 .158981 .8254457 4 . 13179 0 3408 .151112 .8647527 4 . 13179 1 1818 .142769 .9040596 4 . 13179 4 3151 .134264 .9433665 4 . 13179 0 21 .159274 .8254457 4 . 13179 0 784 .158516 .8647527 4 . 13179 0 1 .150619 .9040596 4 . 13179 5 228 .142159 .9433665 4 . end format %td date label values T m label def m 1 "M1", modify label def m 2 "M2", modify label def m 3 "M3", modify label def m 4 "M4", modify
(1) highest oi, and to make sure that I end up with only 1 wd observation on each date I add filter (2)
(2) keep observation whose mon is closest to 0.95
My code to apply (1) and (2) are as follows:
Code:
// Main Filter: Keep Highest Open Interest sort date by date: egen max_oi = max(oi) keep if oi == max_oi // Additional Filter To Keep Single Observation Per Date // Choose Contract Closest to 0.95 gen diff = abs(mon - 0.95) sort date by date: egen min_diff = min(diff) keep if diff==min_diff keep date wd
My question is how do I iterate over ther different sub-samples to end up with one single observation of wd, after applying the filters in (1) and (2).
The end goal would be columns that record the wd values in the different m groups (M1_wd, M2_wd..., M4_wd) and an additional column for Drop60_wd, after applying the above filters.
I would highly appreciate your help with the above.
I am worried if I duplicate my files to re-work with different m sub-samples I might end up with making mistakes in my coding. A loop would help me a lot minimise the risk of making mistakes.
Thanks you in advance.
Regards
Parvesh
Comment