Create new variable - help with loop

Parvesh Seeballack

Join Date: May 2017
Posts: 41

Create new variable - help with loop

19 Feb 2022, 05:07

Hi everyone

Below is an example of my dataset:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float date long(vol oi) float(wd mon) long T float Drop60
13179    0 2023 .293226  .872614 1 1
13179    0 1091 .267362 .8883368 1 1
13179  201 3432 .250134 .8961982 1 1
13179  158 1594 .254443 .9040596 1 1
13179   75 7687 .236026  .911921 1 1
13179  619 5676 .225129 .9197823 1 1
13179 7112 5645 .215459 .9276438 1 1
13179 3545 5579 .200987 .9355052 1 1
13179 4485 3735 .191604 .9433665 1 1
13179   40 7685  .24853 .8254457 1 1
13179    0 1292 .225927 .8333071 1 1
13179    0   66 .215733 .8411685 1 1
13179  500 9800 .205591 .8490299 1 1
13179  150 1826 .199856 .8647527 1 1
13179    0 1554 .185173  .872614 1 1
13179    0 5542 .182162 .8804754 1 1
13179  800 5379 .182713 .8883368 1 1
13179  527  794 .182141 .8961982 1 1
13179  340 9595  .18201 .9040596 1 1
13179  372 6825 .175178  .911921 1 1
13179   21 2388  .16498 .9197823 1 1
13179  603  527 .158701 .9276438 1 1
13179  258 5211 .152936 .9355052 1 1
13179  991 9803 .147164 .9433665 1 1
13179   17  684 .167134 .8647527 2 .
13179 1249 2535 .148034 .9040596 2 .
13179   15 9344 .154372  .911921 2 .
13179  146 8999 .143395 .9276438 2 .
13179   80 3754 .135121 .9433665 2 .
13179    0 4270 .171407 .8254457 2 .
13179   17 2678 .154186 .8647527 2 .
13179    0 7022 .152285 .8804754 2 .
13179    0 8496 .145116 .8883368 2 .
13179  100 5630 .142764 .8961982 2 .
13179   25 6954 .138713 .9040596 2 .
13179   13 1072 .135937  .911921 2 .
13179   12 3845 .134019 .9197823 2 .
13179   57 8082   .1312 .9276438 2 .
13179  515 1967 .127598 .9355052 2 .
13179  838 7632 .128123 .9433665 2 .
13179    0  931 .162546 .8175843 3 .
13179    5 1651 .160583 .8254457 3 .
13179    0  584  .14889 .8647527 3 .
13179    0 2328 .145014 .8804754 3 .
13179    5 3479 .138267 .9040596 3 .
13179    1 1944 .136023  .911921 3 .
13179    0 1856 .133794 .9197823 3 .
13179 1107 4192 .132004 .9276438 3 .
13179  300  335 .129636 .9355052 3 .
13179 1224 9701 .126738 .9433665 3 .
13179   36 8304 .140659 .9040596 3 .
13179  237 5980 .133874 .9433665 3 .
13179    0 1547 .158981 .8254457 4 .
13179    0 3408 .151112 .8647527 4 .
13179    1 1818 .142769 .9040596 4 .
13179    4 3151 .134264 .9433665 4 .
13179    0   21 .159274 .8254457 4 .
13179    0  784 .158516 .8647527 4 .
13179    0    1 .150619 .9040596 4 .
13179    5  228 .142159 .9433665 4 .
end
format %td date
label values T m
label def m 1 "M1", modify
label def m 2 "M2", modify
label def m 3 "M3", modify
label def m 4 "M4", modify

My goal is to keep ONE wd observation per date in my dataset, based on the following filters:

(1) highest oi, and to make sure that I end up with only 1 wd observation on each date I add filter (2)
(2) keep observation whose mon is closest to 0.95

My code to apply (1) and (2) are as follows:

Code:

// Main Filter: Keep Highest Open Interest
sort date
by date: egen max_oi = max(oi)
keep if oi == max_oi

// Additional Filter To Keep Single Observation Per Date
// Choose Contract Closest to 0.95
gen diff = abs(mon - 0.95)
sort date
by date: egen min_diff = min(diff)
keep if diff==min_diff

keep date wd

I have a variable T that actually seperates my data into different sub-samples. As you will note there are four sub-samples, i.e. M1-M4. I also have an additional sub-sample which is Drop60.

My question is how do I iterate over ther different sub-samples to end up with one single observation of wd, after applying the filters in (1) and (2).

The end goal would be columns that record the wd values in the different m groups (M1_wd, M2_wd..., M4_wd) and an additional column for Drop60_wd, after applying the above filters.

I would highly appreciate your help with the above.

I am worried if I duplicate my files to re-work with different m sub-samples I might end up with making mistakes in my coding. A loop would help me a lot minimise the risk of making mistakes.

Thanks you in advance.

Regards
Parvesh

Tags: None

William Lisowski

Join Date: Dec 2014
Posts: 10150

19 Feb 2022, 09:59

I don't totally understand what you want, but I think this example may start you in a useful direction.

Code:

// Main Filter: Keep Highest Open Interest
sort T date
by T date: egen max_oi = max(oi)
keep if oi == max_oi

// Additional Filter To Keep Single Observation Per Date
// Choose Contract Closest to 0.95
gen diff = abs(mon - 0.95)
sort T date
by T date: egen min_diff = min(diff)

forvalues t=1/4 {
    generate M`t'_wd = T==`t' & diff==min_diff 
}

list T date wd M*_wd

Code:

. list T date wd M*_wd

     +----------------------------------------------------------+
     |  T        date        wd   M1_wd   M2_wd   M3_wd   M4_wd |
     |----------------------------------------------------------|
  1. | M1   31jan1996   .147164       1       0       0       0 |
  2. | M2   31jan1996   .154372       0       1       0       0 |
  3. | M3   31jan1996   .126738       0       0       1       0 |
  4. | M4   31jan1996   .151112       0       0       0       1 |
     +----------------------------------------------------------+

Comment

Parvesh Seeballack

Join Date: May 2017

Posts: 41
#3

19 Feb 2022, 11:03

Dear William Lisowski

Thank you for replying to my query.

The example that you provided is exactly what I was looking for.

I knew that I had to use forvalues but didn't totally understand how to do so. With your code I finally have some insights on how to proceed with my analysis.

Thanks again.

Regards
Parvesh
Comment

Announcement

Create new variable - help with loop

Comment

Comment