Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create new variable - help with loop

    Hi everyone

    Below is an example of my dataset:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float date long(vol oi) float(wd mon) long T float Drop60
    13179    0 2023 .293226  .872614 1 1
    13179    0 1091 .267362 .8883368 1 1
    13179  201 3432 .250134 .8961982 1 1
    13179  158 1594 .254443 .9040596 1 1
    13179   75 7687 .236026  .911921 1 1
    13179  619 5676 .225129 .9197823 1 1
    13179 7112 5645 .215459 .9276438 1 1
    13179 3545 5579 .200987 .9355052 1 1
    13179 4485 3735 .191604 .9433665 1 1
    13179   40 7685  .24853 .8254457 1 1
    13179    0 1292 .225927 .8333071 1 1
    13179    0   66 .215733 .8411685 1 1
    13179  500 9800 .205591 .8490299 1 1
    13179  150 1826 .199856 .8647527 1 1
    13179    0 1554 .185173  .872614 1 1
    13179    0 5542 .182162 .8804754 1 1
    13179  800 5379 .182713 .8883368 1 1
    13179  527  794 .182141 .8961982 1 1
    13179  340 9595  .18201 .9040596 1 1
    13179  372 6825 .175178  .911921 1 1
    13179   21 2388  .16498 .9197823 1 1
    13179  603  527 .158701 .9276438 1 1
    13179  258 5211 .152936 .9355052 1 1
    13179  991 9803 .147164 .9433665 1 1
    13179   17  684 .167134 .8647527 2 .
    13179 1249 2535 .148034 .9040596 2 .
    13179   15 9344 .154372  .911921 2 .
    13179  146 8999 .143395 .9276438 2 .
    13179   80 3754 .135121 .9433665 2 .
    13179    0 4270 .171407 .8254457 2 .
    13179   17 2678 .154186 .8647527 2 .
    13179    0 7022 .152285 .8804754 2 .
    13179    0 8496 .145116 .8883368 2 .
    13179  100 5630 .142764 .8961982 2 .
    13179   25 6954 .138713 .9040596 2 .
    13179   13 1072 .135937  .911921 2 .
    13179   12 3845 .134019 .9197823 2 .
    13179   57 8082   .1312 .9276438 2 .
    13179  515 1967 .127598 .9355052 2 .
    13179  838 7632 .128123 .9433665 2 .
    13179    0  931 .162546 .8175843 3 .
    13179    5 1651 .160583 .8254457 3 .
    13179    0  584  .14889 .8647527 3 .
    13179    0 2328 .145014 .8804754 3 .
    13179    5 3479 .138267 .9040596 3 .
    13179    1 1944 .136023  .911921 3 .
    13179    0 1856 .133794 .9197823 3 .
    13179 1107 4192 .132004 .9276438 3 .
    13179  300  335 .129636 .9355052 3 .
    13179 1224 9701 .126738 .9433665 3 .
    13179   36 8304 .140659 .9040596 3 .
    13179  237 5980 .133874 .9433665 3 .
    13179    0 1547 .158981 .8254457 4 .
    13179    0 3408 .151112 .8647527 4 .
    13179    1 1818 .142769 .9040596 4 .
    13179    4 3151 .134264 .9433665 4 .
    13179    0   21 .159274 .8254457 4 .
    13179    0  784 .158516 .8647527 4 .
    13179    0    1 .150619 .9040596 4 .
    13179    5  228 .142159 .9433665 4 .
    end
    format %td date
    label values T m
    label def m 1 "M1", modify
    label def m 2 "M2", modify
    label def m 3 "M3", modify
    label def m 4 "M4", modify
    My goal is to keep ONE wd observation per date in my dataset, based on the following filters:

    (1) highest oi, and to make sure that I end up with only 1 wd observation on each date I add filter (2)
    (2) keep observation whose mon is closest to 0.95

    My code to apply (1) and (2) are as follows:

    Code:
    // Main Filter: Keep Highest Open Interest
    sort date
    by date: egen max_oi = max(oi)
    keep if oi == max_oi
    
    // Additional Filter To Keep Single Observation Per Date
    // Choose Contract Closest to 0.95
    gen diff = abs(mon - 0.95)
    sort date
    by date: egen min_diff = min(diff)
    keep if diff==min_diff
    
    keep date wd
    I have a variable T that actually seperates my data into different sub-samples. As you will note there are four sub-samples, i.e. M1-M4. I also have an additional sub-sample which is Drop60.

    My question is how do I iterate over ther different sub-samples to end up with one single observation of wd, after applying the filters in (1) and (2).

    The end goal would be columns that record the wd values in the different m groups (M1_wd, M2_wd..., M4_wd) and an additional column for Drop60_wd, after applying the above filters.

    I would highly appreciate your help with the above.

    I am worried if I duplicate my files to re-work with different m sub-samples I might end up with making mistakes in my coding. A loop would help me a lot minimise the risk of making mistakes.

    Thanks you in advance.

    Regards
    Parvesh

  • #2
    I don't totally understand what you want, but I think this example may start you in a useful direction.
    Code:
    // Main Filter: Keep Highest Open Interest
    sort T date
    by T date: egen max_oi = max(oi)
    keep if oi == max_oi
    
    // Additional Filter To Keep Single Observation Per Date
    // Choose Contract Closest to 0.95
    gen diff = abs(mon - 0.95)
    sort T date
    by T date: egen min_diff = min(diff)
    
    forvalues t=1/4 {
        generate M`t'_wd = T==`t' & diff==min_diff 
    }
    
    list T date wd M*_wd
    Code:
    . list T date wd M*_wd
    
         +----------------------------------------------------------+
         |  T        date        wd   M1_wd   M2_wd   M3_wd   M4_wd |
         |----------------------------------------------------------|
      1. | M1   31jan1996   .147164       1       0       0       0 |
      2. | M2   31jan1996   .154372       0       1       0       0 |
      3. | M3   31jan1996   .126738       0       0       1       0 |
      4. | M4   31jan1996   .151112       0       0       0       1 |
         +----------------------------------------------------------+

    Comment


    • #3
      Dear William Lisowski

      Thank you for replying to my query.

      The example that you provided is exactly what I was looking for.

      I knew that I had to use forvalues but didn't totally understand how to do so. With your code I finally have some insights on how to proceed with my analysis.

      Thanks again.

      Regards
      Parvesh

      Comment

      Working...
      X