Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • counting non-missing observation across a variable in a range of date.

    Hi,
    I have a data with 200,000 observations (identified by ISIN number). The data is in time series format and contains the observations related to 1200 companies (identified by ISIN number). In the example below the observations related to two of those companies are shown.
    The first column shows the companies id (ININ number), and the second column presents the Ask price (Ask) for each company in a range of specific date that the Ask price data were available and missing value in the same column shows that the data was not available on a specific date.
    I want to generate a new variable (NUM) which indicates the number of days that each company has the non-missing value in a range of date 01-01-2016 to 03-30-2016.

    I am not sure how to do this with "count" command. I would be grateful if you help in generating the mentioned new variable.


    input str12 ISIN double Ask float date1
    "AT00000AMAG3" 31.68 20457
    "AT00000AMAG3" 31 20458
    "" . 20459
    "AT00000AMAG3" 31 20460
    "AT00000AMAG3" 31 20461
    "" . 20462
    "" . 20463
    "AT00000AMAG3" 31 20464
    "AT00000AMAG3" 30.9 20465
    "AT00000AMAG3" 30.785 20466
    "AT00000AMAG3" 30.5 20467
    "AT00000AMAG3" 30.495 20468
    "" . 20469
    "" . 20470
    "AT00000AMAG3" 29.925 20471
    "AT00000AMAG3" 30 20472
    "AT00000AMAG3" 29.485 20473
    "AT00000AMAG3" 29.49 20474
    "AT00000AMAG3" 30.39 20475
    "" . 20476
    "" . 20477
    "AT00000AMAG3" 29.82 20478
    "AT00000AMAG3" 29.795 20479
    "AT00000AMAG3" 29.885 20480
    "AT00000AMAG3" 29.995 20481
    "AT00000AMAG3" 29.6 20482
    "" . 20483
    "" . 20484
    "AT00000AMAG3" 29.45 20485
    "AT00000AMAG3" 29.2 20486
    "AT00000AMAG3" 28.115 20487
    "AT00000AMAG3" 28.25 20488
    "AT00000AMAG3" 28.42 20489
    "" . 20490
    "" . 20491
    "AT00000AMAG3" 28.2 20492
    "AT00000AMAG3" 28 20493
    "AT00000AMAG3" 28 20494
    "AT00000AMAG3" 26.795 20495
    "AT00000AMAG3" 26.035 20496
    "" . 20497
    "" . 20498
    "AT00000AMAG3" 26.5 20499
    "AT00000AMAG3" 26.68 20500
    "AT00000AMAG3" 26.98 20501
    "AT00000AMAG3" 26.725 20502
    "AT00000AMAG3" 27.075 20503
    "" . 20504
    "" . 20505
    "AT00000AMAG3" 27.4 20506
    "AT00000AMAG3" 26.98 20507
    "AT00000AMAG3" 27.165 20508
    "AT00000AMAG3" 27.22 20509
    "AT00000AMAG3" 27.08 20510
    "" . 20511
    "" . 20512
    "AT00000AMAG3" 28.495 20513
    "AT00000AMAG3" 29.98 20514
    "AT00000AMAG3" 29.435 20515
    "AT00000AMAG3" 30.48 20516
    "AT00000AMAG3" 29.995 20517
    "" . 20518
    "" . 20519
    "AT00000AMAG3" 30.21 20520
    "AT00000AMAG3" 30.1 20521
    "AT00000AMAG3" 30.06 20522
    "AT00000AMAG3" 29.18 20523
    "AT00000AMAG3" 30.12 20524
    "" . 20525
    "" . 20526
    "AT00000AMAG3" 31.24 20527
    "AT00000AMAG3" 30.5 20528
    "AT00000AMAG3" 30.1 20529
    "AT00000AMAG3" 30.245 20530
    "AT00000AMAG3" 30.49 20531
    "" . 20532
    "" . 20533
    "AT00000AMAG3" 30.5 20534
    "AT00000AMAG3" 29.58 20535
    "AT00000AMAG3" 30.46 20536
    "AT00000AMAG3" 31.255 20537
    "" . 20538
    "" . 20539
    "" . 20540
    "" . 20541
    "AT00000AMAG3" 31.16 20542
    "AT00000AMAG3" 31.22 20543
    "AT00000AMAG3" 31 20544
    "AT00000VIE62" 21.3775 20457
    "AT00000VIE62" 21.68 20458
    "" . 20459
    "AT00000VIE62" 21.21 20460
    "AT00000VIE62" 20.79 20461
    "" . 20462
    "" . 20463
    "AT00000VIE62" 20.775 20464
    "AT00000VIE62" 20.47 20465
    "AT00000VIE62" 20.725 20466
    "AT00000VIE62" 20.31 20467
    "AT00000VIE62" 20.6525 20468
    end


  • #2
    Columns: always say variables in Stataland.

    count can't create variables. You could loop over multiple uses of count, but there are much better ways.

    Thanks for the data example, but please use CODE delimiters. Your data need surgery, because of the missing values on ISIN. The code here works fine for your example, but missing values at a beginning of a panel would be harder work to fix.

    Code:
    clear 
    input str12 ISIN double Ask float date1
    "AT00000AMAG3" 31.68 20457
    "AT00000AMAG3" 31 20458
    "" . 20459
    "AT00000AMAG3" 31 20460
    "AT00000AMAG3" 31 20461
    "" . 20462
    "" . 20463
    "AT00000AMAG3" 31 20464
    "AT00000AMAG3" 30.9 20465
    "AT00000AMAG3" 30.785 20466
    "AT00000AMAG3" 30.5 20467
    "AT00000AMAG3" 30.495 20468
    "" . 20469
    "" . 20470
    "AT00000AMAG3" 29.925 20471
    "AT00000AMAG3" 30 20472
    "AT00000AMAG3" 29.485 20473
    "AT00000AMAG3" 29.49 20474
    "AT00000AMAG3" 30.39 20475
    "" . 20476
    "" . 20477
    "AT00000AMAG3" 29.82 20478
    "AT00000AMAG3" 29.795 20479
    "AT00000AMAG3" 29.885 20480
    "AT00000AMAG3" 29.995 20481
    "AT00000AMAG3" 29.6 20482
    "" . 20483
    "" . 20484
    "AT00000AMAG3" 29.45 20485
    "AT00000AMAG3" 29.2 20486
    "AT00000AMAG3" 28.115 20487
    "AT00000AMAG3" 28.25 20488
    "AT00000AMAG3" 28.42 20489
    "" . 20490
    "" . 20491
    "AT00000AMAG3" 28.2 20492
    "AT00000AMAG3" 28 20493
    "AT00000AMAG3" 28 20494
    "AT00000AMAG3" 26.795 20495
    "AT00000AMAG3" 26.035 20496
    "" . 20497
    "" . 20498
    "AT00000AMAG3" 26.5 20499
    "AT00000AMAG3" 26.68 20500
    "AT00000AMAG3" 26.98 20501
    "AT00000AMAG3" 26.725 20502
    "AT00000AMAG3" 27.075 20503
    "" . 20504
    "" . 20505
    "AT00000AMAG3" 27.4 20506
    "AT00000AMAG3" 26.98 20507
    "AT00000AMAG3" 27.165 20508
    "AT00000AMAG3" 27.22 20509
    "AT00000AMAG3" 27.08 20510
    "" . 20511
    "" . 20512
    "AT00000AMAG3" 28.495 20513
    "AT00000AMAG3" 29.98 20514
    "AT00000AMAG3" 29.435 20515
    "AT00000AMAG3" 30.48 20516
    "AT00000AMAG3" 29.995 20517
    "" . 20518
    "" . 20519
    "AT00000AMAG3" 30.21 20520
    "AT00000AMAG3" 30.1 20521
    "AT00000AMAG3" 30.06 20522
    "AT00000AMAG3" 29.18 20523
    "AT00000AMAG3" 30.12 20524
    "" . 20525
    "" . 20526
    "AT00000AMAG3" 31.24 20527
    "AT00000AMAG3" 30.5 20528
    "AT00000AMAG3" 30.1 20529
    "AT00000AMAG3" 30.245 20530
    "AT00000AMAG3" 30.49 20531
    "" . 20532
    "" . 20533
    "AT00000AMAG3" 30.5 20534
    "AT00000AMAG3" 29.58 20535
    "AT00000AMAG3" 30.46 20536
    "AT00000AMAG3" 31.255 20537
    "" . 20538
    "" . 20539
    "" . 20540
    "" . 20541
    "AT00000AMAG3" 31.16 20542
    "AT00000AMAG3" 31.22 20543
    "AT00000AMAG3" 31 20544
    "AT00000VIE62" 21.3775 20457
    "AT00000VIE62" 21.68 20458
    "" . 20459
    "AT00000VIE62" 21.21 20460
    "AT00000VIE62" 20.79 20461
    "" . 20462
    "" . 20463
    "AT00000VIE62" 20.775 20464
    "AT00000VIE62" 20.47 20465
    "AT00000VIE62" 20.725 20466
    "AT00000VIE62" 20.31 20467
    "AT00000VIE62" 20.6525 20468
    end
    
    local first = mdy(1,1,2016) 
    local last = mdy(3,30,2016)
    replace ISIN = ISIN[_n-1] if missing(ISIN) & date1 == date1[_n-1] + 1 
    egen wanted = total(!missing(Ask) & inrange(date1, `first', `last')), by(ISIN) 
    bysort ISIN (date1) : gen count = _N 
    tabdisp ISIN, c(count wanted) 
    
    -------------------------------------
            ISIN |      count      wanted
    -------------+-----------------------
    AT00000AMAG3 |         88          60
    AT00000VIE62 |         12           9
    -------------------------------------
    .

    Comment


    • #3
      Thank you very much Nick! It works well now. With your help, it is doable now.
      I apologize for the inappropriate submission of the example.

      Comment

      Working...
      X