Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging one row with multipe rows in another dataset based on dates

    Hi All,

    I have the following two datasets. One has company CEO's. See below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str50 FullName int DateBecameCEO str10 DateLeftasCEO str6 AcquirorCUSIP
    "Scott D. Cook"          8857 "  4/1/1994" "461202"
    "David W. Grainger"      2922 "  1/1/1995" "384802"
    "Pradeep Sindhu, Ph.D." 13180 " 9/30/1996" "48203R"
    "Bruce A. Crawford"     10671 "12/31/1996" "681919"
    "Catherine L. Hughes"    7305 "  1/1/1997" "91705J"
    end
    format %tdnn/dd/CCYY DateBecameCEO
    The other datafile has acquisition data of US companies between 1995-2020. See below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int DateAnnounced str6 AcquirorCUSIP
    12784 "084670"
    12784 "806857"
    12784 "806857"
    12784 "872941"
    12784 "872941"
    end
    format %tdnn/dd/CCYY DateAnnounced
    Now I want to merge the files so that every row in the 2nd file shows the CEO that was at that time running the company. The problem is that the 1st file contain 2 variables which state the start date of a CEO and the end date of the CEO. Does anybody know how i can solve this problem.

    My intuition would be to create duplicates of each record of the CEO for each year he was on board. for example you would then get a file like this

    Scott D Cook 2011
    Scott D cook 2012
    Scot D cook 2013
    ....

    Afterwards i could then merge with a 1:m command.

    Anyone who know how to do this properly? Thanks in advance

    NOTE: I know the Acquirorcusip is still a string in the 2nd file (i know how to solve this).


  • #2
    This is a bit confusing. It's not clear to me whether the CEO's in the first data set are the CEO's of the acuiror or CEO's of the firm acquired. I'll assume the former. However, in that case your example data will produce no matches because none of the CUSIPs in the first data set appear in the second, nor vice versa. I'll just assume that in your full data, that problem doesn't arise.

    The first step is to make the DateLeftasCEO a numeric Stata internal format date variable. After that it's a one-liner
    Code:
    use ceo_data_set, clear
    
    //  MAKE DateLeftasCEO A NUMERIC STATA INTERNAL FORMAT DATE VARIABLE
    gen leave_date = daily(DateLeftasCEO, "MDY")
    format leave_date %tdnn/dd/CCYY
    drop DateLeftasCEO
    rename leave_date DateLeftasCEO
    
    //  JOIN THE TWO FILES
    rangejoin DateAnnounced DateBecameCEO DateLeftasCEO using acquisition_data_set, by(AcquirorCUSIP)
    -rangejoin- is written by Robert Picard and is available from SSC. To use it, you must also install -rangestat-, by Robert Picard, Nick Cox, and Roberto Ferrer, also available from SSC.

    Comment

    Working...
    X