
  • identify entrants and exits - string variables

    I have a panel in which the set of names changes over time, and I need to identify the names that enter and the names that exit in each period. Here is an example dataset.

    Code:
    clear
    input timeid str2 name1 str2 name2 str2 name3 str2 name4 str2 name5 str2 name6 
    1 "a" "b" "c" "d" "e"
    2 "c" "b" "d" "e" "f"
    3 "a" "d" "e" "b" "c" 
    4 "c" "b" "a" "f" "g" "h"
    end
    tsset timeid
    I would like two outputs.

    The first would have entrants (one per row) and look like:
    Code:
    clear
    input timeid str2 entrant
    2 "f"
    3 "a"
    4 "g"
    4 "h"
    end
    The second would have exits (one per row) and look like:
    Code:
    clear
    input timeid str2 exit
    2 "a"
    3 "f"
    4 "e"
    end
    My difficulty is that a given name does not stay in the same variable (name1, name2, ...) over time. I suppose you need to convert the original data into a name-time panel (one name and one timeid per row), but I'm not sure how to do that simply. (The actual dataset has thousands of names.)

    Thanks in advance for the help.

  • #2
    I'm having difficulty reconciling what you say you want in words with what you show as your desired end results. For example, looking at the original data, "f" enters at timeid 2 and exits at timeid 3, as you note. But then "f" also enters again at timeid 4. Similarly, it seems to me that "d" exits at timeid 4, but you do not show that.

    On the assumption that your examples are incorrect and that you meant literally what you said in words, the following code will work:
    Code:
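    * One row per timeid-name pair; _j holds the numeric suffix of name1-name6
    reshape long name, i(timeid)
    drop if missing(name)
    drop _j
    * Add a row for every name-timeid pair not in the data; _fillin == 1 marks those rows
    fillin name timeid

    * Entrant: present now, absent at the previous timeid
    by name (timeid), sort: gen byte entrant = _fillin == 0 & _fillin[_n-1] == 1
    * Exiter: absent now, present at the previous timeid
    by name (timeid): gen byte exiter = _fillin == 1 & _fillin[_n-1] == 0

    * Copy each set into its own frame and list it (frames require Stata 16 or later)
    frame put timeid name if entrant, into(entrances)
    frame entrances {
        sort timeid name
        list, noobs clean
    }

    frame put timeid name if exiter, into(exits)
    frame exits {
        sort timeid name
        list, noobs clean
    }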
    If your examples are correct, then I do not understand exactly what you want, and would appreciate a more detailed explanation in words of how you arrived at those results.
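
    For anyone on a Stata version without frames, here is a minimal sketch of the same listing step without them. It assumes the variables entrant and exiter have already been created by the code above, and note that it re-sorts the data in memory.

    ```stata
    * Sketch only: assumes entrant and exiter already exist (created as in the code above)
    sort timeid name
    * Names present now but absent at the previous timeid
    list timeid name if entrant, noobs clean
    * Names absent now but present at the previous timeid
    list timeid name if exiter, noobs clean
    ```

    With the example data in #1, the entrants should come out as "f" at timeid 2, "a" at 3, and "f", "g", and "h" at 4; the exits as "a" at 2, "f" at 3, and "d" and "e" at 4.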


    • #3
      Clyde Schechter

      You are correct. "f" enters at 4 (as well as 2). Also, I failed to identify that "d" exits at 4. Sorry for the omissions and confusion.

      Your first couple of lines of code are very slick.

      Thank you for the assistance.
