  • PANEL DATA: Checking whether a current variable value is present in the universe of lags

    Hi all,

    I'm currently working with an unbalanced panel dataset of factories. For each observation of the variable "firmid" (the firm that owns the factory), I want to check whether that value appears anywhere in the universe of lagged firmids. I have already destrung and encoded the firmids so that I can run panel operations on them.

    What I am trying to do is create a new variable isOLD that tells me for each and every instance whether that firmid is present in the universe of lags.

    Code:
     
    FactoryID YEAR firmid
    1 2010 1
    2 2010 1
    3 2010 1
    4 2010 2
    5 2010 2
    1 2011 1
    2 2011 3
    3 2011 1
    4 2011 2
    5 2011 4

    The variable I am trying to create should look like this:

    Code:
     
    FactoryID YEAR firmid isOLD
    1 2010 1 .
    2 2010 1 .
    3 2010 1 .
    4 2010 2 .
    5 2010 2 .
    1 2011 1 1
    2 2011 3 0
    3 2011 1 1
    4 2011 2 1
    5 2011 4 0
    (Missing values in first period because they have no lags)

    Is there a simple way to do this (preferably without a for-loop)?

    My current path of reasoning (that uses a for-loop) goes like this:

    Code:
    gen lfirm = l.firmid
    gen isOLD = 0
    
    foreach val of firmid {
        foreach lval of lfirm {
            replace isOLD = 1 if val == lval
        }
    }
    But it seems the syntax is off, and even if it weren't, a nested for-loop is going to get very slow once I'm working with thousands of FactoryIDs.

    Thank you in advance for all the help!

  • #2
    Code:
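    * Stata names are case-sensitive: adjust year/factoryid to your data (e.g. YEAR, FactoryID)
    * Within each firm, ordered by year, every observation after the first is flagged 1
    by firmid (year), sort: gen isOLD = _n > 1
    * The first year of the panel has no lags, so set it to missing
    summ year, meanonly
    replace isOLD = . if year == `r(min)'
    
    * Restore the original display order
    sort year factoryid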
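
    One caveat on the code above: _n > 1 counts observations within a firm, so a firm that first appears in a later year with two or more factories would have its second factory flagged as old. Below is a minimal sketch of a variant that compares years instead. It assumes lowercase variable names (factoryid, year, firmid) and that "old" means the firm appears in any earlier year of the panel; both are assumptions, not something stated in post #1.

    ```stata
    * Example data from post #1 (lowercase names are an assumption)
    clear
    input factoryid year firmid
    1 2010 1
    2 2010 1
    3 2010 1
    4 2010 2
    5 2010 2
    1 2011 1
    2 2011 3
    3 2011 1
    4 2011 2
    5 2011 4
    end

    * Flag an observation when its firm already appears in an earlier year:
    * year[1] is the firm's first year once the data are sorted within firmid
    bysort firmid (year): gen byte isOLD = year > year[1]

    * The first panel year has no lags, so set it to missing
    summarize year, meanonly
    replace isOLD = . if year == r(min)

    sort year factoryid
    list, sepby(year)
    ```

    On the example data this reproduces the isOLD column requested in post #1. If "old" should instead mean "present in the immediately preceding year only", the comparison would need to change.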

    • #3
      Clyde, I owe you one. Thanks for the succinct help.
