  • Create pre-post timing variable based on max sequence

    Hi all, I have monthly panel data on children that record a "status" for each child's father (dadstatus). The status is often missing but, when valid, takes values 0-5; it almost always starts at 0 and escalates in ensuing months in some order.

    What I've already done is create a months pre/post variable relative to the first month with dadstatus == 5.

    What I want to create next are analogous variables for the other statuses, but only for children whose sequence never reaches dadstatus == 5. So in the data below, if I wanted to create a month_since_first2dad variable, its t=0 would be month 7 for idchild 1, because that child never reaches dadstatus == 5. The variable would be missing for both idchild 2 and 3 because they eventually reach dadstatus == 5. What is the most efficient way to do this?

    Code:
    clear
    input idchild month dadstatus
    1 1 .
    1 2 .
    1 3 .
    1 4 0
    1 5 1
    1 6 1
    1 7 2
    1 8 .
    1 9 .
    1 10 .
    1 11 .
    2 1 .
    2 2 .
    2 3 0
    2 4 2
    2 5 4
    2 6 5
    2 7 5
    2 8 5
    2 9 5
    2 10 5
    2 11 .
    3 1 .
    3 2 .
    3 3 0
    3 4 1
    3 5 3
    3 6 .
    3 7 .
    3 8 1
    3 9 2
    3 10 5
    3 11 5
    end
    
    gen status5 = 1 if dadstatus == 5
    bysort idchild: egen first5dad = min(month) if status5 == 1
    replace first5dad = 0 if first5dad != month & first5dad != . 
    replace first5dad = 1 if first5dad == month
    
    
    gen first5dad_month0 = month if first5dad == 1 
    by idchild: egen first5dad_month = max(first5dad_month0)
    
    gen month_since_first5dad = 0 if first5dad == 0
    replace month_since_first5dad = month-first5dad_month

  • #2
    I'm not building on the code you've shown because it looks overly complicated to me, and, at least at first glance, I don't think it produces correct results.

    The following, if I understand the goal correctly, will do it:
    Code:
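    * First month in which each child's father has status 5 (missing if never)
    by idchild (month), sort: egen first_5_dad_month = min(cond(dadstatus == 5, month, .))
    gen month_since_first_5_dad = month - first_5_dad_month
    
    * Same for statuses 4 down to 0, but only for children who never reach status 5
    forvalues i = 4(-1)0 {
        by idchild (month), sort: egen first_`i'_dad_month = ///
            min(cond(dadstatus == `i', month, .)) if missing(month_since_first_5_dad)
        gen month_since_first_`i'_dad = month - first_`i'_dad_month
    }
    
    * The helper variables are no longer needed
    drop first_*_dad_month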
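
    As a quick sanity check, the results can be compared with the expectations stated in #1. This is only a sketch: it assumes the code above has just been run on the example data from #1, and the hard-coded months (7, 6 and 10) are read off that example rather than being general.

    ```stata
    * idchild 1 never reaches status 5, so t=0 for status 2 should be month 7
    assert month_since_first_2_dad == month - 7 if idchild == 1

    * idchild 2 and 3 eventually reach status 5, so the status-2 variable stays missing
    assert missing(month_since_first_2_dad) if inlist(idchild, 2, 3)

    * The status-5 variable is defined only for children who reach status 5
    assert month_since_first_5_dad == month - 6 if idchild == 2
    assert month_since_first_5_dad == month - 10 if idchild == 3
    assert missing(month_since_first_5_dad) if idchild == 1

    * Eyeball the result for the child who never reaches status 5
    list idchild month dadstatus month_since_first_2_dad if idchild == 1, sepby(idchild)
    ```

    If any assert fails, Stata stops with an error, which would suggest that the data or the goal differ from what is assumed here.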
