  • using rangestat to replace . with max values for each id

    I have this dataset (the -dataex- example is at the end of this post).

    I produced -next- using the following code, with the aim that each observation for a given id takes the value of -state- from the next row of that id.


    Code:
    gen long obs_no = _n
    by id (obs_no), sort: gen next = state[_n+1]
    
    
    * replace the missing values with the max state
    rangestat (max) state2= next, interval(year 2001 2020) by(id)

    However, as you can see, for the last observation of each id I obtain a missing value. How can I instead tell Stata that those missing values should take the -state- value from the last row of that id?

    That would give:
    id = 1 at 2005: next = 3
    id = 4 at 2015: next = 3
    id = 9 at 2010: next = 3
    id = 2 at 2001: next = 1


    I thought of using rangestat, but perhaps I've got it wrong. Can you please redirect me?


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id year state) long obs_no float next
    1 2001 1  1 2
    1 2004 2  2 3
    1 2005 3  3 .
    2 2007 1  4 3
    2 2020 3  5 1
    2 2001 1  6 .
    4 2009 1  7 3
    4 2015 3  8 .
    9 2006 1  9 3
    9 2010 3 10 .
    end
    format %ty year


  • #2
    No, that's not a job for -rangestat-.
    Code:
    by id (obs_no), sort: replace next = state if _n == _N
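
    As a rough check of the line above, here is a minimal sketch that rebuilds -next- from the -dataex- data in #1 and then applies the fix. It assumes that obs_no records the intended row order within each id; the -assert- lines simply restate the four values requested in #1.

```stata
* Data from post #1, without the precomputed -next-
clear
input float(id year state) long obs_no
1 2001 1  1
1 2004 2  2
1 2005 3  3
2 2007 1  4
2 2020 3  5
2 2001 1  6
4 2009 1  7
4 2015 3  8
9 2006 1  9
9 2010 3 10
end

* state of the next row within each id, in the original row order
by id (obs_no), sort: gen next = state[_n+1]

* the last row of each id has no next row, so it takes its own state
by id (obs_no): replace next = state if _n == _N

* the values asked for in post #1
assert next == 3 if id == 1 & year == 2005
assert next == 1 if id == 2 & year == 2001
assert next == 3 if id == 4 & year == 2015
assert next == 3 if id == 9 & year == 2010
assert !missing(next)

list, sepby(id) noobs
```

    The subscript state[_n+1] is missing in the last row of each by-group because there is no following row, which is why only the rows with _n == _N need replacing.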



    • #3
      I agree with Clyde Schechter, but since it's been mentioned, note that the option call to rangestat (from SSC, as should be explained)

      Code:
       
       interval(year 2001 2020) 
      means use the closed interval (in more mathematical notation) [year + 2001, year + 2020], observation by observation. That's unlikely to yield anything with your data beyond missing values. It does not mean (in Stata terms)
      Code:
       inrange(year, 2001, 2020)
      which is what I am guessing you have in mind.
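
      To make the distinction concrete, here is a hedged sketch of one way to get fixed bounds in rangestat: put the bounds in variables, since variable bounds are used as they stand rather than added to year. The names low, high and state2 are only illustrative, and this shows the interval semantics rather than an answer to the question in #1, for which the -replace- in #2 is the right tool.

```stata
* rangestat is community-contributed; install once with: ssc install rangestat

* Data from post #1
clear
input float(id year state) long obs_no float next
1 2001 1  1 2
1 2004 2  2 3
1 2005 3  3 .
2 2007 1  4 3
2 2020 3  5 1
2 2001 1  6 .
4 2009 1  7 3
4 2015 3  8 .
9 2006 1  9 3
9 2010 3 10 .
end

* Numeric bounds are offsets from each observation's own year:
* for year == 2005, interval(year 2001 2020) covers [4006, 4025],
* and no observation in these data falls in that window.

* Illustrative bound variables (assumed names) holding fixed limits
gen low  = 2001
gen high = 2020

* max of -next- over observations of the same id with year in [low, high]
rangestat (max) state2 = next, interval(year low high) by(id)

list id year state next state2, sepby(id) noobs
```

      Because every year in these data lies between 2001 and 2020, state2 here is just a per-id summary of -next-, which is not the same thing as filling the last row of each id with its own -state-.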
