
  • Using rangestat to replace . with max values for each id

    I have this dataset (example at the end of this post).

    I produced -next- using the following code, with the aim that each observation for a given id takes the value of -state- from the next row of that id:

    gen long obs_no = _n
    by id (obs_no), sort: gen next = state[_n+1]
    * replace the missing with the max state
    rangestat (max) state2= next, interval(year 2001 2020) by(id)

    However, as you can see, for the last observation of each id I obtain a missing value. How can I instead tell Stata that, for those missing cases, -next- should take the -state- value from the last row of that id?

    That would be:
    id = 1 at 2005: next = 3
    id = 4 at 2015: next = 3
    id = 9 at 2010: next = 3
    id = 2 at 2001: next = 1

    I thought of using rangestat, but perhaps I've got it wrong. Can you please point me in the right direction?

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(id year state) long obs_no float next
    1 2001 1  1 2
    1 2004 2  2 3
    1 2005 3  3 .
    2 2007 1  4 3
    2 2020 3  5 1
    2 2001 1  6 .
    4 2009 1  7 3
    4 2015 3  8 .
    9 2006 1  9 3
    9 2010 3 10 .
    end
    format %ty year

  • #2
    No, that's not a job for -rangestat-.
    by id (obs_no), sort: replace next = state if _n == _N
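
    As a minimal sketch of how this plays out on the example data from #1 (the -input- block simply retypes those values, and the -list- at the end is only there for checking), the last row of each id ends up with its own -state-:

    ```stata
    clear
    input float(id year state)
    1 2001 1
    1 2004 2
    1 2005 3
    2 2007 1
    2 2020 3
    2 2001 1
    4 2009 1
    4 2015 3
    9 2006 1
    9 2010 3
    end

    * keep the original row order, as in #1
    gen long obs_no = _n

    * value of -state- in the next row of the same id (missing in the last row)
    by id (obs_no), sort: gen next = state[_n+1]

    * last row of each id: fall back on that row's own -state-
    by id (obs_no): replace next = state if _n == _N

    list id year state next, sepby(id)
    ```

    The sort key is (id, obs_no) and not (id, year), so "last row" here means last in the original order of the data. That is why id 2 picks up the value from its 2001 row, as requested in #1.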


  • #3
    I agree with Clyde Schechter, but since it has been mentioned, note that in the call to rangestat (from SSC, as should be explained) the option

     interval(year 2001 2020)
    means use the closed interval (in more mathematical notation) [year + 2001, year + 2020], observation by observation. That is unlikely to yield anything with your data beyond missing values. It does not mean (in Stata terms)
     inrange(year, 2001, 2020)
    which is what I am guessing you had in mind.
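
    To make the difference concrete, here is a small sketch, assuming the variables from #1 are in memory. The names -low-, -high- and -state2- are made up for illustration, and -egen- is swapped in as one way to express what was probably intended; it is not what the original -rangestat- call does.

    ```stata
    * bounds that interval(year 2001 2020) implies for each observation
    gen low  = year + 2001
    gen high = year + 2020
    list id year low high, sepby(id)

    * every year in the example lies between 2001 and 2020, so no observation
    * falls inside [low, high] and the statistic comes back missing

    * maximum of -next- within id, restricted to calendar years 2001 to 2020
    egen state2 = max(next) if inrange(year, 2001, 2020), by(id)
    ```

    For year 2001, for instance, the implied window is [4002, 4021], which no observation in the example can satisfy.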

