  • Fill in missing values only if the nonmissing boundary observations are the same

    Dear Statalisters,
    Any thoughts on this problem? I want to fill in the missing observations in a dataset only if the nonmissing observations on either side of each run of missing values are the same. Consider the example dataset below.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(sn x)
    1 40
    2  .
    3  .
    4 40
    5  .
    6  .
    7 35
    end

    I want to fill in the missing observations so that 40 is carried forward only if the next nonmissing observation is also 40. The value 40 is not carried forward into the next run of missing values (observations 5 and 6) because the next nonmissing value, 35, is not 40. The desired dataset is shown below.

    1 40
    2 40
    3 40
    4 40
    5 .
    6 .
    7 35

    Thanks in advance for any advice.

    Regards
    Ram
    Last edited by ram singh; 17 Jun 2018, 19:51.

  • #2
    Nick Cox has given comprehensive advice on this issue at https://www.statalist.org/forums/for...forward-values. I have just a small contribution, which directly solves your example.
    Code:
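    * Interpolate x linearly on sn; equal boundary values give a flat run
    ipolate x sn, gen(_x)
    sort sn
    * Fill only where the interpolated value repeats the previous one
    replace x = _x if missing(x) & _x == _x[_n-1]
    drop _x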


    • #3
      Thanks, Romalpa, much appreciated.


      • #4
        Note that Romalpa's test of acceptability in #2 is more general than the test in the 2014 thread cited, which rejected any interpolated value that was not an integer. There, the context was observed integers increasing monotonically. In this example it seems possible that integers could arise spuriously. That is, interpolating in 40 . 42 would yield 41, which is an integer but not a value that should be filled in.
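
        To make the 40 . 42 case concrete, here is a small sketch that applies the test from #2 to a three-observation dataset invented purely for illustration. The interpolated value in observation 2 is 41, which an integer-only check would accept, but it differs from the interpolated value before it, so the test from #2 leaves the observation missing.
        Code:
        * Hypothetical data: a gap bounded by two different values
        clear
        input float(sn x)
        1 40
        2  .
        3 42
        end

        * Linear interpolation puts 41 in the gap
        ipolate x sn, gen(_x)
        list sn x _x

        * 41 is not equal to the previous interpolated value (40), so nothing is filled
        sort sn
        replace x = _x if missing(x) & _x == _x[_n-1]
        assert missing(x) in 2
        drop _x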
        Last edited by Nick Cox; 18 Jun 2018, 00:00.


        • #5
          Thank you, Nick, for your note. That is exactly the point of my contribution: the code should work not only for integers but for any numeric values (which might otherwise be filled in spuriously).

          Anyhow, your "backward and forward solution" is still safer and more general once the type of the relevant variable is taken into account.
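
          The "backward and forward solution" is only mentioned in this thread, not shown. As a minimal sketch of the idea, and not necessarily the code from the cited thread, one could carry the last nonmissing value forward and the next nonmissing value backward, then fill only where the two agree. The variable names fwd and bwd are made up for illustration, and sn is assumed to identify observations uniquely.
          Code:
          * Example data from #1
          clear
          input float(sn x)
          1 40
          2  .
          3  .
          4 40
          5  .
          6  .
          7 35
          end

          * Carry the last nonmissing value forward
          sort sn
          generate fwd = x
          replace fwd = fwd[_n-1] if missing(fwd)

          * Carry the next nonmissing value backward
          gsort -sn
          generate bwd = x
          replace bwd = bwd[_n-1] if missing(bwd)

          * Fill a gap only where both boundary values agree
          sort sn
          replace x = fwd if missing(x) & fwd == bwd
          drop fwd bwd

          * Check against the result requested in #1
          assert x == 40 in 1/4
          assert missing(x) in 5/6
          assert x == 35 in 7

          Because this copies existing values instead of computing new ones, no value that was not already observed can appear in x. The same steps should in principle carry over to string variables, where interpolation does not apply.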
          Last edited by Romalpa Akzo; 18 Jun 2018, 01:19.
