  • Running a loop until 0 real changes made

    Hi,

    Here is a sample of my dataset:

    Code:
    input long(idpers idhous)
    4101 41
    4101 41
    4101 41
    4101  .
    4101  .
    4101  .
    4101  .
    4101  .
    4101  .
    4101  .
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102 41
    4102  .
    4102  .
    4102  .
    4102  .
    4102  .
    4102  .
    4102  .
    4103 41
    4103 41
    4103 41
    end
    My issue is that the household id (idhous) is missing in some observations for a given individual id (idpers). For this reason, I ran this:

    Code:
    bys idpers: replace idhous = idhous[_n-1] if missing(idhous)
    and this:

    Code:
    bys idpers: replace idhous = idhous[_n+1] if missing(idhous)
    However, for some reason, not all of the missing observations are replaced by the following or previous values within each value of idpers, which is what I want. But I noticed that when I run the previous code many times, it eventually works.

    That's why I would like to use a loop to do this faster, and to run it until Stata reports "0 real changes made".

    If there's any better option, I'd be glad to learn it as well.

    Thank you.

    Zsolt

  • #2
    You don't need a loop here.

    First check for minimum and maximum values

    Code:
    egen min = min(idhous), by(idpers) 
    
    egen max = max(idhous), by(idpers)
    Now what you want (presumably) is to replace missings if and only if those non-missing min and max are the same, and

    Code:
    replace idhous = min if min==max & missing(idhous)
    is one way to do it.

    While you are there, check for inconsistency

    Code:
    list idhous if min != max
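    See https://www.stata.com/support/faqs/d...issing-values/ for why you thought you needed a loop. Copying down gives you a cascade, because replace works through the observations in order and each filled value is available to the next observation, but copying up does not.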
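
    The cascade point can be illustrated with a minimal sketch. It assumes that the current observation order is the meaningful order within idpers; obsno and negobs are hypothetical helper variables, not part of the original data. Unlike the min/max approach above, it does not check that idhous is consistent within idpers, so it should only be used after that check.

    ```stata
    clear
    input long(idpers idhous)
    4101  .
    4101  .
    4101 41
    4102 41
    4102  .
    4102  .
    end

    * Assumption: the current observation order is the order that matters
    gen long obsno  = _n
    gen long negobs = -obsno

    * Copy down: one pass fills a whole run of missings, because each
    * filled value is already in place when the next observation is reached
    bysort idpers (obsno): replace idhous = idhous[_n-1] if missing(idhous)

    * Copy up: reverse the order within idpers, then copy down again
    bysort idpers (negobs): replace idhous = idhous[_n-1] if missing(idhous)

    * Restore the original order
    sort idpers obsno
    drop negobs
    list, sepby(idpers)
    ```

    Reversing the order turns copying up into copying down, so neither direction needs a loop.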



    • #3
      see #2



      • #4
        Thank you for your answer Nick.

        I ran your code and it works fine.

        I then checked for consistency as you suggested and it seems that there are household ids that vary within idpers (where min != max). Obviously, for these observations, the code did not replace missing values.

        I am then wondering what should I do with those observations. Should I drop all values of idpers that I cannot link to a household or only for the years where this occurs?

        Also, if I may ask a last question, I would like to know how to replace missing values by incrementing them from the previous ones.

        To make it a bit more clear:

        Code:
        idpers      age
        4101         45
        4101         46
        4101         47
        4101          .
        4101          .
        4101          .
        4101          .
        I would like to fill missing values for age by(idpers).

        Thank you again for your help.




        • #5
          Two questions:

          what should I do with those observations. Should I drop all values of idpers that I cannot link to a household or only for the years where this occurs?
          That has to be your call. It depends entirely on how much you need that variable. Either way, an indicator

          Code:
          gen goodid = min == max
          will give you a handle you can use to include or exclude observations.
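
          A minimal sketch of how such an indicator might be used, on made-up data. One detail worth checking: if idhous is missing in every observation for a person, min and max are both missing, and missing == missing is true in Stata, so goodid is 1 there too. The variables usable and first are hypothetical names introduced only for this illustration.

          ```stata
          clear
          input long(idpers idhous)
          4101 41
          4101  .
          4102 41
          4102 42
          4102  .
          4103  .
          end

          egen min = min(idhous), by(idpers)
          egen max = max(idhous), by(idpers)
          gen goodid = min == max

          * goodid is also 1 when idhous is missing throughout, so
          * flag separately the persons who have a usable household id
          gen byte usable = goodid & !missing(min)

          * Count persons, not observations, in each category
          egen byte first = tag(idpers)
          tabulate goodid usable if first
          ```

          Whether to drop, keep, or treat separately the persons with goodid equal to 0 remains a substantive decision about the data.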

          Also, if I may ask a last question, I would like to know how to replace missing values by incrementing them from the previous ones.
          See the link given in #2 which does address this question too. See Section 7.

          Also, if the implication is of annual surveys or records at the same date in each year, so that age should increase by 1 each year, then that should yield to the ipolate command. Again, real data are often much messier than they should be, so checking of input and output is in order.
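
          Both routes can be sketched as follows. This is an illustration under assumptions, not a tested recipe for the real data: the example in #4 shows no time variable, so year here is hypothetical, and the sketch assumes exactly one observation per person per year with no gaps. age_inc and age_ip are made-up names.

          ```stata
          clear
          input idpers year age
          4101 2001 45
          4101 2002 46
          4101 2003 47
          4101 2004  .
          4101 2005  .
          4101 2006  .
          4101 2007  .
          end

          * Route 1: previous value plus 1; this cascades just like copying down
          gen age_inc = age
          bysort idpers (year): replace age_inc = age_inc[_n-1] + 1 if missing(age_inc)

          * Route 2: linear interpolation of age on year; the epolate option
          * also extrapolates beyond the last observed age
          by idpers: ipolate age year, generate(age_ip) epolate

          list, sepby(idpers)
          ```

          Route 1 is only right if consecutive observations are exactly one year apart; route 2 uses the year values themselves, so it should cope better with skipped years.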
          Last edited by Nick Cox; 08 Dec 2021, 08:35.
