
  • fillin or expand?

    Dear All, This question is related to a previous post here. The expanded data are now
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10 P float(t x w z)
    "A" 0          2  2 3 
    "B" 1  10.333333  8 4
    "C" 0          1  1 5
    "C" 1         -5  6 9
    end
    The desired outcome is
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10 P float(t x w z) 
    "A" 0         2 2 3 
    "A" 1         0 2 3 
    "B" 0         0 8 4 
    "B" 1 10.333333 8 4 
    "C" 0         1 1 5 
    "C" 1        -5 6 9 
    end
    There are two observations when P="C" (t=0 and t=1). This is already what I need, so nothing has to be done for this group.

    However, when P="A", only one observation is available (t=0). I need a second observation that is identical to the t=0 observation, including the values of MANY other variables (e.g., w and z), except that t=1 and x=0.

    Similarly, when P="B", only one observation is available (t=1). I need a second observation that is identical to the t=1 observation, including the values of MANY other variables (e.g., w and z), except that t=0 and x=0.

    Any suggestions? Thanks.

    Ho-Chuan (River) Huang
    Stata 17.0, MP(4)

  • #2
    Code:
    * Create a numeric ID, as tsset does not accept string variables
    encode P, gen(np)
    drop P
    * Declare the panel: panel variable first, then time variable
    tsset np t
    
    * Use a full tsfill
    tsfill, full
    
    * Fill up the missing x with 0
    replace x = 0 if x == .
    
    * Fill up the missing w and z with the other non-missing value
    * (numeric missing sorts last, so the first value within np is non-missing)
    foreach v of varlist w-z {
        bysort np (`v'): replace `v' = `v'[1] if missing(`v')
    }
    
    * Recover that P, if you need that
    decode np, gen(P)
    
    * Show the data
    sort np t
    list, sep(0)
    Results:
    Code:
         +-------------------------------+
         | t          x   w   z   np   P |
         |-------------------------------|
      1. | 0          2   2   3    A   A |
      2. | 1          0   2   3    A   A |
      3. | 0          0   8   4    B   B |
      4. | 1   10.33333   8   4    B   B |
      5. | 0          1   1   5    C   C |
      6. | 1         -5   6   9    C   C |
         +-------------------------------+
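
The thread title also mentions -fillin-. As a minimal sketch (not part of the reply above), -fillin- can create the missing P-by-t combinations directly from the string variable P, so the -encode- and -tsset- steps are not needed. The sketch assumes every P should appear once for each value of t, and that every variable other than P, t, and x should be copied from the group's existing observation. The loop macro name v is only illustrative.

```stata
clear
input str10 P float(t x w z)
"A" 0          2  2 3
"B" 1  10.333333  8 4
"C" 0          1  1 5
"C" 1         -5  6 9
end

* Add the missing P-by-t combinations; new rows are flagged with _fillin == 1
fillin P t

* New rows get x = 0
replace x = 0 if _fillin

* Copy every remaining variable from the group's original row.
* Sorting on _fillin puts an original row (_fillin == 0) first within P.
ds P t x _fillin, not
foreach v of varlist `r(varlist)' {
    bysort P (_fillin): replace `v' = `v'[1] if _fillin
}

drop _fillin
sort P t
list, sep(0)
```

Because the condition is `if _fillin` rather than a test for missing values, the same loop works for numeric and string variables alike. With more than two periods per group, which original row supplies the copied values would be arbitrary, so this is only suitable when the carried variables are constant within P or each incomplete group has a single original row.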



    • #3
      Dear Ken, Thanks for the suggestion. Suppose I also have an additional string variable y:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str10 P float(t x w z) str4 y
      "A" 0          2  2 3  "a"
      "B" 1  10.333333  8 4  "b"
      "C" 0          1  1 5  "c"
      "C" 1         -5  6 9  "c"
      end
      How can I fill in y as well? Thanks.
      Ho-Chuan (River) Huang
      Stata 17.0, MP(4)



      • #4
        In reply to River Huang (#3): for the string variable y, add:
        Code:
        * The empty string sorts first, so the last value within np is non-empty
        foreach v of varlist y {
            bysort np (`v'): replace `v' = `v'[_N] if `v' == ""
        }
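
As a hedged sketch of how the numeric and string loops could be combined, the block below checks each variable's storage type with -confirm string variable- and applies the matching rule. It reuses the example data from #3 and the -tsfill- steps from #2. The variable list `w z y` and the macro name v are assumptions for illustration.

```stata
clear
input str10 P float(t x w z) str4 y
"A" 0          2  2 3  "a"
"B" 1  10.333333  8 4  "b"
"C" 0          1  1 5  "c"
"C" 1         -5  6 9  "c"
end

encode P, gen(np)
drop P
tsset np t
tsfill, full
replace x = 0 if missing(x)

* One loop for both storage types
foreach v of varlist w z y {
    capture confirm string variable `v'
    if _rc == 0 {
        * "" sorts first among strings, so the last value is non-empty
        bysort np (`v'): replace `v' = `v'[_N] if `v' == ""
    }
    else {
        * numeric missing sorts last, so the first value is non-missing
        bysort np (`v'): replace `v' = `v'[1] if missing(`v')
    }
}

decode np, gen(P)
sort np t
list, sep(0)
```

The two branches differ only because missing values sort to opposite ends: numeric missing is larger than any number, while the empty string is smaller than any non-empty string.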



        • #5
          Ken has provided a good solution. Here is an alternative using -expand- that relies on -t- being a regular sequence. With -expand-, the variables you simply want to carry along are copied automatically, so only -t- has to be recreated.

          Code:
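          * Copies needed per group: 2 if P has one row, 1 if it already has two
          bys P : gen byte nt = 2 - _N + 1
          expand nt
          sort P t
          drop nt
          * Renumber t as 0, 1 within each group
          bys P (t) : replace t = _n - 1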
          Result

          Code:
               +------------------------------+
               | P   t          x   w   z   y |
               |------------------------------|
            1. | A   0          2   2   3   a |
            2. | A   1          2   2   3   a |
            3. | B   0   10.33333   8   4   b |
            4. | B   1   10.33333   8   4   b |
            5. | C   0          1   1   5   c |
            6. | C   1         -5   6   9   c |
               +------------------------------+



          • #6
            Dear Ken, Thank you for this extra suggestion.
            Ho-Chuan (River) Huang
            Stata 17.0, MP(4)



            • #7
              Dear Leonardo, Thanks for this helpful suggestion. Some of the values of x are not correct (the added observations should have x=0), but after some modifications it works.
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str10 P float(t x w z) str4 y
              "A" 0          2  2 3  "a"
              "B" 1  10.333333  8 4  "b"
              "C" 0          1  1 5  "c"
              "C" 1         -5  6 9  "c"
              end
              
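              * Number of existing observations per P (1 or 2)
              bys P: egen n = count(t)
              * Duplicate only the groups with a single observation
              expand 3-n
              * Target time index 0, 1 within each P
              bys P (t): gen n1 = _n-1
              * The row whose original t differs from its new index is the added one
              replace x = 0 if t != n1
              drop t n
              ren n1 t
              order P t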
              Last edited by River Huang; 14 Sep 2021, 17:54.
              Ho-Chuan (River) Huang
              Stata 17.0, MP(4)



              • #8
                Glad you got it working. I forgot about the conditions you had placed on -x- and focused on the expansion instead.
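
For later readers, a minimal sketch of the same -expand- idea using its generate() option, which flags the duplicated rows so that the conditions on -t- and -x- can be applied to them directly. It assumes t takes only the values 0 and 1, as in the example data. The names n and copy are illustrative.

```stata
clear
input str10 P float(t x w z) str4 y
"A" 0          2  2 3  "a"
"B" 1  10.333333  8 4  "b"
"C" 0          1  1 5  "c"
"C" 1         -5  6 9  "c"
end

* Number of rows each P currently has (1 or 2)
bysort P: gen byte n = _N

* Duplicate only the rows of single-row groups; copy == 1 marks the duplicate
expand 3 - n, generate(copy)

* The duplicate takes the other period and x = 0; all else is carried along
replace t = 1 - t if copy
replace x = 0 if copy

drop n copy
sort P t
list, sep(0)
```

Since the duplicate is identified by the flag rather than by its position after sorting, no comparison between the old and new time index is needed.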
