
  • Replace with first group values

    I want to know how to replace the missing values of mk and ex in the other groups with the values from group 1, matching on mdate. These two variables are the same for all groups within a given month. Each group is identified by stock_id. An illustrative data example is appended.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(date mdate) byte stock_id float rt byte mk int ex
    22281 732 1  .21 20 101
    22312 733 1  .23 30 255
    22340 734 1  .01 40  62
    22371 735 1  .27 50   7
    22281 732 2 1.21  .   .
    22312 733 2 2.52  .   .
    22340 734 2 1.11  .   .
    22371 735 2  .78  .   .
    22281 732 3 3.55  .   .
    22312 733 3  .99  .   .
    22340 734 3 1.76  .   .
    22371 735 3  .29  .   .
    end
    format %td date
    format %tm mdate

  • #2
    Code:
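    foreach v of varlist mk ex {
        * Within each month, sort on `v' so non-missing values come first
        * (missing sorts last), then copy the first value to every observation.
        by mdate (`v'), sort: replace `v' = `v'[1]
    }

    * Return to panel order.
    sort stock_id mdate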

    Note: This solution is slightly more general than the stated problem. It does not require that the stock_id that has the non-missing values have the value stock_id = 1, nor even that it have the numerically smallest value of all stock_ids. The only requirement here is that in any given month, there be only one distinct non-missing value for each of the variables being filled in. There can be more than one non-missing value, but they must not disagree.
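
    The requirement in the note can be checked before filling anything in. The following is a minimal sketch, not part of the original answer: it uses a small subset of the example data from #1 and assumes the same variable names. If two non-missing values disagree within a month, -assert- stops with an error instead of letting one of them be copied silently.

```stata
* Subset of the example data from #1 (two months, two stocks)
clear
input float mdate byte stock_id byte mk int ex
732 1 20 101
733 1 30 255
732 2  .   .
733 2  .   .
end

* Within each month, sort so non-missing values come first (missing sorts
* last), then require every non-missing value to equal the first one.
foreach v of varlist mk ex {
    by mdate (`v'), sort: assert `v' == `v'[1] if !missing(`v')
}
```

    Months in which a variable is missing for every stock pass the check trivially, because the -if- condition excludes all of their observations.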



    • #3
      Clyde Schechter, what's the difference between "by [varlist], sort" and "bys [varlist]"?

      I've always used bys/qbys, presuming it meant the same thing as the code you used.



      • #4
        There is no difference. In the very early days of Stata, only -by [varlist], sort- existed. The -bysort [varlist]- and abbreviation -bys [varlist]- were introduced later, after I had already been using -by [varlist], sort- several times a day every day for some years. The habit stuck with me, especially since I'm a good, fast typist and not easily enticed to change habits by the prospect of sparing a couple of keystrokes. In fact, it's even more ingrained than a mere habit. My fingers just go and type it out of muscle memory without me even consciously thinking about it.
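
        As an illustration of the equivalence, here is a sketch that applies the fill from #2 in both forms to a subset of the example data from #1. The helper variables mk_a and mk_b are made up for this comparison and are not part of the original thread.

```stata
* Subset of the example data from #1
clear
input float mdate byte stock_id byte mk
732 1 20
733 1 30
732 2  .
733 2  .
end

* Form 1: by with the sort option
generate byte mk_a = mk
by mdate (mk_a), sort: replace mk_a = mk_a[1]

* Form 2: bysort prefix; same sort, same by-groups
generate byte mk_b = mk
bysort mdate (mk_b): replace mk_b = mk_b[1]

* Both forms should have filled the variable identically
assert mk_a == mk_b
```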

        More generally I'm a creature of habit. I'm one of the few people who consistently codes -gen byte dichotomous_variable = ...-. And I always -compress- any newly created data set before I -save- it. Now, with the memory available in modern computers, it is rarely necessary to care about sparing memory, nor disk space, at least not in statistical applications. But I started programming back in the early 1960's when memory was very expensive and that of a really large computer was denominated in modest numbers of kilobytes, so you had to agonize over every bit you used when programming anything more than a toy problem. And the habit of using the smallest amount of memory possible has largely stuck with me.
        Last edited by Clyde Schechter; 07 Apr 2022, 23:24.
