  • How to use collapse in my panel data set?

    I have a panel data set in which every id has many observations. I also created a dummy variable, called non, which takes the same value within each id. In other words, for a given id, all of its observations have a single value (0 or 1) of non.

    Now I want to know how many ids have non equal to 0 and how many have non equal to 1. I tried the following, but it did not work:
    Code:
    collapse (min) non, by(id)
    distinct(id) if non == 1
    The collapse command did not give the result I expected: after running it, every id has non equal to zero.
    Why?

  • #2
    Well, first, are you sure that the variable non is correctly constructed so that it is truly always 0 or always 1 within an id? You can check that with
    Code:
    by id (non), sort: assert non[1] == non[_N]
    If that runs without error, then non is truly invariant within id. Otherwise, the problem lies with non, and you will need to revisit how you created it.

    If non is correctly created, then the results you are getting imply that non is in fact zero in every id. Why are you certain that isn't correct? If it really can't be correct, then, again, the problem is with how non was created--there is nothing wrong with the code you show in #1.

    As an aside, you don't need to invoke -distinct- in that second command. After -collapse- you will have exactly one observation per id, so -count if non == 1- will be sufficient for the purpose at hand.
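
    A minimal sketch of that counting step, assuming the variable names id and non from #1 and that non really is constant within id. The variable first in the second variant is a hypothetical helper name, not something from the original data:
    Code:
    * Variant 1: reduce to one observation per id, then count
    preserve
    collapse (min) non, by(id)
    count if non == 1
    count if non == 0
    restore

    * Variant 2: keep the panel intact and tag one observation per id
    egen byte first = tag(id)
    tabulate non if first
    The preserve/restore pair is there only so the full panel is still in memory afterwards. The second variant avoids collapsing at all, which may be more convenient if the counts are just a quick check.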


    • #3
      You're right: non was created incorrectly. I'm interested in your assert code.

      I think assert non[1] == non[_N] asserts that the first observation is the same as the last observation within each group, not that every observation is the same within each group?
      I also changed it to assert non[1] == non[_n]. Does that give the same outcome?


      • #4
        Yes, assert non[1] == non[_N] asserts that the first observation is the same as the last within each group. But the by prefix, with its sort option, also sorts the data by id, and then by non within id. Because the data are sorted by non within id, the first and last values being equal implies that all values in that id are equal.

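        Yes, assert non[1] == non[_n], or even more simply assert non == non[1], will produce the same result when run under the by id prefix. But it will take much longer if your data set is large, because it compares every observation with the first one in its group rather than making a single comparison per group.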
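
        To make the equivalence concrete, here is a small sketch on a made-up toy data set (the values are invented purely for illustration). The variable varies is a hypothetical helper for locating offending ids when the assertion fails:
        Code:
        clear
        input id non
        1 0
        1 0
        2 1
        2 1
        2 1
        end

        * One comparison per id: relies on non being sorted within id
        by id (non), sort: assert non[1] == non[_N]

        * One comparison per observation: no sort on non needed
        by id, sort: assert non == non[1]

        * If an assertion fails, flag the ids where non is not constant
        by id (non), sort: generate byte varies = non[1] != non[_N]
        list id non if varies
        On this toy data both assertions pass, since non is constant within each id. On real data the last two lines show which ids need to be revisited.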


        • #5
          See also https://www.stata.com/support/faqs/d...ions-in-group/ for discussion.
