
  • #16
    That's a fair point!



    • #17
      Thanks very much to both of you.

      Nick is correct that I'm using -rangestat- to produce several summaries in one command line. I have read many of the posts dealing with -rangestat- by both of you, and I've read several other reference documents on -rangestat- (even one in which Clyde is cited), so I believe -rangestat- was originally written for time-series data. Specifically, I believe, for what I would call a "rolling analysis" of time-series data. I definitely know I'm only using a fraction of its capabilities.

      Yes, both EMPSTRUCTURE and COUNTY are coded as numbers, so that hasn't been a problem. I don't know what Clyde means by "meaningful numeric value." There are currently only 5 different types of EMPSTRUCTURE, for example, and each gets its own number.

      I've not been aware of -collapse- before, but I'm going to look into it. Thanks, Clyde. Currently, after -rangestat- is run, my code: 1) creates a temporary variable to number the observations, 2) drops all observations with a number greater than one, and then 3) drops the temporary variable. That leaves me with only one observation with all of the -rangestat- statistical data.
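
      A minimal sketch of those three steps, assuming they are applied to the whole dataset in memory; the helper name obsno is made up for illustration and is not from my actual do-file:

      ```stata
      * Hypothetical sketch of the three steps described above
      generate long obsno = _n      // 1) number the observations
      drop if obsno > 1             // 2) keep only the first observation
      drop obsno                    // 3) remove the helper variable

      * The same result without a helper variable
      * keep in 1
      ```

      If the statistics differ across groups, keeping only the first observation would discard every group but one, so this only suits the case where a single row of results is wanted.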

      Before I run Nick's suggested code, I'm going to learn about -collapse- to see if that's a cleaner way of getting the -rangestat- data into only one observation.

      Thanks, again, to both of you.



      • #18
        The original motivation for rangestat was, if anything, to cope with windows for irregular time series. But it was deliberately not tied to time series at all: you can have windows in terms of any variable, regularly spaced or not, single-valued or not.

        A more crucial point for you is that it is geared to producing results as new variables aligned with the existing dataset. So if your main motivation is a reduced dataset, then collapse, statsby, or runby (SSC) sounds more suitable.
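
        As a rough sketch of the collapse route, assuming a hypothetical outcome variable wage (not named anywhere in this thread) and grouping on the EMPSTRUCTURE and COUNTY variables mentioned earlier:

        ```stata
        * wage is a hypothetical variable; EMPSTRUCTURE and COUNTY are from the thread
        preserve                      // collapse replaces the data in memory
        collapse (mean) mean_wage = wage (sd) sd_wage = wage (count) n_wage = wage, ///
            by(EMPSTRUCTURE COUNTY)
        list, sepby(EMPSTRUCTURE)     // one observation per EMPSTRUCTURE-COUNTY group
        restore                       // bring back the original dataset
        ```

        Dropping by() would reduce the dataset to a single observation holding the overall summaries.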



        • #19
          I don't know what Clyde means by "meaningful numeric value." There are currently only 5 different types of EMPSTRUCTURE, for example, and each gets its own number.
          Though it is a digression from the main focus of this thread, I think I should explain this. I don't know what "employer type" refers to in your case, but just to illustrate, let me suppose that the five employer types are:
          1. sole proprietorship
          2. unincorporated small business partnership
          3. small privately held corporation
          4. large privately held corporation
          5. publicly traded corporation
          You could code these 1 through 5 as I have done here. But the choice of the numbers 1, 2, 3, 4, and 5 is completely arbitrary. You could have assigned those to these five categories in any order, and it would be an equally valid and useful way of encoding this data. You don't even have to use those 5 particular numbers. You could have used 0 through 4. Or 30, 62, 95, 176, and 32001. The point is that any five distinct non-negative integers could be assigned in any way to these five categories and that would be an equally effective way to work with them.

          Now, the whole point of the -interval(a, b, c)- option in -rangestat- is to identify for inclusion in calculation those observations where variable b <= a <= c (or, if b and c are numbers rather than variables, the value of a in the observations to be included must differ from the value of a in the current observation by an amount that falls between b and c). Well, given the arbitrariness of the numerical coding of this variable, speaking of relationships like <= or >= or between makes no sense. That's why I dislike the use of -rangestat- in this way.
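
          For contrast, here is a sketch of the kind of use -interval()- is designed for, where ordering and distance on the key variable do mean something. The variables year and wage are hypothetical and do not come from this thread:

          ```stata
          * Requires the community-contributed command: ssc install rangestat
          * year and wage are hypothetical variables used only for illustration
          * For each observation, summarize wage over observations whose year
          * lies between the current year minus 2 and the current year
          rangestat (mean) wage (count) wage, interval(year -2 0)
          ```

          With the default naming this should add wage_mean and wage_count as new variables alongside the existing observations.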

          The reason it works in this specific case is that b and c are both set to zero, so that -interval(a, 0, 0)- means that the difference between a in the included observations and a in the current observation must be exactly zero, which is a roundabout way of saying that those values must be equal. It is, in fact, meaningful to speak of arbitrary positive integer codes like these as being equal or not. So this can produce meaningful results. But this is a highly restricted use of -interval()-, and is done to test a relationship, equality, that is more transparently and efficiently tested in other ways. I'd say that using -rangestat- in this way is a bit like using the blade of a table knife to turn a screw. It will work in this case, but a screwdriver would be more appropriate.
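
          To make the comparison concrete, here is a sketch that puts the degenerate-interval form next to one of the more direct alternatives. It assumes a hypothetical outcome variable wage and that EMPSTRUCTURE has no missing values:

          ```stata
          * wage is a hypothetical variable; EMPSTRUCTURE is from the thread
          * Table-knife version: equality expressed as a zero-width interval
          rangestat (mean) wage (sd) wage, interval(EMPSTRUCTURE 0 0)

          * Screwdriver version: group statistics computed directly by group
          bysort EMPSTRUCTURE: egen double wage_mean2 = mean(wage)
          bysort EMPSTRUCTURE: egen double wage_sd2   = sd(wage)
          ```

          Under those assumptions, the two approaches should give the same value in every observation of a given EMPSTRUCTURE group.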



          • #20
            I don't want to prolong this unduly because there is agreement between Clyde Schechter and myself that you can use rangestat in this way but it's not especially recommended: that would be indirect, and hinges, perhaps uncomfortably, on a limiting case, degenerate intervals.

            A key advantage remains being able to get several summaries from one command. Yet it is no accident that the same thing could be said about collapse.

            That said, a careful search of the help for rangestat for 0 0 will show many uses for that device, including some that are quite tricky to achieve otherwise.
