  • value labeling of specific observations

    Dear list members,

    extended missing values are useful to keep track of reasons why certain data is missing, or was coded to missing. The reasons may be multiple, and different labels can be created and attached, but the values otherwise behave in the same way, which is also useful.
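
    For readers less familiar with the feature, here is a minimal sketch of what this looks like in practice. The toy data, the variable names (id, income), the survey codes -8 and -9, and the label name income_lbl are all hypothetical, chosen only for illustration.

    ```stata
    * Hypothetical toy data: income with two survey non-response codes
    clear
    input id income
    1 1200
    2 -8
    3 -9
    4 900
    end

    * Recode the two codes to distinct extended missing values
    mvdecode income, mv(-8 = .a \ -9 = .b)

    * Attach a separate label to each extended missing value
    label define income_lbl .a "refused" .b "did not know"
    label values income income_lbl

    * Both are treated as missing, yet remain distinguishable
    tabulate income, missing
    count if income == .a
    ```

    Both .a and .b are excluded from estimation like any other missing value, while the labels keep the reason for missingness visible.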

    However, this would also be useful with non-missing values, most clearly when some data was cleaned, trimmed, rounded, imputed, interpolated etc. The default option here is to create another variable - which may eventually have to be done for some operations, anyways, but perhaps at different stages of the workflow - with consequent benefits.

    This would call for value labels that can be more specific than simply a mapping to a number. I would want, for one variable, to distinguish the value "2" that was interpolated from the value "2" that was measured.

    Is there, or could there conceivably be, a way to do this? I suppose I'm not the first to think about this, so perhaps I'm overlooking something easy with labmask (findit labmask) or something.
    Last edited by Matteo Pinna Pintor; 26 Nov 2024, 05:06.
    I'm using Stata/MP 17

  • #2
    I see the point, but I can't see how it would be compatible with the idea of value labels. As you say, I think you would need to hold the information in another variable.
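
    A minimal sketch of that companion-variable approach, assuming hypothetical toy data and names (t, y, y_src, y_ip, src_lbl) and using linear interpolation as just one possible kind of modification:

    ```stata
    * Hypothetical toy data: y is missing at t == 2
    clear
    input t y
    1 1
    2 .
    3 3
    4 2
    end

    * Record provenance before modifying anything
    generate byte y_src = missing(y)
    label define src_lbl 0 "measured" 1 "interpolated"
    label values y_src src_lbl
    label variable y_src "Provenance of y"

    * Fill the gap by linear interpolation on t
    ipolate y t, generate(y_ip)

    * The value 2 appears once as interpolated (t == 2) and once as measured (t == 4)
    list t y_ip y_src
    tabulate y_ip y_src
    ```

    The provenance lives in y_src rather than in the value labels of y_ip, so the two 2s can be told apart with an if qualifier such as if y_src == 0.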



    • #3
      It would need to be something like "attach label Y to value X if W applies". If W is the existence of another variable, the benefit becomes dubious (although it might still be nice to see the variable color-coded, e.g. so that a new user gets a visual grasp of how much modification is going on, bearing in mind that too much color coding degrades its usefulness). W could instead be something else, like the observation number, but I also feel observation numbers should probably not be given important roles. Uhm. Well, food for thought for Stata staff maybe? A minor issue, to be sure.
      I'm using Stata/MP 17



      • #4
        Let's take it that the primary (only?) role of value labels is to improve display of output -- either in tabular or in graphical form.

        What would Stata be expected to do -- in your example --

        with row or column headers in a table

        or

        with axis labels on a graph

        -- given values of 2, some of which should be labelled and some shouldn't?



        • #5
          I guess the potential benefits would concern the process rather than the output. I would stress the parallel with extended missing values. Labeling them differently can sometimes be useful, even though most of the time it probably makes no difference in graphical or tabular output. And what makes it useful - e.g. distinguishing types of survey non-response - doesn't really differ much from the reasons why a researcher may want to slightly modify existing data values. These are all material aspects of the collection and analysis process that one may want to keep in mind, and to which one may want to check the robustness of results.

          Nothing that can't be done by creating more variables of course. But that's the nature of micro-improvements.
          I'm using Stata/MP 17



          • #6
            I can't see that you're answering my question, but I'll leave it there. This is for StataCorp and might better be echoed in the thread on requests for Stata 19.

