  • Maintaining equality of equations even after winsorizing

    Dear all,
    I have a dataset in which the following identity holds:
    Income + borrowings = consumption + savings.
    Assume I have a panel of households. I would like to winsorize all four variables within some range (say the 1st and 99th or the 5th and 95th percentiles) while maintaining the equality after winsorization. That is, the equality should not be affected by winsorization, but at the same time I need to deal with outliers.
    Is it possible to winsorize in such a manner? Any advice in this regard would be very helpful. Sorry if I am asking a naive question.

  • #2
    If Income + borrowings = consumption + savings was true before Winsorizing, it will still be true after Winsorizing in any observation where either

    1. none of the variables were Winsorized in those observations

    2. changes by Winsorizing cancel each other (e.g. income and savings are both reduced by the same amount)

    It should be easy enough to count before and after. Making the point with shorter variable names: suppose you Winsorize i b c s to iw bw cw sw. Then you can run

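    Code:
    count if i + b == c + s
    count if iw + bw == cw + sw
    and look for the reduction in the number of observations. (Exact comparison with == can miscount if the variables carry rounding error, so a small tolerance may be safer.)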
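
    The check above can be tried on made-up data. What follows is only a minimal sketch under stated assumptions: the simulated lognormal variables, the seed, and the 1st/99th percentile cutoffs are all illustrative, and the Winsorizing is done by hand with summarize and min()/max() rather than with any particular user-written command. Savings is built as the residual so that the identity holds by construction, and reldif() is used instead of == to allow for rounding error.

    ```stata
    * Toy data (assumption): four components with i + b = c + s built in
    clear
    set seed 12345
    set obs 1000
    generate double i = exp(rnormal(10, 1))
    generate double b = exp(rnormal(8, 1.5))
    generate double c = exp(rnormal(9.5, 1))
    generate double s = i + b - c      // savings defined as the residual

    * Winsorize each variable separately at its own 1st and 99th percentiles
    foreach v of varlist i b c s {
        quietly summarize `v', detail
        generate double `v'w = min(max(`v', r(p1)), r(p99))
    }

    * Count violations of the identity before and after Winsorizing,
    * using a relative tolerance rather than exact equality
    count if reldif(i + b, c + s) > 1e-8
    count if reldif(iw + bw, cw + sw) > 1e-8

    * Flag the observations in which the identity no longer holds
    generate byte broken = reldif(iw + bw, cw + sw) > 1e-8
    ```

    Because each variable is clipped at its own percentiles, a change on one side of the identity is not generally matched by an equal change on the other, so the second count should be expected to exceed the first whenever any variable was altered.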

    I have a small stake here insofar as people kept asking for Winsorizing on this list and in 1998 I posted a command on SSC to do it, more as a programming puzzle than because I wanted it myself. The extent to which people want to use such methods to clobber their data is nevertheless alarming to me. In my field outliers are out-and-out errors to be omitted altogether or genuine extremes to be accommodated in the analysis, but I do gather that in some parts of economics attitudes differ.



    • #3
      Thank you very much, Nick. In my case, there were observations of the first kind (none of the four variables was winsorized), but the second condition failed for many observations, i.e. a reduction on the LHS was not matched by an equal reduction on the RHS. Hence the equality was violated in many observations after winsorizing.
      I agree with your point on winsorization, but most economics articles require winsorizing to make sure that results are not driven by a few influential values.
      Once again, thanks, Nick.



      • #4
        "most economics articles require winsorizing"
        That is a testable hypothesis, and FWIW I much doubt it.

        I think the approach would get flagged in (more) introductory texts if it were true. Jeff Wooldridge and any others may wish to comment.



        • #5
          The potential problem is that a few influential cases may drive the results. You need some sensible solution. Any sensible solution will do, and any non-sensible solution will be frowned upon, even in economics. Winsorizing is a possible "solution" for some situations but not all. In and of itself it is just a mindless algorithm, so applied on its own it is absolutely horrible. At the very least you should have identified the influential cases yourself, inspected all of them carefully, and decided what is "going on" with each case and what is the best way of handling it. Personally, I would never use winsorizing for the main analysis, but maybe for a sensitivity analysis. Anyhow, winsorizing is a non-sensible solution for your application, as you already noticed, because it breaks the identity. So that is the end of that discussion; you need to find another way of handling influential cases.
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------
