  • Winsorizing variables at the 1% and 99% levels

    How do I winsorize variables at the 1% and 99% levels for each year? I want to winsorize to address problems caused by small denominators and to control for the effect of potential outliers.

  • #2
    Try this:
    Code:
    ssc install winsor
    help winsor
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.


    • #3
      winsor2 from SSC (same site, different author) is closer to what is wanted here.


      • #4
        You will probably miss most outliers if you winsorize 1% in each tail. Studies of high-quality data generally show percentages of gross errors higher than 1% in each tail, sometimes much higher (Hampel et al., 1986). For that reason, I disagree with the default 1% winsorization levels in winsor2. In the robustness literature, you will commonly see 10% or even 20% winsorized (or trimmed) means.

        Also, winsorizing and trimming can be bettered by other methods which adapt to likely outliers, and which do not require much of an advance guess about how many there are.


        If you winsorize a variable that is destined to be the response in a regression, you will probably be altering the wrong observations. You should be reducing the influence of very large residuals, not of the original values. For regression, the robust regression package mmregress by Verardi and Croux (2009) is superior (locate it with findit mmregress). In providing a resistant fit, mmregress also identifies outliers and high leverage points.
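
        As a rough illustration of the point about residuals, the sketch below fits the same model three ways on Stata's bundled auto dataset. The dataset and the variables price, mpg, and weight are chosen only for illustration. rreg is Stata's built-in robust regression and is shown as a stand-in, not as the MM-estimator; the mmregress line assumes the package has already been installed and that it accepts the usual depvar-indepvars syntax with default options.

        Code:
        sysuse auto, clear
        * ordinary least squares as the baseline
        regress price mpg weight
        * built-in robust regression: downweights observations with large residuals
        rreg price mpg weight
        * MM-estimator (assumes mmregress is installed; default options assumed)
        mmregress price mpg weight
        Comparing the coefficients across the three fits gives a first impression of how much a few extreme observations drive the least-squares estimates.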

        References:

        Verardi, V., and C. Croux. 2009. Robust regression in Stata. Stata
        Journal 9, no. 3: 439-453.

        Hampel, Frank, Elvezio Ronchetti, Peter Rousseeuw, and Werner Stahel.
        1986. Robust Statistics: The Approach Based on Influence Functions
        (Wiley Series in Probability and Mathematical Statistics). New York:
        John Wiley and Sons.
        Last edited by Steve Samuels; 08 Jan 2015, 20:56.
        Steve Samuels
        Statistical Consulting

        Stata 14.2


        • #5
          Hi Basiem,

          Try the following:

          Code:
          winsor2 varname, cuts(1 99) by(year)
          Note that the new winsorized variable will be named varname_w (i.e. the default suffix is _w). If you want a different suffix, try:

          Code:
          winsor2 varname, suffix(_w) cuts(1 99) by(year)
          Replace _w inside suffix() with whatever suffix you want the winsorized variable to carry.
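
          For anyone who wants to see what winsorizing by year amounts to, here is a minimal sketch using only built-in commands. The names p1, p99, and varname_manual are hypothetical, and the percentiles are computed with egen's pctile() function, which may not match winsor2's own percentile calculation exactly, so small differences from varname_w are possible.

          Code:
          * 1st and 99th percentiles of varname within each year
          bysort year: egen p1  = pctile(varname), p(1)
          bysort year: egen p99 = pctile(varname), p(99)
          * start from the original values; missing values stay missing
          generate varname_manual = varname
          * pull values below the 1st percentile up to it
          replace varname_manual = p1  if varname < p1  & !missing(varname)
          * pull values above the 99th percentile down to it
          replace varname_manual = p99 if varname > p99 & !missing(varname)
          drop p1 p99
          The !missing() conditions matter because Stata treats missing values as larger than any number, so without them every missing observation would be set to the 99th percentile.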


          • #6
            You might want to winsorize by(id year). This makes more sense to me (we don't know any details about your data).

            Code:
             
             winsor2 varname, suffix(_w) cuts(1 99) by(id year)


            • #7
              Hi, sorry to bring this thread back up. I am investigating the effect of IR derivative usage on firm value, where my focus variable is IR derivatives and there are several controls:

              Code:
               .  regress firm value lnassets IRderivatives 10 bookleverage_w1 roa_w1 cratio_w1 rnd_rev_w1  div_yield_w1
              This regression uses the winsorized variables, and I found that the coefficient on IR derivatives is positive and significant at the 0.05 level.

              Can I use the same regression without the winsorized variables as a robustness check?

              I ran the same regression again without the winsorized variables and found that, while the magnitudes of all the control coefficients changed, their signs and significance stayed the same, and the IR coefficient was 0.1 and still positive and significant. Can I claim that the results are not sensitive to outliers? Thanks.
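
              One way to lay the two fits side by side is sketched below. The variable names y, x, c1, c2, c1_w1, and c2_w1 are hypothetical placeholders for the response, the focus variable, and the raw and winsorized controls; the stored-estimate names winsorized and raw are also just labels.

              Code:
              * fit with winsorized controls and store the results
              regress y x c1_w1 c2_w1
              estimates store winsorized
              * same specification with the original (unwinsorized) controls
              regress y x c1 c2
              estimates store raw
              * coefficients from both fits in one table, with significance stars
              estimates table winsorized raw, b(%9.3f) star(0.10 0.05 0.01)
              Because the raw and winsorized controls have different names, they appear on separate rows of the table; the row for x is the one to compare directly.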
