  • Box plot - Only label most extreme outliers

    Dear Statalists,

    after going through several tutorials and other material on box plots in Stata, I managed to get to the following output:
    Code:
    graph box ROA, box(1, fcolor(gray) lcolor(gray)) alsize(99) cwhisker marker(1, mlab(Name)) title(Boxplot ROA, color(black)) scheme(s1mono)
    As I have numerous extreme values in both directions, is there any way to add labels only to, say, the 3 most extreme values in each direction? Maybe something like mlabel if ...?
    [Attached image: Graph.jpg]

    Last edited by Jon Hoefer; 24 Feb 2020, 06:30.

  • #2
    It's programmable, but the results would be ugly, and personally I don't find the project tempting. I can offer this using extremes (SSC).

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . extremes price make, n(3)
    
      +-------------------------------+
      | obs:   price   make           |
      |-------------------------------|
      |  34.   3,291   Merc. Zephyr   |
      |  14.   3,299   Chev. Chevette |
      |  18.   3,667   Chev. Monza    |
      +-------------------------------+
    
      +-------------------------------+
      |  27.   13,594   Linc. Mark V  |
      |  12.   14,500   Cad. Eldorado |
      |  13.   15,906   Cad. Seville  |
      +-------------------------------+
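For readers who still want labels on the plot itself, one possible route is to copy the name into a new string variable only for the 3 lowest and 3 highest observations and pass that variable to mlabel(). This is a minimal sketch on the auto data, not a tested solution for the ROA data; the variable names rank_low, rank_high and top3_label are made up for illustration.

```stata
sysuse auto, clear

* Rank each observation from the bottom and from the top.
* The unique option breaks ties arbitrarily, so exactly 3 are picked per side.
egen rank_low  = rank(price), unique
egen rank_high = rank(-price), unique

* Keep the name only for the 3 lowest and 3 highest values; all others stay empty.
generate top3_label = make if rank_low <= 3 | rank_high <= 3

* graph box draws markers only for outside values, so only those can show a label.
graph box price, marker(1, mlabel(top3_label))
```

In the auto data the lowest prices are not outside values, so only labels at the upper end should appear. With many outliers close together, the labels may still overlap.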
    Don't ROA people ever plot on cube root or asinh scale?
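A rough sketch of what plotting on such a scale could look like follows. Both transformations are defined for negative values, which matters for a variable like ROA. The auto data have no variable that takes both signs, so the deviation of price from its mean is used here as a stand-in; dev, dev_asinh and dev_curt are illustrative names.

```stata
sysuse auto, clear

* Stand-in for a variable with positive and negative values
summarize price, meanonly
generate double dev = price - r(mean)

* Inverse hyperbolic sine: roughly linear near zero, roughly logarithmic in the tails
generate double dev_asinh = asinh(dev)

* Signed cube root: raising a negative number to the power 1/3 gives missing in Stata,
* so take the root of the absolute value and restore the sign
generate double dev_curt = sign(dev) * abs(dev)^(1/3)

graph box dev_asinh, name(asinh_scale, replace)
graph box dev_curt, name(curt_scale, replace)
```

Either scale pulls in the tails, so the box is less squashed by a few extreme values. The axis labels are then in transformed units, which needs explaining in a caption.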



    • #3
      Dear Nick, thanks for your suggestions. I had already added the hilo command, but the extremes command is definitely even better. Thanks for sharing.
      One more question, not related to Stata: can you think of a good scientific source on the handling of outliers? I will in fact include them, because after investigating what drives the extreme values I found that all of them are explainable (they are due to accounting reasons), and hence I see no need to exclude them. I have some good sources, but they are not in English, so I would prefer English ones, in case you have any in mind.



      • #4
        https://www.wiley.com/en-us/Outliers...-9780471930945 seems to me the leading work.



        • #5
          Thank you! Just a quick follow-up question: I read up on some of the standard procedures for outliers in Stata, which involve post-regression procedures. However, as I run several panel models with robust standard errors, none of them seem to work. Does using robust standard errors (to be precise, clustered at the firm level) also reduce the impact of outliers as a byproduct, so that I can stop right here, or is this insufficient?



          • #6
            Robust standard errors leave the parameter estimates unchanged, so in that sense they have nothing to do with treating outliers differently.
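A small check along these lines can make the point concrete. It is only a sketch on the auto data with a plain regression and vce(robust); the same comparison could be run with vce(cluster ...) on a panel model, provided both fits use the same observations. The matrix names b_plain and b_robust are illustrative.

```stata
sysuse auto, clear

* Conventional standard errors
regress price weight mpg
matrix b_plain = e(b)

* Robust standard errors: same model, same sample
regress price weight mpg, vce(robust)
matrix b_robust = e(b)

* Relative difference between the two coefficient vectors; it should be zero
display mreldif(b_plain, b_robust)
```

Only the standard errors, test statistics and confidence intervals differ between the two fits. Any influence that outliers have on the coefficients is the same in both.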
