  • Rug plot for rdplot

    Dear Statalist Community,

    I am trying to create a rug plot for my rdplot graph, but I'm struggling. I can place the two graphs one on top of the other, but I cannot overlay them. I haven't found a command that allows me to overlay with rdplot.

    Code:
    *RD Graph:
    rdplot dv running if nat==1 & tec==1, c(1990) all ///
    kernel(tri) covs(c1 c2 c3) weights(w)   ///
    p(1) graph_options(aspectratio(1) legend(off))
    *save manually: rdplot_graph
    
    *histogram
    histogram running if nat == 1 & tec == 1, width(1) frequency ///
    aspectratio(1) saving(histo)
    
    gr combine rdplot_graph.gph histo.gph, col(1) iscale(1)
    If useful, when I run rdplot I get:

    Code:
    RD Plot with evenly spaced mimicking variance number of bins using polynomial regression.
    
          Cutoff c = 1990 | Left of c  Right of c        Number of obs  =       1960
    ----------------------+----------------------        Kernel         = Triangular
            Number of obs |        70        1890
       Eff. Number of obs |        70        1890
      Order poly. fit (p) |         1           1
         BW poly. fit (h) |    20.000      57.000
     Number of bins scale |     1.000       1.000
    
    Outcome: dv. Running variable: running.
    ---------------------------------------------
                          | Left of c  Right of c
    ----------------------+----------------------
            Bins selected |        42          35
       Average bin length |     0.476       1.629
        Median bin length |     0.476       1.629
    ----------------------+----------------------
        IMSE-optimal bins |         3           4
      Mimicking Var. bins |        42          35
    ----------------------+----------------------
    Rel. to IMSE-optimal: |
            Implied scale |    14.000       8.750
        WIMSE var. weight |     0.000       0.001
        WIMSE bias weight |     1.000       0.999
    ---------------------------------------------
    
    Covariate-adjusted estimates. Additional covariates included: 10
    
    . *save manually: rdplot_graph
    .
    end of do-file
    Any chance you could kindly help me create a rug plot for my rdplot graph? I was also wondering if the histogram should exclusively include the part of the sample used in the model, and not the overall "running" sample, but maybe I'm wrong.

    Thank you so much!

    Best,
    Cat

  • #2
    The allusion is to rdplot from the Stata Journal, except that the version on SSC is more recent.

    Code:
    SJ-17-2 st0366_1  . .  rdrobust: Software for regression-discontinuity designs
            . . . . .  S. Calonico, M. D. Cattaneo, M. H. Farrell, and R. Titiunik
            (help rdrobust, rdbwselect, rdplot if installed)
            Q2/17   SJ 17(2):372--404
            describes a major upgrade to the Stata (and R) rdrobust package,
            which provides a wide array of estimation, inference, and
            falsification methods for the analysis and interpretation of
            regression-discontinuity designs
    
    SJ-14-4 st0366  . .  Robust data-driven inference in reg.-discontinuity design
            . . . . . . . . . . . . . S. Calonico, M. D. Cattaneo, and R. Titiunik
            (help rdrobust, rdbwselect, rdplot if installed)
            Q4/14   SJ 14(4):909--946
            conducts robust data-driven statistical inference in
            regression-discontinuity designs
    Detail: An earlier rdplot from myself with quite different goals has now been renamed rdistplot.

    The authors of your rdplot don't appear to be active here. You could try contacting them directly.

    A puzzle is that I don't recall ever seeing a histogram regarded as a rug plot. Although Tufte used the term differently in 1983, the sense most familiar to me is a one-dimensional scatter alongside either axis of a twoway plot. The term "rug plot" in this sense was used by Hastie and Tibshirani in 1990 in their Generalized Additive Models book (and perhaps earlier: references welcome) but the idea is much older. Usually in my reading the rug is horizontal, justifying the name.

    Be that as it may -- the essentials for a rug are the variable you want shown and a variable holding their positions along the axis in question.

    Here is a trivial example.

    Code:
    sysuse auto, clear
    scatter mpg weight
    su mpg, meanonly
    gen y = r(min) - 0.04 * (r(max) - r(min))
    
    scatter mpg weight || scatter y weight, ms(|) msize(medlarge) legend(off) ytitle("`: var label mpg'")
    Now you can't do that with rdplot. So, why mention it at all? You might be able to smuggle in some code using graph_opts(addplot()) but I haven't tried it and have no strong presumption that it would work.

    More generally, this sounds like something that many people might want, so you might ask the authors nicely to implement it or to document how it could be done. Be prepared for the authors to respond, as I sometimes do: "That sounds like a good idea, and feel free to write your own code, as we doubt that we will get to that in the foreseeable future."



    • #3
      Dear Nick,

      Thank you so much for your very detailed answer. I really appreciate you taking the time to explain precisely what a rug plot is and for providing this code as an example. I was struggling to even find a Stata example of a rug plot, which is why I was using the histogram as a (bad) approximation.

      Following your suggestion, I will contact the authors. If they respond, I will share their reply here.

      Once again, thank you very much for your time and fantastic help.


      Best wishes,
      Cat





      • #4
        Thanks for the thanks. If it worked the syntax would start with graph_options(addplot()).

        My wording is wrong!

        I said "a variable holding their positions along the axis in question", but that should have been "a variable holding their (identical) positions on the opposite axis".

        A minimum for the original authors is to allow addplot() if their existing syntax does not.
        Last edited by Nick Cox; 15 Nov 2024, 06:52.



        • #5
          Dear Nick,

          Thank you very much for the update.

          Kind regards,
          Cat



          • #6
            My solution:

            Dear All,

            I did email the authors who were very responsive. They shared the code on how to manually create the rdplot, and that's what I did. Then, I included Nick's code for the rug plot, and now I think everything works.

            Here is the code I used, in case it might be useful:

            Note: The only caveat is that I had to remove my covariates, but that did not alter my graph in any case. That's because the code shared by the authors did not include the covariate adjustment.

            Best,
            Cat


            Code:
            *Original Plot:
            *rdplot dv running if nat==1 & tec==1, c(1990) all ///
            *kernel(tri) covs(c1 c2 c3) weights(w)   ///
            *p(1) graph_options(aspectratio(1) legend(off))
            
            
            global y dv
            global x running
            global c 1990
            sum $x
            global x_min = r(min)
            global x_max = r(max)
            
            *Remove covars, include genvars and hide:
            rdplot dv running if nat==1 & tec==1, c(1990) all ///
            kernel(tri) weights(w)   ///
            p(1) graph_options(aspectratio(1) legend(off)) genvars hide
            
            *For the rug plot:
            sum dv, meanonly
            gen ya = r(min) - 0.04 * (r(max) - r(min))
            
            
            *Final result:
            twoway (scatter rdplot_mean_y rdplot_mean_bin, sort msize(small) mcolor(gs10)) ///
            (function `e(eq_l)', range($x_min $c) lcolor(black) sort lwidth(medthin) lpattern(solid)) ///
            (function `e(eq_r)', range($c $x_max) lcolor(black) sort lwidth(medthin) lpattern(solid)) ///
            (scatter ya rdplot_mean_bin, ms(|)), ///
            xline($c, lcolor(black) lwidth(medthin)) xscale(r($x_min $x_max)) ///
            legend(off) aspectratio(1)
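
            The rug in the code above marks the bin means. A rug more usually marks the individual observations, and it can be limited to the sample behind the plot, which bears on the question about the sample in #1. What follows is only a minimal sketch. It assumes the rdplot call with genvars above has just been run, so that rdplot_mean_y and rdplot_mean_bin exist. The names insample and yrug are hypothetical and introduced here for illustration. rdplot may also drop observations with a missing weight w, which this sketch does not check.

            Code:
            * flag observations that satisfy the if condition and have data
            gen byte insample = nat == 1 & tec == 1 & !missing(dv, running)
            
            * place the rug just below the lowest plotted bin mean
            sum rdplot_mean_y if insample, meanonly
            gen yrug = r(min) - 0.04 * (r(max) - r(min)) if insample
            
            * binned means plus a rug of the raw running variable
            twoway (scatter rdplot_mean_y rdplot_mean_bin if insample, msize(small) mcolor(gs10)) ///
            (scatter yrug running if insample, ms(|) mcolor(black)), ///
            xline(1990, lcolor(black) lwidth(medthin)) legend(off) aspectratio(1)
            The rug height here is taken from the range of the bin means rather than from the raw outcome, so that the rug sits close to the plotted points. The two function plots for the fitted lines can be added to this twoway call in the same way as in the final result above.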



            • #7
              Here's another way to add a rug, which may be easier and is quite general.

              Code:
              . sysuse auto
              (1978 automobile data)
              
              . levelsof weight, local(levels)
              1760 1800 1830 1930 1980 1990 2020 2040 2050 2070 2110 2120 2130 2160 2200 2230 2240 2280 2370 2410 2520 2580 2640 2650 2670 2690 2730 2750 2830 2930 3170 3180 3200 3210 3
              > 220 3250 3260 3280 3300 3310 3330 3350 3370 3400 3420 3430 3470 3600 3670 3690 3700 3720 3740 3830 3880 3900 4030 4060 4080 4130 4290 4330 4720 4840
              
              
              . scatter mpg weight, xtick(`levels', tpos(inside) tlength(*2) tlc(red))
              The tlength(*2) and tlc() suboption calls are indicative, not definitive, and intended just to signal how much can be tuned.
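
              The same device should carry over to a subsample: give levelsof the same if condition as the plot, so that the ticks mark only the observations actually graphed. Here is a minimal sketch on the auto data, in which the restriction to domestic cars is just an illustrative choice.

              Code:
              sysuse auto, clear
              
              * distinct values of the x variable in the plotted subsample only
              levelsof weight if foreign == 0, local(levels)
              
              * draw those values as inside ticks along the x axis
              scatter mpg weight if foreign == 0, xtick(`levels', tpos(inside) tlength(*2))
              With rdplot, one might try passing the same xtick() call inside graph_options(). That is untested here and rests on the assumption that rdplot hands such options through to its underlying twoway graph. A running variable with very many distinct values would also make for a long macro.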


              [Image: scatter_rug.png]
