
  • Add text to graphs with by groups (marginsplot example)

    Hello Statalist,

    I would like to add text to a marginsplot with by groups and could not find an option to add different text based on the by group. Here is an example of code to add the text "p-value" to the graphs but what I would like to do is to add the actual p-values to each graph which would be different for girls and boys. Worst case scenario, I can manually do this with the Graph Editor but would ideally like to be able to automate it.

    Code:
    use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear
    
    regress standlrt i.girl i.schav
    margins girl#schav
    marginsplot, by(girl) recast(scatter) scheme(s1mono) byopt(title(" ")) ///
                 text(0.4 1.5 "p-value = ", placement(w))
    [Attached image: Graph.png]


  • #2
    Code:
    use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear
    regress standlrt ib0.girl i.schav
    local pv: display %06.3fc (2 * ttail(e(df_r), abs(_b[1.girl]/_se[1.girl])))
    margins girl#schav
    marginsplot, by(girl) recast(scatter) scheme(s1mono) byopt(title(" ")) ///
                 text(0.4 2.0 "p-value = `="`pv'"'", placement(w))



    • #3
      Thank you for the response. However, I think the example I gave above was too simplistic, and this isn't quite what I was looking for. A better example would involve a model with an interaction term, since what I am looking for is a way to display different text depending on the by group (here, sex). Here is code that builds on what you did. The problem is that the same text is shown for girls and boys, when I want to display a different p-value for each group. The p-values happen to be the same here, but in my real dataset they won't be.

      Code:
      use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear    
              
      regress standlrt ib0.girl ib0.girl#c.schav
      
      local pv1: display %06.3fc (2 * ttail(e(df_r), abs(_b[0b.girl#c.schav]/_se[0b.girl#c.schav])))
      local pv2: display %06.3fc (2 * ttail(e(df_r), abs(_b[1.girl#c.schav]/_se[1.girl#c.schav])))
      
      margins girl, at(schav=(1(1)3))
      marginsplot, by(girl) scheme(s1mono) byopt(title(" ")) ///
                   text(0.4 2.5 "p-value (boys) =  `="`pv1'"'", placement(w)) ///
                   text(0.3 2.5 "p-value (girls) =  `="`pv2'"'", placement(w))
      [Attached image: Graph.png]



      • #4
        I fail to see how in the binary case, changing base alters the p-value. Come up with an example that depicts your real data.



        • #5
          This example could have had different p-values. Perhaps it's confusing because of how I re-parameterized the model, but the first p-value is for the effect of a one-unit increase in schav on standlrt among boys, and the second is for the effect of a one-unit increase in schav on standlrt among girls. Here are the regression results, showing that these are two different test statistics (both just happen to be highly significant in this dataset).

          HTML Code:
                Source |       SS       df       MS              Number of obs =    4059
          -------------+------------------------------           F(  3,  4055) =  125.07
                 Model |  339.044571     3  113.014857           Prob > F      =  0.0000
              Residual |  3664.14845  4055  .903612441           R-squared     =  0.0847
          -------------+------------------------------           Adj R-squared =  0.0840
                 Total |  4003.19302  4058  .986494091           Root MSE      =  .95059
          
          ------------------------------------------------------------------------------
              standlrt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                1.girl |   .1924711   .1033071     1.86   0.063    -.0100676    .3950098
                       |
          girl#c.schav |
                    0  |   .4742856   .0364683    13.01   0.000     .4027876    .5457835
                    1  |   .4104038   .0295014    13.91   0.000     .3525649    .4682427
                       |
                 _cons |  -1.039263   .0786921   -13.21   0.000    -1.193542   -.8849829
          ------------------------------------------------------------------------------
          I can't share my own data because it's confidential, but let's say instead that I want to display the beta estimate associated with schav for boys and for girls, which are different numbers. I think it should be straightforward to extrapolate to my real data if there is a way to display different text by sex, which is the part I really can't figure out.

          Code:
          use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear    
                  
          regress standlrt ib0.girl ib0.girl#c.schav
          
          local beta1: display %06.3fc _b[0b.girl#c.schav]
          local beta2: display %06.3fc _b[1.girl#c.schav]
          
          margins girl, at(schav=(1(1)3))
          marginsplot, by(girl) scheme(s1mono) byopt(title(" ")) ///
                       text(0.4 2.5 "beta (boys) =  `="`beta1'"'", placement(w)) ///
                       text(0.3 2.5 "beta (girls) =  `="`beta2'"'", placement(w))
          [Attached image: Graph.png]



          • #6
            Unless you are talking about two different models, changing the base of a binary variable does not alter the p-value, only the point of reference. That said, I see that your question is how to make text appear in one subgraph and not the other when using the by() option in marginsplot. I do not know of any option that applies to specific subgraphs. However, you can use

            Code:
             
            marginsplot, gr(girl) scheme(s1mono)
            which plots a separate graph for each group, and then combine these using graph combine.



            • #7
              Thank you Andrew, it sounds like no ideal solution exists. Perhaps I'm missing something, but wouldn't plotting the graphs separately as you showed have the same issue? I don't see how you could use that code to write group-specific commands.



              • #8
                Also, feel free to ignore this because it is neither relevant to the question at hand nor specific to Stata. I agree that changing a reference/base group will not change p-values, but that is not actually what I did above. There is a fun trick for re-parameterizing interaction terms so that the model directly reports stratified effect estimates. I'm not sure it is widely known or used, but I have seen it come up in a few courses I have taken. I've shown the regression models behind it below:

                The usual interaction regression model (where X and Z are some binary variables):
                Y=b0+b1Z+b2X+b3XZ

                b2 is the effect of X on Y when Z=0
                b2+b3 is the effect of X on Y when Z=1


                A re-parameterized interaction regression model:
                Y=b0+b1Z+b2X(1-Z)+b3XZ

                b2 is the effect of X on Y when Z=0
                b3 is the effect of X on Y when Z=1

                This is convenient because one doesn't have to add multiple beta coefficients to get stratified estimates and then work through a formula for their combined p-value/confidence interval; Stata outputs them directly. This is basically what the model I ran above did. I think you would agree that in the second model b2 and b3 are different estimates with different test statistics, p-values, etc.
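
One way to check this equivalence in Stata, as a minimal sketch using the tutorial dataset from earlier in the thread: fit the usual interaction model, recover the slope among girls with lincom, and compare it with the coefficient that the re-parameterized model reports directly. Here X is schav and Z is girl; the scalar names are made up for illustration.

```stata
use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear

* Usual parameterization: main effects plus interaction
regress standlrt i.girl##c.schav

* Slope of schav among girls = b2 + b3 in the notation above
lincom schav + 1.girl#c.schav
scalar slope_girls_usual = r(estimate)
scalar se_girls_usual    = r(se)

* Re-parameterized model: one schav slope per level of girl
regress standlrt ib0.girl ib0.girl#c.schav
scalar slope_girls_reparam = _b[1.girl#c.schav]
scalar se_girls_reparam    = _se[1.girl#c.schav]

* The two estimates and standard errors should agree up to rounding
display "usual:   " scalar(slope_girls_usual)   "  (SE " scalar(se_girls_usual)   ")"
display "reparam: " scalar(slope_girls_reparam) "  (SE " scalar(se_girls_reparam) ")"
```

If the two lines match, the re-parameterized coefficient is the same quantity that lincom computes from the usual model, just read straight off the regression table.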



                • #9
                  Thanks for the clarification. The advice in #6 is doable, just perhaps a bit time-consuming. Using your example from #5:

                  Code:
                  use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear    
                          
                  regress standlrt ib0.girl ib0.girl#c.schav
                  
                  local beta1: display %06.3fc _b[0b.girl#c.schav]
                  local beta2: display %06.3fc _b[1.girl#c.schav]
                  
                  margins girl, at(schav=(1(1)3))
                  * gr(girl) draws one graph per level of girl instead of by-panels
                  marginsplot, gr(girl) scheme(s1mono) title(" ") ytitle(" ") xtitle(" ") ///
                      text(0.4 2.5 "beta (boys) =  `="`beta1'"'", placement(w))
                  gr save _mp_1, replace
                  
                  marginsplot, gr(girl) scheme(s1mono) title(" ") ytitle(" ") xtitle(" ") ///
                      text(0.4 2.5 "beta (girls) =  `="`beta2'"'", placement(w))
                  gr save _mp_2, replace
                  
                  * The line-continuation marker needs a blank before it
                  gr combine  _mp_1.gph  _mp_2.gph, ycommon xcommon l1title("Linear Prediction") ///
                      b1title("School Average LRT Score (3 categories)")
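
As a variation on the approach above, here is a rough sketch that draws each group's graph from its own margins call, so each marginsplot command produces only one graph and the annotation can be set per group. It assumes that fixing girl inside at() is acceptable for your model; the graph names mp_0 and mp_1 and the panel titles are made up for illustration. Keeping the graphs in memory with name() also avoids writing .gph files.

```stata
use "http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial.dta", clear

regress standlrt ib0.girl ib0.girl#c.schav

* Group-specific slopes, formatted for display (0 = boys, 1 = girls)
local beta0 : display %6.3f _b[0b.girl#c.schav]
local beta1 : display %6.3f _b[1.girl#c.schav]
local lab0 "boys"
local lab1 "girls"

forvalues g = 0/1 {
    * Predictions for one group only, across the three schav categories
    margins, at(girl=`g' schav=(1(1)3))
    * One graph per group, each with its own annotation, kept in memory
    marginsplot, scheme(s1mono) title("`lab`g''") ytitle(" ") xtitle(" ") ///
        text(0.4 2.5 "beta (`lab`g'') = `beta`g''", placement(w))          ///
        name(mp_`g', replace)
}

graph combine mp_0 mp_1, ycommon xcommon scheme(s1mono)  ///
    l1title("Linear Prediction")                          ///
    b1title("School Average LRT Score (3 categories)")
```

The same loop should extend to more than two groups by widening the forvalues range and adding the matching beta and label macros.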


                  [Attached image: gr_comb.png]



                  • #10
                    That is a clever solution, thank you!

