  • Correct wording: margins, pwcompare, average marginal effects, regressions

    Dear Statalist,

    My question concerns the correct wording when discussing results from regressions, margins, average marginal effects, and pwcompare.

    In a simple OLS regression, I would say: the treatment condition (6 versus 3 years) is associated with a statistically insignificant increase in action of 3.5 percentage points. Other variations are "increases the probability" and "participants after 6 years are 3.5 pp more likely to be active compared to 3 years" [I know the estimate is insignificant here; I'm more worried about the correct wording].

    Code:
    Linear regression                               Number of obs     =      2,701
                                                    F(9, 780)         =       3.49
                                                    Prob > F          =     0.0003
                                                    R-squared         =     0.0189
                                                    Root MSE          =     .47867
    
                                      (Std. err. adjusted for 781 clusters in CASE)
    -------------------------------------------------------------------------------
                  |               Robust
           action | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    --------------+----------------------------------------------------------------
              T_C |
         6 years  |    .034639   .0277881     1.25   0.213    -.0199094    .0891874
                  |
           gender |
          female  |  -.1441114   .0360066    -4.00   0.000    -.2147928   -.0734301
      non-binary  |  -.2839578   .1198681    -2.37   0.018    -.5192601   -.0486554
      not stated  |   .0824881   .1462172     0.56   0.573    -.2045377    .3695138
                  |
           device |
          Tablet  |  -.0058154   .1186563    -0.05   0.961    -.2387389    .2271082
      Smartphone  |  -.0275835   .0384659    -0.72   0.474    -.1030924    .0479255
                  |
           system |
         Android  |   .0211903   .0458835     0.46   0.644    -.0688795    .1112601
           Apple  |   .0280029   .1449563     0.19   0.847    -.2565478    .3125536
                  |
    1.instruclick |   .0801797   .0296241     2.71   0.007     .0220274    .1383321
            _cons |   .3630647   .0252171    14.40   0.000     .3135633    .4125661
    -------------------------------------------------------------------------------

    Now, including the interaction between gender and treatment condition:

    Code:
    Linear regression                               Number of obs     =      2,701
                                                    F(11, 780)        =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.0196
                                                    Root MSE          =     .47876
    
                                            (Std. err. adjusted for 781 clusters in CASE)
    -------------------------------------------------------------------------------------
                        |               Robust
                 action | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    --------------------+----------------------------------------------------------------
                    T_C |
               6 years  |   .0364919   .0309036     1.18   0.238    -.0241722     .097156
                        |
                 gender |
                female  |  -.1440245   .0463511    -3.11   0.002    -.2350122   -.0530368
            non-binary  |   .1276467   .0384864     3.32   0.001     .0520976    .2031958
            not stated  |   .1025048   .1851297     0.55   0.580    -.2609067    .4659163
                        |
             T_C#gender |
        6 years#female  |   -.000118   .0725357    -0.00   0.999    -.1425063    .1422702
    6 years#non-binary  |  -.5328814   .0692998    -7.69   0.000    -.6689177   -.3968451
    6 years#not stated  |  -.0520135   .3003779    -0.17   0.863    -.6416584    .5376314
                        |
                 device |
                Tablet  |  -.0019922   .1183719    -0.02   0.987    -.2343574    .2303729
            Smartphone  |  -.0295926   .0386401    -0.77   0.444    -.1054436    .0462583
                        |
                 system |
               Android  |   .0241251   .0460757     0.52   0.601    -.0663219    .1145721
                 Apple  |   .0238317   .1445695     0.16   0.869    -.2599597    .3076231
                        |
          1.instruclick |   .0791287    .029652     2.67   0.008     .0209215     .137336
                  _cons |   .3623816   .0261706    13.85   0.000     .3110084    .4137548
    -------------------------------------------------------------------------------------

    Now, with average marginal effects via -margins T_C, dydx(gender)- after the interaction model, where 1.gender is male and 2.gender is female, I would still say: women after 3 years are 14 percentage points less likely to perform an action than men after the same time frame. Is this correct?

    My question: do I need to say "predicted" in every sentence, e.g. "Women after 3 years are predicted to be 14 percentage points less active than men after the same time frame"? In the Stata guides I consulted, the authors use the word "predicted", but they only discuss one result (https://stats.oarc.ucla.edu/stata/se...actions-stata/), whereas I would like to make more than one comparison. I'm afraid all my sentences will be very hard to read, especially in my results section.


    Code:
    Average marginal effects                                 Number of obs = 2,701
    Model VCE: Robust
    
    Expression: Linear prediction, predict()
    dy/dx wrt:  2.gender 3.gender 4.gender
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    1.gender     |  (base outcome)
    -------------+----------------------------------------------------------------
    2.gender     |
             T_C |
        3 years  |  -.1440245   .0463511    -3.11   0.002    -.2350122   -.0530368
        6 years  |  -.1441426   .0561392    -2.57   0.010    -.2543444   -.0339407
    -------------+----------------------------------------------------------------
    3.gender     |
             T_C |
        3 years  |   .1276467   .0384864     3.32   0.001     .0520976    .2031958
        6 years  |  -.4052347   .0530736    -7.64   0.000    -.5094188   -.3010507
    -------------+----------------------------------------------------------------
    4.gender     |
             T_C |
        3 years  |   .1025048   .1851297     0.55   0.580    -.2609067    .4659163
        6 years  |   .0504913   .2372121     0.21   0.831    -.4151584     .516141
    ------------------------------------------------------------------------------
    For completeness, here is the output of -pwcompare- with the asobserved option, which I chose because I want to understand what actually happened in my data rather than what would have happened had the data been balanced. This output also states that these are marginal linear predictions.

    Code:
    Pairwise comparisons of marginal linear predictions
    
    Margins: asobserved
    
    -------------------------------------------------------------------------------------------------------
                                          |                            Unadjusted           Unadjusted
                                          |   Contrast   Std. err.      t    P>|t|     [95% conf. interval]
    --------------------------------------+----------------------------------------------------------------
                               gender#T_C |
        (male#6 years) vs (male#3 years)  |   .0364919   .0309036     1.18   0.238    -.0241722     .097156
      (female#3 years) vs (male#3 years)  |  -.1440245   .0463511    -3.11   0.002    -.2350122   -.0530368
      (female#6 years) vs (male#3 years)  |  -.1076507   .0562544    -1.91   0.056    -.2180786    .0027773
      (female#3 years) vs (male#6 years)  |  -.1805164   .0459586    -3.93   0.000    -.2707336   -.0902992
      (female#6 years) vs (male#6 years)  |  -.1441426   .0561392    -2.57   0.010    -.2543444   -.0339407
    (female#6 years) vs (female#3 years)  |   .0363739   .0654454     0.56   0.579    -.0920961    .1648438
    -------------------------------------------------------------------------------------------------------
    Of course, I will include proper results tables in the paper so that it is evident how I arrived at the results.

    My question concerns only the proper wording. In which situations can I talk about probability, and where do I need to state predictions explicitly? In the interaction regression, adding the T_C 6 years coefficient to the T_C#gender 6 years#female coefficient gives the (female#6 years) vs (female#3 years) contrast (.0364919 + (-.000118) = .0363739), which (to me) indicates that all these analyses are closely interlinked. The same contrast can also be obtained via:

    Code:
    Average marginal effects                                 Number of obs = 2,701
    Model VCE: Robust
    
    Expression: Linear prediction, predict()
    dy/dx wrt:  1.T_C
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    0.T_C        |  (base outcome)
    -------------+----------------------------------------------------------------
    1.T_C        |
          gender |
           male  |   .0364919   .0309036     1.18   0.238    -.0241722     .097156
         female  |   .0363739   .0654454     0.56   0.579    -.0920961    .1648438
     non-binary  |  -.4963895   .0628739    -7.89   0.000    -.6198117   -.3729673
     not stated  |  -.0155216    .298784    -0.05   0.959    -.6020374    .5709943
    ------------------------------------------------------------------------------
    Note: dy/dx for factor levels is the discrete change from the base level.
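
    The link between these outputs can be checked directly. A minimal sketch, assuming the interaction model is still the active estimation result and that the coding is T_C 0 = 3 years, 1 = 6 years and gender 1 = male, 2 = female, 3 = non-binary (as the dy/dx headers suggest):

    Code:
    * by hand: (6 years) + (6 years#female) from the interaction table
    display .0364919 + (-.000118)
    * the same sum from the stored coefficients, with a standard error
    lincom 1.T_C + 1.T_C#2.gender
    * treatment effect for non-binary participants
    lincom 1.T_C + 1.T_C#3.gender

    The first two should reproduce the .0363739 shown for female, and the last the -.4963895 shown for non-binary.
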
    Thank you very much in advance!



  • #2
    Can you please post the model and margins code that these results tables are associated with?



    • #3
      I think I've done that, haven't I? My question centers on the correct wording of the obtained results: whether I need to state "predicted" before every result if it comes from pwcompare or average marginal effects.



      • #4
        You posted results in code blocks, which is great, but the commands you ran to get those results are not clearly shown. Ideally, you would show them in the same code block as the results.

        First question: why do you talk about OLS regression results in terms of "percentage points" and "increases in the probability"? What is the outcome you are predicting? Is it a 0/1 variable or a continuous variable? Please provide more information about the dependent variable.



        • #5
          Hi Erik, thank you so much for your reply! My apologies for posting incomplete code!

          I have attached similar code and output below; my questions remain, and I would be most grateful for any advice.

          My dependent variable is binary: whether participants take action (coded 1) or not (coded 0), so using regress makes this a linear probability model (LPM). In this regression, all independent variables are categorical: T_C is the treatment condition (treatment or control), gender (female versus male), device (smartphone versus desktop), system (Android versus Apple), and instructions (whether they were consulted).

          Code:
          regress action i.T_C i.gender i.device i.system i.instruclick, vce(cluster CASE)
          
          Linear regression                               Number of obs     =        280
                                                          F(5, 75)          =       2.26
                                                          Prob > F          =     0.0574
                                                          R-squared         =     0.0416
                                                          Root MSE          =     .38767
          
                                             (Std. err. adjusted for 76 clusters in CASE)
          -------------------------------------------------------------------------------
                        |               Robust
                 action | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          --------------+----------------------------------------------------------------
                    T_C |
               6 years  |  -.0045512    .065247    -0.07   0.945      -.13453    .1254275
                        |
                 gender |
                female  |  -.1176268   .0642457    -1.83   0.071    -.2456108    .0103571
                        |
                 device |
            Smartphone  |  -.0492291   .0865668    -0.57   0.571     -.221679    .1232207
                        |
                 system |
               Android  |   .2029163   .1058603     1.92   0.059    -.0079681    .4138008
          1.instruclick |  -.0511101   .0542164    -0.94   0.349    -.1591148    .0568946
                  _cons |    .243605   .0734952     3.31   0.001     .0971952    .3900149
          -------------------------------------------------------------------------------
          Now the interaction between T_C and gender:

          Code:
          
          . regress action i.T_C##i.gender i.device i.system i.instruclick, vce(cluster CASE)
          
          Linear regression                               Number of obs     =        280
                                                          F(6, 75)          =       1.91
                                                          Prob > F          =     0.0909
                                                          R-squared         =     0.0417
                                                          Root MSE          =     .38836
          
                                               (Std. err. adjusted for 76 clusters in CASE)
          ---------------------------------------------------------------------------------
                          |               Robust
                   action | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          ----------------+----------------------------------------------------------------
                      T_C |
                 6 years  |   .0030338   .0952911     0.03   0.975    -.1867958    .1928634
                          |
                   gender |
                  female  |  -.1084788   .0983237    -1.10   0.273    -.3043497     .087392
                          |
               T_C#gender |
          6 years#female  |  -.0183678   .1311875    -0.14   0.889    -.2797067     .242971
                          |
                   device |
              Smartphone  |  -.0472377   .0896108    -0.53   0.600    -.2257516    .1312762
                          |
                   system |
                 Android  |   .2005144    .108331     1.85   0.068     -.015292    .4163208
            1.instruclick |  -.0514302   .0545683    -0.94   0.349    -.1601357    .0572754
                    _cons |   .2388493   .0887248     2.69   0.009     .0621003    .4155982
          ---------------------------------------------------------------------------------
          Now the average marginal effects:

          Code:
          . margins T_C, dydx(gender)
          
          Average marginal effects                                   Number of obs = 280
          Model VCE: Robust
          
          Expression: Linear prediction, predict()
          dy/dx wrt:  2.gender
          
          ------------------------------------------------------------------------------
                       |            Delta-method
                       |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
          1.gender     |  (base outcome)
          -------------+----------------------------------------------------------------
          2.gender     |
                   T_C |
              3 years  |  -.1084788   .0983237    -1.10   0.273    -.3043497     .087392
              6 years  |  -.1268467    .084893    -1.49   0.139    -.2959622    .0422689
          ------------------------------------------------------------------------------
          Note: dy/dx for factor levels is the discrete change from the base level.
          Now the pairwise comparisons:

          Code:
          pwcompare T_C#gender, effects
          
          Pairwise comparisons of marginal linear predictions
          
          Margins: asbalanced
          
          -------------------------------------------------------------------------------------------------------
                                                |                            Unadjusted           Unadjusted
                                                |   Contrast   Std. err.      t    P>|t|     [95% conf. interval]
          --------------------------------------+----------------------------------------------------------------
                                     T_C#gender |
            (3 years#female) vs (3 years#male)  |  -.1084788   .0983237    -1.10   0.273    -.3043497     .087392
              (6 years#male) vs (3 years#male)  |   .0030338   .0952911     0.03   0.975    -.1867958    .1928634
            (6 years#female) vs (3 years#male)  |  -.1238129   .1051052    -1.18   0.243    -.3331931    .0855674
            (6 years#male) vs (3 years#female)  |   .1115127   .0723448     1.54   0.127    -.0326057     .255631
          (6 years#female) vs (3 years#female)  |   -.015334   .0868117    -0.18   0.860    -.1882717    .1576037
            (6 years#female) vs (6 years#male)  |  -.1268467    .084893    -1.49   0.139    -.2959622    .0422689
          -------------------------------------------------------------------------------------------------------
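
          The contrasts above are differences between adjusted predictions (here, predicted probabilities, since action is 0/1). A minimal sketch for displaying the predictions themselves, assuming the interaction model is still the active estimation result:

          Code:
          * predicted probability of action in each T_C x gender cell,
          * treating the covariates as balanced, as the pwcompare output above does
          margins T_C#gender, asbalanced
          * the six pairwise differences computed from those predictions
          margins T_C#gender, asbalanced pwcompare(effects)

          In a linear model like this one, the pairwise differences should agree with the pwcompare table above; only the cell-level predictions depend on how the covariates are averaged.
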
          And lastly margins with respect to the treatment condition:

          Code:
          . margins gender, dydx(T_C)
          
          Average marginal effects                                   Number of obs = 280
          Model VCE: Robust
          
          Expression: Linear prediction, predict()
          dy/dx wrt:  1.T_C
          
          ------------------------------------------------------------------------------
                       |            Delta-method
                       |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
          0.T_C        |  (base outcome)
          -------------+----------------------------------------------------------------
          1.T_C        |
                gender |
                 male  |   .0030338   .0952911     0.03   0.975    -.1867958    .1928634
               female  |   -.015334   .0868117    -0.18   0.860    -.1882717    .1576037
          ------------------------------------------------------------------------------
          I would be most grateful if anyone could comment on the situations in which I can talk about probability and those where I need to state predictions explicitly.
          For example, take the first line of the pairwise comparisons. Should I write "women are 11 percentage points less likely to take action than men in the same 3-year time frame" [although insignificant], or "women are predicted to be 11 percentage points less likely to take action than men in the same 3-year time frame"? I know the estimates are insignificant in this example; I would like to understand the general principle.

          How important is absolutely correct wording? My fear is that including "predicted" in every sentence of my results section will make it difficult to read and understand, but of course I also want to make sure my statements are accurate.
