
  • OLS dummies insignificant; testparm significant - how to proceed?

    Dear Statalist,

    Thank you very much for all your previous help in my project.

    Using dummies in my OLS regression, I found that none of the dummies for my three risk categories were significant when compared to the first (base) risk category. Next, I ran the testparm command to see whether the risk variable overall has a significant effect on my dependent variable - via
    Code:
    testparm i.risk
    I obtained significance at the 10% level (Prob > F = 0.0591).

    Code:
    Linear regression                    Number of obs     =        263
                                                    F(17, 75)         =       1.26
                                                    Prob > F          =     0.2411
                                                    R-squared         =     0.0453
                                                    Root MSE          =     .39724
    
                                      (Std. err. adjusted for 76 clusters in CASE)
    ------------------------------------------------------------------------------
                 |               Robust
       behavior2 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             T_C |
        6 years  |   .0264162   .0632025     0.42   0.677    -.0994895     .152322
                 |
          return |
           2021  |   .0442747   .1048373     0.42   0.674    -.1645719    .2531213
           2020  |  -.0526893   .1108799    -0.48   0.636    -.2735734    .1681948
           2019  |  -.0224256   .1130914    -0.20   0.843    -.2477151     .202864
           2018  |  -.0416401   .1203362    -0.35   0.730    -.2813621     .198082
           2017  |  -.1349324   .0985699    -1.37   0.175    -.3312936    .0614289
           2016  |    .004787    .121219     0.04   0.969    -.2366937    .2462677
           2015  |  -.0362843   .1099075    -0.33   0.742    -.2552313    .1826627
           2014  |  -.0826103   .1086131    -0.76   0.449    -.2989786     .133758
           2013  |  -.0993647   .0987831    -1.01   0.318    -.2961506    .0974213
                 |
            risk |
              2  |   .0736035   .0603799     1.22   0.227    -.0466794    .1938864
              3  |  -.0899614   .0614228    -1.46   0.147    -.2123218     .032399
              4  |  -.0505715   .0697404    -0.73   0.471    -.1895014    .0883585
                 |
           round |
              2  |  -.0419953   .0683583    -0.61   0.541     -.178172    .0941815
              3  |  -.0434012   .0574002    -0.76   0.452    -.1577483    .0709459
              4  |  -.0764311   .0716531    -1.07   0.290    -.2191714    .0663092
                 |
          1.male |   .0250311   .0675967     0.37   0.712    -.1096285    .1596906
           _cons |   .2637478   .0888547     2.97   0.004     .0867402    .4407555
    ------------------------------------------------------------------------------
    Code:
     testparm i.risk
    
     ( 1)  2.risk = 0
     ( 2)  3.risk = 0
     ( 3)  4.risk = 0
    
           F(  3,    75) =    2.59
                Prob > F =    0.0591

    Which analyses can I use to narrow down the effect? I assume the effect is somewhere in the first risk category. I am most grateful for any advice.

  • #2
    -testparm- is addressing the question of homogeneity among risk levels. There's some evidence to suggest that not all risk levels are equal. This is your "overall" test, if you will.

    This is a different question and hypothesis from those related to the p-values reported in the regression table next to each risk level. Those p-values are simple contrasts compared to level 1. Is there something special about level 1 that would lead you to expect some differences from it?

    At a glance, it seems that levels 2 and 3 may be different, but that's probably about it. Another useful way to summarize this is to use margins to see how the risk levels are associated with the outcome.

    Code:
    margins i.risk
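
    If pairwise comparisons among the risk levels are also of interest, one further option (my suggestion, not from the thread, and assuming the regression above is still in memory) is to add the pwcompare option, which reports each pairwise contrast of the adjusted predictions with its test:

    Code:
    margins i.risk, pwcompare(effects)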



    • #3
      In addition to Leonardo's suggestion to use margins, you can also use the contrast command to get tests of contrasts for categorical variables after fitting a model. For example, you can test categories against one another using the a. operator, the ar. operator, and the rb#. operator:
      Code:
      webuse nlswork, clear
      recode birth_yr (41/46 = 0) (47/50 = 1) (51/54 = 2), gen(birth_cat)
      reg ln_w tenure i.union i.birth_cat
      contrast a.birth_cat, effects
      In the regression, you get tests of whether the association differs for each of the non-reference categories as compared to the reference category. The output of the contrast command additionally gives you tests for the adjacent categories:
      Code:
      ------------------------------------------------
                   |         df           F        P>F
      -------------+----------------------------------
         birth_cat |
         (0 vs 1)  |          1        3.12     0.0775
         (1 vs 2)  |          1       15.49     0.0001
            Joint  |          2        7.75     0.0004
                   |
       Denominator |      19005
      ------------------------------------------------
      
      ------------------------------------------------------------------------------
                   |   Contrast   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
         birth_cat |
         (0 vs 1)  |  -.0131724   .0074614    -1.77   0.078    -.0277975    .0014527
         (1 vs 2)  |   .0304255   .0077302     3.94   0.000     .0152737    .0455773
      ------------------------------------------------------------------------------
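      The other operators mentioned above work the same way. As a sketch (these lines are my addition, continuing the same example), ar. reverses the direction of the adjacent contrasts, and rb#. re-runs the reference-category contrasts with a different base level:

      Code:
      contrast ar.birth_cat, effects
      contrast rb2.birth_cat, effects
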
      Last edited by Erik Ruzek; 07 Feb 2024, 13:58. Reason: Fixed grammer



      • #4
        Thank you Leonardo! Is there any way to specify the evidence that not all risk levels are equal from testparm's F-test?
        As far as margins go, isn't that just a linear prediction, where the p-value does not have any real-life implications since the hypotheses test against 0? Therefore, aren't average marginal effects preferred?



        • #5
          Thank you Erik! Will look into this!



          • #6
            Originally posted by Scott Forrester View Post
            Thank you Leonardo! Is there any way to specify the evidence that not all risk levels are equal from testparm's F-test?
            I don't know what you mean by "specify the evidence". -testparm- has given you a test of this, but it only tests whether all 4 risk levels have the same association (not necessarily equal to zero). Yet other tests are possible. You could look at the contrast of the average of levels 1 and 2 vs 3 and 4, or at the average of levels 2, 3 and 4 compared to level 1, to give two ideas. You have to clarify what is meaningful in your study, as these tests may be meaningless in the context of your work.
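
            As a sketch (my own coding, not from the thread), both of those comparisons can be run as user-specified contrasts after the regression; the contrast coefficients just have to sum to zero:

            Code:
            * average of levels 3 and 4 vs average of levels 1 and 2
            contrast {risk -1 -1 1 1}, effects
            * average of levels 2, 3 and 4 vs level 1
            contrast {risk -3 1 1 1}, effects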

            Originally posted by Scott Forrester View Post
            As far as margins go, isn't that just a linear prediction, where the p-value does not have any real-life implications since the hypotheses test against 0? Therefore, aren't average marginal effects preferred?
            Yes, the margins command I gave is for (adjusted) marginal predictions. The margins I showed in #2 might be useful for you to see what your average outcome is at each risk level. Erik showed one way to get an average marginal effect. I wouldn't care so much about the p-values, but rather think about what concept you are trying to demonstrate.



            • #7
              Thank you very much Leonardo!
