
  • Hypothesis testing: one coefficient larger than the other

    Hello everyone,

    I need some help with a hypothesis test in Stata. I need to test whether one coefficient in my pooled OLS regression is larger than another coefficient,
    i.e. whether B1 > B2.

    I know how to perform F- and t-tests of the hypothesis that two coefficients are equal (B1 = B2), or equivalently that their difference is zero (B1 - B2 = 0). But I need to know how to test whether the difference between them is above zero (meaning that B1 > B2), that is, whether one coefficient is larger than the other.

    How do I set up this test?

    Appreciate your help,
    Camilla

  • #2
    Camilla:
    welcome to the list.
    You may be interested in this (quite recently updated) Stata FAQ: http://www.stata.com/support/faqs/st...-coefficients/, which inspired the following toy example:
    Code:
    . use "C:\Program Files (x86)\Stata14\ado\base\a\auto.dta", clear
    (1978 Automobile Data)
    
    . reg price weight mpg
    
          Source |       SS           df       MS      Number of obs   =        74
    -------------+----------------------------------   F(2, 71)        =     14.74
           Model |   186321280         2  93160639.9   Prob > F        =    0.0000
        Residual |   448744116        71  6320339.67   R-squared       =    0.2934
    -------------+----------------------------------   Adj R-squared   =    0.2735
           Total |   635065396        73  8699525.97   Root MSE        =      2514
    
    ------------------------------------------------------------------------------
           price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |   1.746559   .6413538     2.72   0.008      .467736    3.025382
             mpg |  -49.51222   86.15604    -0.57   0.567    -221.3025     122.278
           _cons |   1946.069    3597.05     0.54   0.590    -5226.245    9118.382
    ------------------------------------------------------------------------------
    
    . test weigh mpg
    
     ( 1)  weight = 0
     ( 2)  mpg = 0
    
           F(  2,    71) =   14.74
                Prob > F =    0.0000
    
    . test weight-mpg=0
    
     ( 1)  weight - mpg = 0
    
           F(  1,    71) =    0.36
                Prob > F =    0.5514
    
    . local sign_car = sign(_b[weight]-_b[mpg])
    
    . display "H_0: weight coef >= mpg coef. p-value = " normal(`sign_car'*sqrt(r(F)))
    H_0: weight coef >= mpg coef. p-value = .72526132
    Last edited by Carlo Lazzaro; 05 May 2016, 08:04.
    Kind regards,
    Carlo
    (StataNow 18.5)
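Since regress reports t rather than z statistics, a variant of the last two lines that uses the t distribution with the residual degrees of freedom may be worth comparing. What follows is only a minimal sketch, assuming the auto data shipped with Stata (loaded with sysuse instead of the full path above); the scalar names t_diff, df_resid, p_le and p_ge are illustrative and do not come from the thread.

```stata
* Load the example data shipped with Stata
sysuse auto, clear
regress price weight mpg

* Two-sided Wald test of weight - mpg = 0; -test- leaves r(F), r(df_r) and r(p) behind
test weight - mpg = 0

* Signed t statistic: sqrt(F) gives the magnitude, the estimated difference gives the sign
scalar t_diff   = sign(_b[weight] - _b[mpg]) * sqrt(r(F))
scalar df_resid = r(df_r)

* ttail(df, t) is the upper-tail probability P(T > t) of Student's t
* p_le is the p-value for H_0: weight coef <= mpg coef
scalar p_le = ttail(df_resid, t_diff)
* p_ge is the p-value for H_0: weight coef >= mpg coef
scalar p_ge = 1 - ttail(df_resid, t_diff)

display "H_0: weight coef <= mpg coef. p-value = " p_le
display "H_0: weight coef >= mpg coef. p-value = " p_ge
```

Because the two-sided p-value shown above is 0.5514 and the estimated difference is positive, these two numbers should come out near 0.5514/2 (about 0.276) and 1 - 0.276 (about 0.724), which is close to the normal-based .725 because 71 degrees of freedom is already large.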



    • #3
      Hello everyone,
      I have a question related to the discussion above.

      I ran the following pooled OLS regression using Stata 14.2:
      Code:
      regress Y L.nopattm L.onlytm L.onlypat L.pattm L.X1 L.X2 i.X4 X5 X6 X7 i.year, noconstant vce(robust)
      where nopattm, onlypat, onlytm and pattm are dummy variables identifying mutually exclusive combinations of patents and trademarks, Y is a measure of firm performance, and the Xs are control variables.

      I need to test whether performance is higher when the firm uses IP rights protection. I would therefore like to test the following hypotheses (where b1, b2, b3 and b4 are the estimated coefficients on nopattm, onlytm, onlypat and pattm, respectively):
      1) H0: b4>b1

      2) H0: b2>b1

      3) H0: b3>b1

      I would also like to test whether performance is higher when the firm uses both patents and trademarks rather than only one type of IP right:
      4) H0: b4>b2

      5) H0: b4>b3

      If I have understood correctly, I should follow Carlo Lazzaro's suggestion in order to perform one-sided t tests. For example, I should use the following code to test hypothesis 4:

      Code:
       
      test L.onlytm L.pattm
       
       ( 1)  L.onlytm = 0
       ( 2)  L.pattm_d = 0
       
             F(  2,686216) =45442.44
                  Prob > F =    0.0000
       
      . test L.onlytm -L.pattm=0
       
       ( 1)  L.onlytm - L.pattm_d = 0
       
             F(  1,686216) =    7.96
                  Prob > F =    0.0048
       
      . local sign_car = sign(_b[L.pattm]-_b[L.onlytm])
       
      . display "H_0: PAT TM coef >=  TM coef. p-value = " normal(`sign_car'*sqrt(r(F)))
      H_0: PAT TM coef >=  TM coef. p-value = .99760379

      The tests suggest that b4 and b2 are not jointly equal to zero, that they are not equal to each other, and that b4>b2.
      Is this the proper interpretation?


      Furthermore, I would like to test whether adding an activity (e.g. pat) while the other activity (e.g. tm) is already being performed has a higher incremental effect on performance than adding that activity (pat) in isolation. Thus I need to test the following hypothesis:

      6) b4-b2>b3-b1

      Can I follow the same approach as above and use the following code to test this hypothesis?

      Code:
      test L.nopattm L.onlytm L.onlypat L.pattm
       
       ( 1)  L.nopattm = 0
       ( 2)  L.onlytm = 0
       ( 3)  L.onlypat = 0
       ( 4)  L.pattm_d = 0
       
             F(  4,686216) =25374.94
                  Prob > F =    0.0000
       
      .
      . test  L.nopattm- L.onlytm- L.onlypat- L.pattm=0
       
       ( 1)  L.nopattm - L.onlytm - L.onlypat - L.pattm_d = 0
       
             F(  1,686216) =82696.42
                  Prob > F =    0.0000
       
      .
      . local sign_ip = sign(_b[L.pattm]-_b[L.onlytm]-_b[L.onlypat]+_b[L.nopattm])
       
      .
      . display "H_0: pattm-onytm >onlypat-nopattm p-value = " normal(`sign_ip'*sqrt(r(F)))
      H_0: pattm-onytm >onlypat-nopattm p-value = 0
      Is it still a one-sided t test?
      The four coefficients are not jointly equal to zero, the linear combination is not equal to zero, and the test rejects H0: b4-b2>b3-b1. Is this interpretation right?
      How can I obtain the value of the t statistic?

      Thank you all in advance for your help.

      Chiara
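On the t statistic and on hypothesis 6: rearranging b4-b2>b3-b1 gives b4-b2-b3+b1>0, whereas the test command as typed above checks b1-b2-b3-b4=0, which is a different linear combination. One possible route, sketched here under assumptions, is lincom, which reports the point estimate, standard error and t statistic of a linear combination directly. The example runs on the auto data shipped with Stata as a stand-in for the poster's data; the regressors and the scalar names t_combo and df_combo are purely illustrative.

```stata
* Toy stand-in for the regression in the post, using the auto data shipped with Stata
sysuse auto, clear
regress price weight mpg length turn, vce(robust)

* Estimate, standard error and t statistic of (weight - mpg) - (length - turn)
lincom weight - mpg - length + turn

* -lincom- leaves r(estimate), r(se) and r(df) behind after -regress-
scalar t_combo  = r(estimate) / r(se)
scalar df_combo = r(df)

* One-sided p-values from the signed t statistic
display "t statistic = " t_combo
display "H_0: combination <= 0. p-value = " ttail(df_combo, t_combo)
display "H_0: combination >= 0. p-value = " 1 - ttail(df_combo, t_combo)

* With the variables from the post, the analogous call would presumably be
* (untested here, since it needs the poster's data):
*   lincom L.pattm - L.onlytm - L.onlypat + L.nopattm
```

With 686216 residual degrees of freedom, as in the output above, the t-based and normal-based p-values should be practically identical.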



      • #4
        For the benefit of other Statalisters:

        Note that Carlo Lazzaro uses the F statistic (normal(`sign_ip'*sqrt(r(F)))) rather than the chi-squared statistic (normal(`sign_ag'*sqrt(r(chi2)))) suggested in the link he shared. If your estimation command makes test return r(chi2) rather than r(F), using r(F) produces a meaningless result such as "p-value = ." and can be confusing.
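To illustrate the point, here is a small sketch, assuming the auto data shipped with Stata and a plain logit model chosen only because it is a maximum likelihood estimator (it is not a model from this thread). After such a command, test stores r(chi2) and leaves r(F) undefined, so the r(F) version evaluates to missing. The local name sgn is illustrative.

```stata
* Maximum likelihood example: -test- reports chi2, not F
sysuse auto, clear
logit foreign weight mpg, nolog
test weight - mpg = 0

* r(F) does not exist here and evaluates to missing; r(chi2) does exist
display "r(F) = " r(F) "   r(chi2) = " r(chi2)

local sgn = sign(_b[weight] - _b[mpg])

* Using r(F) silently yields a missing p-value, displayed as "."
display "using r(F):    p-value = " normal(`sgn'*sqrt(r(F)))

* Using r(chi2) gives the one-sided p-value for H_0: weight coef >= mpg coef
display "using r(chi2): p-value = " normal(`sgn'*sqrt(r(chi2)))
```

After regress the situation is reversed: test stores r(F) and r(df_r), and r(chi2) is the one that is missing.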



        • #5
          I think that Bora's comment is correct when t-statistics are replaced by z-statistics (http://www.stata.com/support/faqs/st...-coefficients/):
          Code:
          . webuse union, clear
          (NLS Women 14-24 in 1968)
          
          . xtset id
                 panel variable:  idcode (unbalanced)
          
          . xtlogit union age grade, nolog
          
          Random-effects logistic regression              Number of obs     =     26,200
          Group variable: idcode                          Number of groups  =      4,434
          
          Random effects u_i ~ Gaussian                   Obs per group:
                                                                        min =          1
                                                                        avg =        5.9
                                                                        max =         12
          
          Integration method: mvaghermite                 Integration pts.  =         12
          
                                                          Wald chi2(2)      =      69.07
          Log likelihood  = -10623.006                    Prob > chi2       =     0.0000
          
          ------------------------------------------------------------------------------
                 union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |   .0149252   .0036634     4.07   0.000      .007745    .0221055
                 grade |   .1130089   .0177088     6.38   0.000     .0783002    .1477175
                 _cons |  -4.313313   .2426307   -17.78   0.000    -4.788861   -3.837766
          -------------+----------------------------------------------------------------
              /lnsig2u |   1.800793    .046732                        1.7092    1.892386
          -------------+----------------------------------------------------------------
               sigma_u |   2.460578   .0574939                      2.350434    2.575884
                   rho |   .6479283   .0106604                      .6267624    .6685287
          ------------------------------------------------------------------------------
          LR test of rho=0: chibar2(01) = 6343.94                Prob >= chibar2 = 0.000
          
          . test grade
          
           ( 1)  [union]grade = 0
          
                     chi2(  1) =   40.72
                   Prob > chi2 =    0.0000
          
          . local sign_grade = sign(_b[grade])
          
          . display "H_0: coef<=0  p-value = " 1-normal(`sign_grade'*sqrt(r(chi2)))
          H_0: coef<=0  p-value = 8.768e-11
          
          . display "H_0: coef>=0  p-value = " normal(`sign_grade'*sqrt(r(chi2)))
          H_0: coef>=0  p-value = 1
          
          . test age-grade = 0
          
           ( 1)  [union]age - [union]grade = 0
          
                     chi2(  1) =   27.44
                   Prob > chi2 =    0.0000
          
          . local sign_ag = sign(_b[age]-_b[grade])
          
          . display "H_0: age coef >= grade coef. p-value = " normal(`sign_ag'*sqrt(r(chi2)))
          H_0: age coef >= grade coef. p-value = 8.112e-08
          
          .
          Kind regards,
          Carlo
          (StataNow 18.5)



          • #6
            I don't know if it's possible to revive this old discussion of one-sided tests of inequality between coefficients. I want to use the approach described here, but I'm not sure why the test is framed with the focal inequality as the null hypothesis. As a result, the p-value is coded the opposite of what I'm used to: I would have expected 1 minus the number reported here. Is there a good reason for this? Thank you!
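One way to see how the two codings relate, offered as a sketch and not as a definitive answer: each display line labels the null being tested, and the p-value for the opposite null is simply 1 minus it. If the inequality you want to support is placed in the alternative, as is more usual, you would read the complementary number. The example below reuses the figures reported in post #5 (chi2 = 27.44, with a negative estimated age - grade difference); the scalar names are illustrative.

```stata
* chi2 = 27.44 and the negative sign of (age - grade) are taken from post #5
scalar z_ag = -sqrt(27.44)

* Lower tail: p-value for H_0: age coef >= grade coef (alternative: age < grade)
scalar p_null_ge = normal(z_ag)

* Upper tail: p-value for H_0: age coef <= grade coef (alternative: age > grade)
scalar p_null_le = 1 - normal(z_ag)

display "H_0: age coef >= grade coef. p-value = " p_null_ge
display "H_0: age coef <= grade coef. p-value = " p_null_le
```

The first number should reproduce the tiny p-value shown in post #5, which rejects "age coef >= grade coef" in favour of age < grade; the second is its complement and gives no evidence against "age coef <= grade coef".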
