Hypothesis testing: one coefficient larger than the other

Camilla Andersson

Join Date: May 2016

Posts: 1
#1

05 May 2016, 07:13

Hello everyone,

I need some help with hypothesis testing in Stata. I need to test whether one coefficient in my pooled OLS regression is larger than another coefficient,
i.e. whether B₁ > B₂.

I know how to perform F- and t-tests of the hypothesis that the two coefficients are equal (B₁ = B₂), i.e. that their difference is zero (B₁ − B₂ = 0). But I need to know how to test whether the difference between them is above zero (meaning that B₁ > B₂), i.e. whether one coefficient is larger than the other.
How do I set up this test?

Appreciate your help,
Camilla

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17647

05 May 2016, 07:48

Camilla:
welcome to the list.
You may be interested in this (quite recently updated) Stata FAQ: http://www.stata.com/support/faqs/st...-coefficients/, which inspired the following toy example:

Code:

. use "C:\Program Files (x86)\Stata14\ado\base\a\auto.dta", clear
(1978 Automobile Data)

. reg price weight mpg

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.74
       Model |   186321280         2  93160639.9   Prob > F        =    0.0000
    Residual |   448744116        71  6320339.67   R-squared       =    0.2934
-------------+----------------------------------   Adj R-squared   =    0.2735
       Total |   635065396        73  8699525.97   Root MSE        =      2514

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   1.746559   .6413538     2.72   0.008      .467736    3.025382
         mpg |  -49.51222   86.15604    -0.57   0.567    -221.3025     122.278
       _cons |   1946.069    3597.05     0.54   0.590    -5226.245    9118.382
------------------------------------------------------------------------------

. test weigh mpg

 ( 1)  weight = 0
 ( 2)  mpg = 0

       F(  2,    71) =   14.74
            Prob > F =    0.0000

. test weight-mpg=0

 ( 1)  weight - mpg = 0

       F(  1,    71) =    0.36
            Prob > F =    0.5514

. local sign_car = sign(_b[weight]-_b[mpg])

. display "H_0: weight coef >= mpg coef. p-value = " normal(`sign_car'*sqrt(r(F)))
H_0: weight coef >= mpg coef. p-value = .72526132
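
After a linear regression, the same sign trick can be paired with the t distribution rather than the normal approximation, which matters more when the residual degrees of freedom are few. A minimal sketch, assuming the auto dataset shipped with Stata and the r(df_r) result that test leaves behind after regress (the local name sign_wm is only illustrative):

```stata
sysuse auto, clear
regress price weight mpg

* Wald test of equality; after regress, test stores r(F) and r(df_r)
test weight - mpg = 0
local sign_wm = sign(_b[weight] - _b[mpg])

* the signed t statistic is sign * sqrt(F); ttail() returns the upper-tail area
display "H_0: weight coef >= mpg coef. p-value = " 1 - ttail(r(df_r), `sign_wm'*sqrt(r(F)))
display "H_0: weight coef <= mpg coef. p-value = " ttail(r(df_r), `sign_wm'*sqrt(r(F)))
```

With 71 residual degrees of freedom the t-based and normal-based p-values should differ only slightly.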

Last edited by Carlo Lazzaro; 05 May 2016, 08:04.

Kind regards,
Carlo
(StataNow 18.5)

chiara piccardo

Join Date: Mar 2015

Posts: 15
#3

12 Apr 2019, 06:28

Hello everyone,
I have a question that is related to the previous topic.

I ran the following pooled OLS regression using Stata 14.2:

Code:

regress Y L.nopattm L.onlytm L.onlypat L.pattm L.X1 L.X2 i.X4 X5 X6 X7 i.year, noconstant vce(robust)

where nopattm, onlypat, onlytm and pattm are dummy variables identifying mutually exclusive combinations of patents and trademarks; Y is a measure of firm performance and the Xs are control variables.

I need to test whether performance is higher if the firm uses IP right protection. Thus, I would like to test the following hypotheses (where b1, b2, b3 and b4 are the estimated coefficients on the four dummy variables):
1) H0: b4>b1

2) H0: b2>b1

3) H0: b3>b1

I would also like to test whether performance is higher if the firm chooses to use both patents and trademarks rather than only one type of IP right:
4) H0: b4>b2

5) H0: b4>b3

If I have understood correctly, I should follow the suggestion posted by Carlo Lazzaro to perform one-sided t tests. For example, I should use the following code to test hypothesis number 4:

Code:

. test L.onlytm L.pattm

 ( 1)  L.onlytm = 0
 ( 2)  L.pattm_d = 0

       F(  2,686216) =45442.44
            Prob > F =    0.0000

. test L.onlytm -L.pattm=0

 ( 1)  L.onlytm - L.pattm_d = 0

       F(  1,686216) =    7.96
            Prob > F =    0.0048

. local sign_car = sign(_b[L.pattm]-_b[L.onlytm])

. display "H_0: PAT TM coef >= TM coef. p-value = " normal(`sign_car'*sqrt(r(F)))
H_0: PAT TM coef >= TM coef. p-value = .99760379

The tests suggest that b4 and b2 are not jointly equal to zero, that they are not equal to each other, and that b4>b2.
Is this the proper interpretation?

Furthermore, I would like to test whether adding an activity (i.e. pat) while the other activity (i.e. tm) is already being performed has a higher incremental effect on performance than adding the activity (pat) in isolation. Thus I need to test the following hypothesis:

6) b4-b2>b3-b1

Can I follow the same approach as above and use the following code to test this hypothesis?

Code:

. test L.nopattm L.onlytm L.onlypat L.pattm

 ( 1)  L.nopattm = 0
 ( 2)  L.onlytm = 0
 ( 3)  L.onlypat = 0
 ( 4)  L.pattm_d = 0

       F(  4,686216) =25374.94
            Prob > F =    0.0000

. test L.nopattm- L.onlytm- L.onlypat- L.pattm=0

 ( 1)  L.nopattm - L.onlytm - L.onlypat - L.pattm_d = 0

       F(  1,686216) =82696.42
            Prob > F =    0.0000

. local sign_ip = sign(_b[L.pattm]-_b[L.onlytm]-_b[L.onlypat]+_b[L.nopattm])

. display "H_0: pattm-onytm >onlypat-nopattm p-value = " normal(`sign_ip'*sqrt(r(F)))
H_0: pattm-onytm >onlypat-nopattm p-value = 0

Is it still a one-sided t test?
The four coefficients are not jointly equal to zero and are not equal to each other, and the test rejects H0: b4-b2>b3-b1. Is this interpretation right?
How can I obtain the value of the t statistic?

Thank you all in advance for your help.

Chiara
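
On obtaining the t statistic directly, one option is lincom, which reports the estimate, standard error and t statistic for a linear combination of coefficients. The sketch below is only an illustration on the auto dataset: the dummies d1 to d4 are made-up stand-ins for nopattm, onlytm, onlypat and pattm, and the model is not the regression from this post. For hypothesis 6 the relevant combination is b4-b2-b3+b1, whereas the test line above, as written, combines the coefficients as b1-b2-b3-b4.

```stata
sysuse auto, clear

* hypothetical stand-ins for four mutually exclusive groups
generate byte hi = mpg > 20
generate byte d1 = !foreign & !hi
generate byte d2 =  foreign & !hi
generate byte d3 = !foreign &  hi
generate byte d4 =  foreign &  hi

regress price d1 d2 d3 d4 weight, noconstant vce(robust)

* b4 - b2 - b3 + b1, i.e. (b4 - b2) - (b3 - b1)
lincom d4 - d2 - d3 + d1

* lincom leaves r(estimate), r(se) and r(df) behind
local t = r(estimate)/r(se)
display "t = " `t'
display "H0: (b4-b2) <= (b3-b1). p-value = " ttail(r(df), `t')
display "H0: (b4-b2) >= (b3-b1). p-value = " 1 - ttail(r(df), `t')
```

Because lincom works with the signed estimate, no separate sign() step is needed.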
Bora Sinci

Join Date: Jul 2019
Posts: 7

06 Aug 2019, 13:31

For the benefit of other Statalisters:

Be aware that Carlo Lazzaro's code uses the F statistic (normal(`sign_ip'*sqrt(r(F)))) rather than the chi-squared statistic (normal(`sign_ag'*sqrt(r(chi2)))) suggested in the link he shared. If you use r(F) where test reports a chi-squared statistic, you can get meaningless p-values such as "p-value = ." and confuse yourself.
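
To make this concrete: which result test leaves behind depends on the estimation command. After commands that report z statistics it stores r(chi2) and no r(F), so sqrt(r(F)) evaluates to missing. A minimal sketch, assuming the auto dataset and a logit model chosen only for illustration (the local name sign_wm is hypothetical):

```stata
sysuse auto, clear
logit foreign weight mpg, nolog

test weight - mpg = 0
local sign_wm = sign(_b[weight] - _b[mpg])

* r(F) does not exist after this test, so it evaluates to missing (.)
display "r(F)    = " r(F)
display "r(chi2) = " r(chi2)

* use whichever statistic test actually returned
if missing(r(F)) {
    display "H_0: weight coef >= mpg coef. p-value = " normal(`sign_wm'*sqrt(r(chi2)))
}
else {
    display "H_0: weight coef >= mpg coef. p-value = " 1 - ttail(r(df_r), `sign_wm'*sqrt(r(F)))
}
```

The else branch covers commands such as regress, where test returns r(F) together with r(df_r).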

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17647

07 Aug 2019, 01:17

I think that Bora's comment is correct when t-statistics are replaced by z-statistics (http://www.stata.com/support/faqs/st...-coefficients/):

Code:

. webuse union, clear
(NLS Women 14-24 in 1968)

. xtset id
       panel variable:  idcode (unbalanced)

. xtlogit union age grade, nolog

Random-effects logistic regression              Number of obs     =     26,200
Group variable: idcode                          Number of groups  =      4,434

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =        5.9
                                                              max =         12

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(2)      =      69.07
Log likelihood  = -10623.006                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0149252   .0036634     4.07   0.000      .007745    .0221055
       grade |   .1130089   .0177088     6.38   0.000     .0783002    .1477175
       _cons |  -4.313313   .2426307   -17.78   0.000    -4.788861   -3.837766
-------------+----------------------------------------------------------------
    /lnsig2u |   1.800793    .046732                        1.7092    1.892386
-------------+----------------------------------------------------------------
     sigma_u |   2.460578   .0574939                      2.350434    2.575884
         rho |   .6479283   .0106604                      .6267624    .6685287
------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 6343.94                Prob >= chibar2 = 0.000

. test grade

 ( 1)  [union]grade = 0

           chi2(  1) =   40.72
         Prob > chi2 =    0.0000

. local sign_grade = sign(_b[grade])

. display "H_0: coef<=0  p-value = " 1-normal(`sign_grade'*sqrt(r(chi2)))
H_0: coef<=0  p-value = 8.768e-11

. display "H_0: coef>=0  p-value = " normal(`sign_grade'*sqrt(r(chi2)))
H_0: coef>=0  p-value = 1

. test age-grade = 0

 ( 1)  [union]age - [union]grade = 0

           chi2(  1) =   27.44
         Prob > chi2 =    0.0000

. local sign_ag = sign(_b[age]-_b[grade])

. display "H_0: age coef >= grade coef. p-value = " normal(`sign_ag'*sqrt(r(chi2)))
H_0: age coef >= grade coef. p-value = 8.112e-08

.

Kind regards,
Carlo
(StataNow 18.5)

Charles Williams

Join Date: Jun 2018

Posts: 2
#6

02 Sep 2021, 11:44

I don't know if it's possible to revive this old discussion of one-sided tests of inequality between coefficients. I want to use the approach described here, but I'm not sure why the test is framed with the focal inequality as the null hypothesis. As a result, the p-value is coded the opposite of what I'm used to: I would have expected 1 minus the number reported here. Is there a good reason for this? Thank you!
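
The relationship between the two one-sided p-values can be written out as a sketch for the large-sample (z) case used in the xtlogit example above. Here z is the signed statistic built in the code, and Φ is the standard normal CDF that normal() computes; the β symbols are notation introduced only for this illustration.

```latex
\[ z = \mathrm{sign}(\hat\beta_1 - \hat\beta_2)\,\sqrt{\chi^2} \]
\[ H_0\colon \beta_1 - \beta_2 \ge 0 \quad\text{vs.}\quad H_a\colon \beta_1 - \beta_2 < 0 : \qquad p = \Phi(z) \]
\[ H_0\colon \beta_1 - \beta_2 \le 0 \quad\text{vs.}\quad H_a\colon \beta_1 - \beta_2 > 0 : \qquad p = 1 - \Phi(z) \]
```

The two p-values sum to one, so 1 minus the displayed number is the p-value for the test whose alternative is that the first coefficient is larger.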