Exploring non-linear relationships

April Kimm

Join Date: Mar 2021

Posts: 45
#1

06 Feb 2022, 10:49

Hello, my DVs are binary variables and I use firthlogit because (1) the events are rare (the outcomes are mostly zeros) and (2) there is separation in the data. In particular, I am interested in exploring the temporal patterns of the mode of warfare. My hypotheses are:

(1) Insurgents are more likely to engage in terrorism in the initial phase of a conflict.
(2) Insurgents are most likely to employ guerrilla warfare during the interim phase of a conflict (an inverted U-shaped relationship).
(3) Insurgents are more likely to conduct conventional warfare during later phases of a conflict.

I have the following questions.
1. For Hypothesis (3), I posited a positive linear relationship, but "UTEST" in Stata strongly indicates an inverted U-shaped relationship. Also, when I include the squared term of conflict duration, its coefficient is negative and significant, which also suggests an inverted U-shaped relationship. In this case, should I include the squared term regardless of my theoretical expectation (a positive linear relationship)?

2. I estimated the models using firthlogit but would like to explore further the non-linear relationships between conflict duration and the types of warfare. For example, my questions are: (1) Is the pattern a smooth change over time? (2) Is it punctuated, or non-linear in some other way? What kinds of statistical methods would you recommend in this case?

Thank you very much for your help in advance!

Last edited by April Kimm; 06 Feb 2022, 11:28.

Clyde Schechter

Join Date: Apr 2014
Posts: 30100

06 Feb 2022, 11:18

I don't know much about GAMs so I'm only responding here to question 1.

You should not rely on -utest-, nor on a significant coefficient of the quadratic term in the model, to conclude that you have a U-shaped (upright or inverted) relationship.

Code:

. clear

. set obs 100
Number of observations (_N) was 0, now 100.

. gen x = _n

. gen xsq = x*x

. gen y = log(x)

.
. regress y x xsq

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(2, 97)        =    673.39
       Model |  79.5383501         2   39.769175   Prob > F        =    0.0000
    Residual |  5.72863857        97   .05905813   R-squared       =    0.9328
-------------+----------------------------------   Adj R-squared   =    0.9314
       Total |  85.2669886        99  .861282713   Root MSE        =    .24302

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |   .0738282   .0033998    21.72   0.000     .0670806    .0805758
         xsq |  -.0004472   .0000326   -13.71   0.000     -.000512   -.0003825
       _cons |   1.422281   .0743884    19.12   0.000      1.27464    1.569921
------------------------------------------------------------------------------

.
. utest x xsq

Specification: f(x)=x^2
Extreme point:  82.53892

Test:
     H1: Inverse U shape
 vs. H0: Monotone or U shape

-------------------------------------------------
                 |   Lower bound      Upper bound
-----------------+-------------------------------
Interval         |           1              100
Slope            |    .0729338        -.0156183
t-value          |    21.85871        -4.680915
P>|t|            |    1.47e-39         4.64e-06
-------------------------------------------------

Overall test of presence of a Inverse U shape:
     t-value =      4.68
     P>|t|   =  4.64e-06

This example illustrates how a relationship that is clearly not U-shaped can be approximated well enough by a quadratic (which is picking up on the curvature) to pass these checks, even though the underlying relationship never reaches a turning point and is an altogether different function.

Moreover, there really aren't very many things in the real world that are truly related according to a quadratic function. So quadratic regressions are usually a proxy for some type of non-linearity.

What you can say is that you have a non-linear relationship. That it is specifically inverted U-shaped is a leap too far based solely on what you have described. I suggest a -lowess- plot of the relationship to see if it looks like an inverted U or is just a curvilinear relationship that is getting flatter to the right. More quantitative "testing" for the presence of a U-shaped relationship would involve fitting separate linear regressions (with no quadratic term) on the lower and upper ends of the data to see if the signs of the coefficients are different. For an inverted U, the sign should be positive at the lower end and negative at the upper end. (Deciding how far in from the minimum and maximum values the lower and upper ends should extend is a bit tricky, as the sample size might be small at one end or the other, making it hard to pin down the local slope.)
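The two checks suggested above can be sketched on the same toy data as in the earlier example. This is only an illustration: the cutoffs 25 and 75 are arbitrary assumptions, and the variable names in the commented-out last line (dv, duration) are hypothetical stand-ins for a binary outcome and conflict duration.

```stata
* Same toy data as above: y = log(x) rises everywhere, no turning point
clear
set obs 100
gen x = _n
gen y = log(x)

* Nonparametric smooth of y on x; imposes no functional form
lowess y x

* Separate linear fits at the two ends of the range of x
* (the cutoffs 25 and 75 are arbitrary choices for illustration)
regress y x if x <= 25
regress y x if x >= 75

* With a binary outcome, the logit option of -lowess- plots the smooth
* on the log-odds scale (dv and duration are hypothetical names):
* lowess dv duration, logit
```

Because log(x) is strictly increasing, both slopes come out positive here, which is what separates a relationship that merely flattens from an inverted U, where the upper-end slope would be negative.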

April Kimm

Join Date: Mar 2021

Posts: 45
#3

06 Feb 2022, 13:19

Thank you for your answer. It is very clear to me. I appreciate it!

Could anyone else answer Question 2?
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#4

06 Feb 2022, 14:48

If you have a specific point at which you expect a sudden change, and you want to test for it, then you can look at this Stata tip: http://maartenbuis.nl/publications/leaps.html
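As a rough sketch of the general idea (not a reproduction of the linked tip), one common way to test for a sudden change at a pre-specified point is to let both the level and the slope differ on either side of it. Everything below is an assumption made for illustration: the data are simulated, the cut point of 50 is hypothetical, and -regress- is used only to keep the example simple.

```stata
* Simulated data with a built-in jump at a hypothetical cut point of 50
clear
set seed 12345
set obs 200
gen x = _n/2

local c = 50
gen byte post = x >= `c'      // 1 at or after the cut point, 0 before
gen xc = x - `c'              // x centered at the cut point
gen xc_post = xc*post         // allows the slope to change after the cut

* True jump of 0.5 and true slope change of 0.01, plus noise
gen y = 1 + 0.02*xc + 0.5*post + 0.01*xc_post + rnormal(0, 0.2)

* Coefficient on post    = size of the jump at the cut point
* Coefficient on xc_post = change in slope after the cut point
regress y xc post xc_post
test post
```

With a binary dependent variable, the same three right-hand-side variables could be entered in a logit-type model instead, in which case the coefficients are on the log-odds scale.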

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
April Kimm

Join Date: Mar 2021

Posts: 45
#5

06 Feb 2022, 23:01

Thank you very much. I will look at your article.
