  • Different results when using splines vs interaction variables

    Hi. I am attempting to manually run a single-group interrupted time series analysis (ITSA) for my study, which has two interventions, at weeks 78 and 107 (the built-in Stata command doesn't work for my purpose). I tried two methods (from here: https://rpubs.com/mbounthavong/itsa_stata):
    1. using splines (mkspline) to generate new variables that account for the time elapsed after each intervention
    2. the more conventional method using interaction variables
    Both methods should yield the same results. However, as the Stata output below shows, the results for the "period" variable differ, while the results for the other variables are the same. Could someone please help me understand why there is this difference?


    Code:
    *1. Using splines
    
     *creating knots
    mkspline knot1 78 knot2 107 knot3 = week_num, marginal
    
    
    // Create a variable to separate the treatment periods
    gen period = .
        replace period = 0 if week_num < 78                                      /* Baseline phase */ 
        replace period = 1 if week_num >= 78 & week_num<107 & !missing(week_num) /* Mandate phase */
        replace period = 2 if week_num >= 107 & !missing(week_num)                 /* Platform phase */
    tab period, m 
    
    
    
    xtset con_id week_num
    xtreg week_docalltele i.period c.knot1 c.knot2 c.knot3 
    predict linear_spline
    
    
    *Output 1
    
    ------------------------------------------------------------------------------
    week_docal~e | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          period |
              1  |   1.379151   .2534505     5.44   0.000     .8823975    1.875905
              2  |   .8693821   .4655648     1.87   0.062    -.0431081    1.781872
                 |
           knot1 |  -.0583853   .0030684   -19.03   0.000    -.0643993   -.0523713
           knot2 |    .053078   .0136945     3.88   0.000     .0262372    .0799188
           knot3 |   .0147004   .0188625     0.78   0.436    -.0222693    .0516702
           _cons |    7.62364   .3468154    21.98   0.000     6.943895    8.303386
    -------------+----------------------------------------------------------------
         sigma_u |   2.663207
         sigma_e |  3.3603617
             rho |  .38579202   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . predict linear_spline
    (option xb assumed; fitted values)
    
    
    
    *2. Using interaction variables
    * (period and linear_spline are created again below, so start a fresh session or drop them first)
    
    
    // Create a variable to separate the treatment periods
    gen period = .
        replace period = 0 if week_num < 78                                      /* Baseline phase */ 
        replace period = 1 if week_num >= 78 & week_num<107 & !missing(week_num) /* Mandate phase */
        replace period = 2 if week_num >= 107 & !missing(week_num)                 /* Platform phase */
    tab period, m 
    
    
    xtset con_id week_num
    xtreg week_docalltele i.period c.week_num i.period#c.week_num
    predict linear_spline
    
    
    *Output 2
    
    -----------------------------------------------------------------------------------
      week_docalltele | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    ------------------+----------------------------------------------------------------
               period |
                   1  |  -2.760932   1.231048    -2.24   0.025    -5.173741   -.3481229
                   2  |  -4.843649   1.624719    -2.98   0.003    -8.028039   -1.659259
                      |
             week_num |  -.0583853   .0030684   -19.03   0.000    -.0643993   -.0523713
                      |
    period#c.week_num |
                   1  |    .053078   .0136945     3.88   0.000     .0262372    .0799188
                   2  |   .0677784   .0136487     4.97   0.000     .0410274    .0945295
                      |
                _cons |    7.62364   .3468154    21.98   0.000     6.943895    8.303386
    ------------------+----------------------------------------------------------------
              sigma_u |   2.663207
              sigma_e |  3.3603617
                  rho |  .38579202   (fraction of variance due to u_i)
    -----------------------------------------------------------------------------------
    
    . predict linear_spline
    
    
    ***Data sample
    
    
    input float week_num long con_id float week_docalltele
    1  4 12
    1 11  5
    1 12 10
    1 13  2
    1 17  8
    1 22  7
    1 23 19
    1 24  4
    1 25  3
    1 26  3
    1 30  7
    1 31  7
    1 33  2
    1 37 12
    1 38 11
    1 40 16
    1 41 14
    1 42  1
    1 43  3
    1 45  8
    1 46  2
    1 51  4
    1 54 22
    1 55  2
    1 56  9
    1 57 17
    1 58 21
    1 59 20
    1 61  2
    1 63  6
    1 65  2
    1 67  9
    1 69  8
    1 71 24
    2 12 17
    2 13  6
    2 16 12
    2 17 12
    2 23 14
    2 25  2
    2 26  5
    2 28  2
    2 31  7
    2 32  3
    2 33  1
    2 35 10
    2 37 14
    2 38  6
    2 39  5
    2 41  1
    2 42  6
    2 43  6
    2 45 12
    2 46  2
    2 48  1
    2 49  9
    2 50  4
    2 51  6
    2 54 19
    2 55  1
    2 56 10
    2 57 20
    2 58 25
    2 59 16
    2 61  8
    2 62  6
    2 65  2
    2 67 15
    2 69  6
    2 70 15
    2 71 17
    3  4  6
    3 11 15
    3 12 15
    3 16 10
    3 17  9
    3 22 14
    3 24 14
    3 25  2
    3 27 10
    3 28  3
    3 29  3
    3 31 12
    3 33  5
    3 37  4
    3 38 10
    3 39 10
    3 40 20
    3 41 14
    3 42  5
    3 43  7
    3 45 21
    3 46  1
    3 48  1
    3 49  5
    3 50  5
    3 51  8
    3 56  1
    3 57 27
    3 58 19
    end

  • #2
    The difference arises because the period coefficients mean different things in these two representations of the model. The two models are algebraically equivalent: they produce exactly the same predicted values.

    In the interaction model, each period coefficient is the difference between that period's expected outcome and the baseline period's, with both lines extrapolated to week_num == 0. Since week_num is never 0, even in the first period, this is a hypothetical, or, you could say, meaningless output.

    In the spline model, the period coefficients capture the "jumps" in the expected value at the knot points: 1.period is the jump at week 78, and 2.period is the combined shift from the jumps at weeks 78 and 107.

    So there is no reason to expect these to come out the same.
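
    To make this concrete, the two parameterizations can be written out period by period. This is only a sketch: it assumes that mkspline with the marginal option sets knot1 = week_num, knot2 = max(0, week_num - 78) and knot3 = max(0, week_num - 107). The symbols below (t for week_num, P_1 and P_2 for the period indicators, a, b, c for the interaction-model coefficients, p and k for the spline-model coefficients, and (x)_+ for max(0, x)) are notation introduced here for illustration. The numbers are the rounded coefficients from Output 1 and Output 2 in #1.

    ```latex
    \begin{align*}
    \text{Interaction:}\quad E[y] &= a_0 + a_1 P_1 + a_2 P_2 + b\,t + c_1 P_1 t + c_2 P_2 t \\
    \text{Spline:}\quad E[y] &= a_0 + p_1 P_1 + p_2 P_2 + k_1 t + k_2 (t-78)_+ + k_3 (t-107)_+ \\[6pt]
    % Equate the constant terms within each period
    % (the slopes already match: b + c_1 = k_1 + k_2 and b + c_2 = k_1 + k_2 + k_3)
    78 \le t < 107:\quad a_1 &= p_1 - 78\,k_2 \\
    &\approx 1.379151 - 78(0.053078) = -2.760933 \\
    t \ge 107:\quad a_2 &= p_2 - 78\,k_2 - 107\,k_3 \\
    &\approx 0.8693821 - 4.140084 - 1.572943 = -4.843645 \\[6pt]
    \text{Jump at } t = 78:\quad & p_1 = 1.379151 \\
    \text{Jump at } t = 107:\quad & p_2 - p_1 = 0.8693821 - 1.379151 = -0.5097689
    \end{align*}
    ```

    Up to rounding, these reproduce the period coefficients in Output 2 (-2.760932 and -4.843649). The interaction-model period coefficients are the spline-model shifts carried back to week_num = 0 along the changed slopes, which is why their size and sign look so different even though the fitted values are identical.
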
    Last edited by Clyde Schechter; 22 Nov 2024, 13:44.



    • #3
      In the second model, (2.period#c.week_num - 1.period#c.week_num) equals the knot3 coefficient from the first. The others match up.
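
      A short check of why this holds, sketched under the assumption that the marginal option makes each knot coefficient the change in slope at its knot. The symbols (b, c_1, c_2 for week_num and its two period interactions; k_1, k_2, k_3 for knot1 to knot3) are notation introduced here, and the numbers are the rounded coefficients from the outputs in #1.

      ```latex
      \begin{align*}
      \text{slope, period 0:}\quad b &= k_1 = -0.0583853 \\
      \text{slope, period 1:}\quad b + c_1 &= k_1 + k_2 \;\Rightarrow\; c_1 = k_2 = 0.053078 \\
      \text{slope, period 2:}\quad b + c_2 &= k_1 + k_2 + k_3 \;\Rightarrow\; c_2 = k_2 + k_3 \\
      c_2 - c_1 &= k_3:\quad 0.0677784 - 0.053078 = 0.0147004
      \end{align*}
      ```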



      • #4
        Clyde Schechter and George Ford: Thank you for that! That really helps me understand this.

        I have some follow-up questions:

        1. I have two interventions in my study, represented by knot 1 and knot 2. Given George's explanation in #3, I guess this means that knot3 is the difference in slopes between intervention 1 and intervention 2 (i.e., the change in the rate of change between the two)? How can I, using the spline method, get the difference in slopes between intervention 2 and the pre-intervention period?

        2. Also, I want to compare the slopes after each intervention, and I'm using the lincom command to do so. When I run "lincom knot1+knot2", I understand that this is the slope of intervention 1 compared to the pre-intervention period. I also want to estimate the slope of intervention 2 vs. the pre-intervention period, and the slope of intervention 2 vs. intervention 1. How can I do that using lincom and the knots?

        Thank you for all the help!



        • #5
          Code:
          clear all
          
          set obs 150
          
          g t = _n
          tsset t
          
          mkspline knot1 50 knot2 100 knot3 = t, marginal
          
          g period = 1 + 1*(knot2>0) + 1*(knot3>0)
          
          ** SLOPE 1: = 0.1
          ** SLOPE 2: = 0.3
          ** SLOPE 3: = 0.6  (0.1 + 0.2 + 0.3, given the coefficients used below)
          g y = 1 + 0.1*t + 0.2*knot2 + 0.3*knot3 + rnormal(0,0.1)
          
          tsline y 
          twoway scatter y t , color(black%20) jitter(2) || lfit y t if period==1 || lfit y t if period==2 || lfit y t if period==3
          
          * eststo and esttab come from the user-written estout package (ssc install estout)
          eststo e1: qui reg y knot1 if period==1
          eststo e2: qui reg y knot2 if period==2
          eststo e3: qui reg y knot3 if period==3
          esttab e1 e2 e3
          
          reg y i.period knot1 knot2 knot3 
          lincom knot1
          lincom knot1+knot2
          lincom knot1+knot2+knot3
          lincom knot2+knot3
          
          reg y i.period c.t i.period#c.t
          lincom t
          lincom t + 2.period#c.t
          lincom t + 3.period#c.t



          • #6
            George Ford: Thank you. I'm afraid I don't follow it completely. Are the first three statements below correct?
            1. knot1 is the slope before any intervention
            2. knot1+knot2 is the slope after intervention 1, i.e., intervention 1 vs. the pre-intervention period
            3. knot1+knot2+knot3 is the slope after intervention 2, i.e., intervention 2 vs. the pre-intervention period
            4. What is knot2+knot3?



            • #7
              4. Nothing that might interest you, but it is a test of the difference between the slope in period 3 and the slope in period 1.

              (lincom knot1+knot2+knot3 - knot1) = (lincom knot2+knot3)
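
              Spelled out with the simulation in #5, where y is generated with coefficients 0.1, 0.2 and 0.3 on knot1, knot2 and knot3, the lincom expressions target the quantities below. This is a sketch in terms of the true (noise-free) values, so the estimates from any one run will differ slightly; s_1, s_2 and s_3 are labels introduced here for the slopes in periods 1 to 3.

              ```latex
              \begin{align*}
              s_1 &= \texttt{knot1} = 0.1 \\
              s_2 &= \texttt{knot1} + \texttt{knot2} = 0.3 \\
              s_3 &= \texttt{knot1} + \texttt{knot2} + \texttt{knot3} = 0.6 \\
              s_2 - s_1 &= \texttt{knot2} = 0.2 \\
              s_3 - s_2 &= \texttt{knot3} = 0.3 \\
              s_3 - s_1 &= \texttt{knot2} + \texttt{knot3} = 0.5
              \end{align*}
              ```

              Read this way, the change in slope after intervention 2 relative to the pre-intervention period is lincom knot2+knot3, and relative to intervention 1 it is the knot3 coefficient itself.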
