  • Means and their difference confidence interval

    Dear Statalisters,

    I've collapsed my dataset with collapse (mean) into two groups based on whether a dummy variable is 0 or 1, and I've created a variable Diff holding the difference between the two means. Plotting the two means in one graph went well, but because the two series are very close to each other, the distance between them is nearly invisible. A large drop in one of the quarters also stretches the Y-axis, making the difference even less visible.

    What I am trying to do, then, is to plot this difference over time to see whether it has increased or decreased since the first period, and whether it remains statistically indistinguishable from zero with 95% confidence (that is, whether the difference between the means has stayed statistically no different from zero over the period).

    Here is a sample of my data:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(Lowcap0 Lowcap1) float(Quarter Diff)
    -.0037198  -.004552 233  .0008322
    -.0052748 -.0050738 234  -.000201
     .0568899  .0532897 235  .0036002
     -.002273 -.0060779 236  .0038049
     .0597479  .0575683 237  .0021796
     .0538835  .0507662 238  .0031173
     .0497061  .0452972 239  .0044089
    -.0367175 -.0274895 240  -.009228
    -.2862794 -.2742874 241  -.011992
    -.0271649 -.0317676 242  .0046027
    -.0198085 -.0292484 243  .0094399
    -.0173205 -.0161284 244 -.0011921
    -.0166656 -.0166235 245 -.0000421
    end
    format %tq Quarter
    Here are the plots I've managed to produce: the first shows the two means by group over time, and the second plots the difference between them:

    [Image: Screenshot 2021-11-08 at 20.55.21.png — the two means by group over time]

    [Image: Screenshot 2021-11-08 at 20.55.43.png — the difference between the two means]

    If anyone could suggest how to extend that second plot to check whether the difference between the means remains statistically indistinguishable from zero, I would be extremely grateful. I've searched Statalist and cannot figure it out.

    Thank you in advance!

  • #2
    The example data you show comes from the -collapse-d version of the data. It is not possible to calculate the relevant confidence intervals or standard errors from the data in that form, as you have only one observation per time point. If you show an example of the data from before the -collapse-, it is likely it can be done.

    Added: Why do you want to do that? Looking at the first graph, it is visually obvious that the separation of the two curves in late 2020 is some vanishingly small amount relative to the inherent variation in the variable lowcap itself. So even if your original data set has a bazillion observations and this microscopic difference turns out to be "statistically significant," why would you care? You wouldn't take that seriously, would you? I certainly wouldn't.
    Last edited by Clyde Schechter; 08 Nov 2021, 13:45.



    • #3
      Dear Clyde,

      Thank you for the reply! I suspected the collapsed data would not be enough; I think I was conflating two different things.

      I have estimated a difference-in-difference model and I'm trying to produce a visually more understandable graph for parallel trends assumption test. I've estimated the following model:
      Code:
       xtreg MortChange Post##LowCapital LogAssets Capital Liquidity i.Quarter, fe vce(cluster ID)
      I think that even a tiny difference can be quite significant for such a model and could make the parallel trends assumption fail. MortChange measures the quarter-on-quarter change in the mortgage interest rate (Interest/l.Interest - 1), Post = 1 if Quarter is after 239, and LowCapital = 1 throughout the whole period if the bank's capital is below the mean value at Quarter = 238. If my understanding is correct, a statistically insignificant difference between the two means across time would be evidence for parallel trends, which is why I am looking to plot this difference with 95% confidence intervals.

      I hope it's clear what I am trying to do, if not please ask and I will try to elaborate further.

      Here's the snippet from the full data.
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int ID float(Quarter Capital LogAssets MortChange Post LowCapital Interaction Liquidity)
      1020 232 .11971173 15.425667             . 0 0 0  .013955366
      1020 233 .11712634 15.481427  -.0005400788 0 0 0  .013251254
      1020 234 .12009694  15.48467   -.007108803 0 0 0  .013241507
      1020 235 .12052993  15.50936    .067535885 0 0 0  .013353472
      1020 236 .12220547 15.521126  -.0023337016 0 0 0  .012707525
      1020 237 .11842269  15.59011     .07507282 0 0 0   .01190707
      1020 238 .12088402 15.596017     .05490055 0 0 0   .01172335
      1020 239 .12328812 15.601577     .02993558 0 0 0  .012230604
      1020 240 .12448455 15.704634   -.019060176 1 0 0  .010397565
      1020 241 .12443394  15.72799     -.3315296 1 0 0   .01118917
      1020 242  .1265727 15.726953    -.02610723 1 0 0   .01162767
      1020 243 .13152587 15.732798   -.018394914 1 0 0   .01112565
      1020 244 .13385454 15.732903    .019243155 1 0 0   .01074584
      1020 245 .13439538 15.745996    -.02308631 1 0 0  .011172013
      1105 232 .12674521 14.901415             . 0 0 0  .019384453
      1105 233   .120095 14.994535   -.003696662 0 0 0   .03001414
      1105 234 .12104427 15.002126    .002958361 0 0 0   .02165859
      1105 235 .12460075 14.996237     .09176072 0 0 0  .020604974
      1105 236 .12274332 15.030716    .006904318 0 0 0  .017814076
      1105 237 .12097165  15.08932     .06404913 0 0 0  .019091893
      1105 238 .12208101 15.099487     .05628258 0 0 0  .017116725
      1105 239 .12703487 15.082226     .04440312 0 0 0   .01961808
      1105 240  .1402424  15.11517   -.005343342 1 0 0  .020302346
      1105 241 .14127792 15.130983    -.32739705 1 0 0  .020192334
      1105 242 .14638974 15.112276    -.00881553 1 0 0  .020664677
      1105 243 .14840126 15.126732   -.010842834 1 0 0   .01454711
      1105 244  .1496849 15.132628   -.011540247 1 0 0  .014407734
      1105 245  .1498256 15.144052    -.00791344 1 0 0  .019624794
      1140 232 .12881997 14.852843             . 0 0 0  .006139246
      1140 233  .1314806 14.865655   .0018556876 0 0 0  .006533094
      1140 234 .13584116 14.864736 -.00054611714 0 0 0  .006533148
      1140 235  .1305501 14.908412      .0999206 0 0 0  .019540945
      1140 236 .12995382 14.930692  -.0036603084 0 0 0   .02580323
      1140 237  .1335947 14.948956     .06952371 0 0 0    .0254328
      1140 238  .1362565  14.95521     .04677353 0 0 0  .025134804
      1140 239 .13327701 14.979696     .05436152 0 0 0  .025268214
      1140 240 .14605601 14.968538   -.015613738 1 0 0   .02465172
      1140 241 .14905673 14.980376    -.24698496 1 0 0  .022815226
      1140 242  .1547136 14.973707     -.0386949 1 0 0   .02307622
      1140 243  .1545041 14.988928    -.08212663 1 0 0  .013358095
      1140 244 .15295476 15.015976   -.034761887 1 0 0  .012872471
      1140 245 .15357067 15.024996   -.017440228 1 0 0  .012917393
      1254 232 .08622065 13.830753             . 0 1 0  .065201506
      1254 233 .08273336 13.883726   -.014917805 0 1 0   .08144996
      1254 234 .08490191 13.870953     .07411842 0 1 0  .062734626
      1254 235 .08424491 13.879768    -.08068459 0 1 0   .06387482
      1254 236  .0814081  13.92783   .0042984467 0 1 0   .06109384
      1254 237 .09009147 13.953928      .0525546 0 1 0   .06134415
      1254 238 .09509503 13.918434 -.00007096445 0 1 0   .05761241
      1254 239  .0940771 13.922132     .09164742 0 1 0   .06147254
      1254 240  .1053309 13.989075   -.003457106 1 1 1   .07149757
      1254 241 .10189064  14.02986    -.26802114 1 1 1   .05183148
      1254 242  .1004261 14.040496    -.02918645 1 1 1   .04880365
      1254 243 .10354827  14.01955   -.024943784 1 1 1   .05219489
      1254 244 .09481572  14.11231   -.027588887 1 1 1   .07229625
      1254 245 .09187865 14.147603    -.01579224 1 1 1  .035918392
      end
      format %tq Quarter



      • #4
        I think the simplest way to get the confidence intervals from the uncollapsed data is with:
        Code:
        capture program drop one_quarter
        program define one_quarter
            ci means MortChange
            gen mean = r(mean)
            gen lb = r(lb)
            gen ub = r(ub)
            exit
        end
        
        runby one_quarter, by(Quarter LowCapital)
        -runby- is by Robert Picard and me, and is available from SSC.

        Then you can plot the means and confidence intervals after you collapse -by(Quarter LowCapital)- and reshape wide.
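        For instance, the collapse-and-plot step might be sketched like this (the variable names follow the program above; the graph options are just one possibility, not the only way to do it):
        Code:
        collapse (first) mean lb ub, by(Quarter LowCapital)
        reshape wide mean lb ub, i(Quarter) j(LowCapital)
        twoway (rcap lb0 ub0 Quarter) (line mean0 Quarter) ///
            (rcap lb1 ub1 Quarter) (line mean1 Quarter),  ///
            legend(order(2 "LowCapital = 0" 4 "LowCapital = 1"))
        Since -runby- leaves mean, lb, and ub constant within each Quarter-by-LowCapital group, collapse (first) reduces the data to one row per group before reshaping.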



        • #5
          Sorry, #4 gives code for the confidence intervals around the group means in MortChange. But you want one confidence interval around the difference. That's:

          Code:
          capture program drop one_quarter
          program define one_quarter
              regress MortChange i.LowCapital
              matrix M = r(table)
              gen diff = M["b", "1.LowCapital"]
              gen lb = M["ll", "1.LowCapital"]
              gen ub = M["ul", "1.LowCapital"]
              exit
          end
          
          runby one_quarter, by(Quarter)
          Then you can just -collapse- over Quarter, and plot diff, ub and lb against Quarter.
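          That final step might look like this (a sketch: -rarea- shades the confidence band and yline(0) marks the null of no difference):
          Code:
          collapse (first) diff lb ub, by(Quarter)
          twoway (rarea lb ub Quarter, color(gs13)) (line diff Quarter), ///
              yline(0) ytitle("Difference in means") legend(off)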



          • #6
            Thank you so much for your help Clyde!

            It was exactly what I was looking for, and it gave the intended results. Your help to me and to the other members of this forum is very much appreciated and not taken for granted!
