  • CDF graph with confidence interval

    Hello -

    I would like to see whether I can add a 95% confidence interval to a cumulative distribution function (CDF) graph.

    Here is sample data:

    X
    0.086957
    0.010033
    0.053512
    -0.25418
    -0.47492
    -0.47492
    -0.38796
    -0.45485
    -0.36789
    -0.32441
    -0.23746
    -0.19398
    -0.10702
    -0.02007
    -0.09699
    -0.17391
    0

    I graph the cdf by the following code:

    cumul X, generate(cum)
    sort cum
    line cum X, ylab(, grid) ytitle("") xlab(, grid)

    I calculated the mean of X and the lower and upper bounds of its 95% confidence interval with ci means X, and stored them as variables:
    X mean_X X_low X_high
    0.086957 -0.2010624 -0.2991645 -0.1029602
    0.010033 -0.2010624 -0.2991645 -0.1029602
    0.053512 -0.2010624 -0.2991645 -0.1029602
    -0.25418 -0.2010624 -0.2991645 -0.1029602
    -0.47492 -0.2010624 -0.2991645 -0.1029602
    -0.47492 -0.2010624 -0.2991645 -0.1029602
    -0.38796 -0.2010624 -0.2991645 -0.1029602
    -0.45485 -0.2010624 -0.2991645 -0.1029602
    -0.36789 -0.2010624 -0.2991645 -0.1029602
    -0.32441 -0.2010624 -0.2991645 -0.1029602
    -0.23746 -0.2010624 -0.2991645 -0.1029602
    -0.19398 -0.2010624 -0.2991645 -0.1029602
    -0.10702 -0.2010624 -0.2991645 -0.1029602
    -0.02007 -0.2010624 -0.2991645 -0.1029602
    -0.09699 -0.2010624 -0.2991645 -0.1029602
    -0.17391 -0.2010624 -0.2991645 -0.1029602
    0 -0.2010624 -0.2991645 -0.1029602

    I would like to see whether I can add X_high and X_low to the existing graph created above.
    Could anyone give me a hand? I tried adding X_high and X_low to the line cum X command, but without success.
    Thank you.




  • #2
    This does not answer your question mechanically, but I think what you want to do is incorrect.

    The points on the CDF are not averages, they are percentiles. Hence you need to use distribution theory for percentiles, not for means.
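
To make the point in #2 concrete, here is a minimal sketch of one standard way to put a 95% band around the empirical CDF itself: the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, which gives a simultaneous band of half-width sqrt(ln(2/alpha)/(2n)). This is a swapped-in technique chosen for simplicity, not necessarily what #2 had in mind, and pointwise binomial intervals would be another option. The variable names Fhat, F_lo and F_hi are illustrative assumptions. Run it from a do-file, since it uses /// line continuation.

```stata
clear
input X
0.086957
0.010033
0.053512
-0.25418
-0.47492
-0.47492
-0.38796
-0.45485
-0.36789
-0.32441
-0.23746
-0.19398
-0.10702
-0.02007
-0.09699
-0.17391
0
end

* empirical CDF; equal gives tied values the same cumulative probability
cumul X, generate(Fhat) equal

* DKW half-width for a 95% simultaneous band (alpha = 0.05)
quietly count if !missing(X)
local eps = sqrt(ln(2/0.05) / (2 * r(N)))

* band limits, truncated to [0, 1] because they bound probabilities
gen F_lo = max(Fhat - `eps', 0)
gen F_hi = min(Fhat + `eps', 1)

* step functions for the CDF and both band limits
twoway line F_lo F_hi Fhat X, sort connect(J J J)            ///
    lcolor(gs10 gs10 red) lpattern(dash dash solid)          ///
    ylabel(, angle(h) grid) xlabel(, grid)                   ///
    legend(order(3 "empirical CDF" 1 "95% DKW band"))        ///
    ytitle(cumulative probability) xtitle(X)
```

With only 17 observations the band is wide, which is itself informative about how little the sample pins down the distribution.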



    • #3


      I think I understand this. You want a confidence interval for the mean as a backdrop to a cumulative distribution plot. (By the way, I advise against the variable name cum. It's not illegal in Stata, but it has a vulgar meaning and so should be avoided in technical work.)

      This can be a little tricky to do well, but there is help in various places such as https://journals.sagepub.com/doi/pdf...867X1601600315

      What is crucial here is what goes on top and what goes underneath. For example, xline() is always drawn first, before any data, so a bar drawn on top of it simply obscures it.

      Here is some technique.

      Code:
      clear
      input X
      0.086957
      0.010033
      0.053512
      -0.25418
      -0.47492
      -0.47492
      -0.38796
      -0.45485
      -0.36789
      -0.32441
      -0.23746
      -0.19398
      -0.10702
      -0.02007
      -0.09699
      -0.17391
      0
      end
      
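      sort X
      cumul X, generate(cdf)

      * 95% confidence interval for the mean; copy the results out of r() at once
      ci means X
      local mean = r(mean)
      local barw = r(ub) - r(lb)

      * twoway bar centres each bar on its x value, so a single bar at the mean
      * with width ub - lb spans exactly the interval from lb to ub
      gen X2 = `mean' in 1
      gen one = 1
      set scheme s1color

      * plot order matters: the bar goes first so that it lies underneath,
      * then the CDF, then a vertical line at the mean
      * (/// continues a command across lines in a do-file)
      twoway bar one X2, barw(`barw') color(blue*0.1)                              ///
          || line cdf X, lc(red) c(J) sort                                         ///
          || scatteri 1 `mean' 0 `mean', recast(line)                              ///
          ylab(, ang(h) grid) xlab(, grid) legend(off) plotregion(margin(0 0 0 0)) ///
          ytitle(cumulative probability) xtitle(X)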
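
An alternative way to draw the shaded zone, sketched here under the assumption that you would rather not create extra variables: scatteri with recast(area) draws the rectangle directly from the interval limits. The local macro names lb, ub and mean are illustrative. As before, the zone is listed first so that it sits underneath the CDF. Run it from a do-file, since it uses /// line continuation.

```stata
clear
input X
0.086957
0.010033
0.053512
-0.25418
-0.47492
-0.47492
-0.38796
-0.45485
-0.36789
-0.32441
-0.23746
-0.19398
-0.10702
-0.02007
-0.09699
-0.17391
0
end

cumul X, generate(cdf)

* 95% confidence interval for the mean, copied into local macros
quietly ci means X
local lb   = r(lb)
local ub   = r(ub)
local mean = r(mean)

* the area plot fills from y = 1 down to the base of 0 between lb and ub
twoway scatteri 1 `lb' 1 `ub', recast(area) color(blue*0.1)   ///
    || line cdf X, lc(red) c(J) sort                          ///
    || scatteri 0 `mean' 1 `mean', recast(line) lc(black)     ///
    ylab(, ang(h) grid) xlab(, grid) legend(off)              ///
    plotregion(margin(0 0 0 0))                               ///
    ytitle(cumulative probability) xtitle(X)
```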

      [Figure: cizone.png]

      Last edited by Nick Cox; 09 Jan 2023, 04:22.
