  • Identifying the Distribution of Data

    Hi All,

    The COVID-19 epidemic has raised the need to estimate the distribution of the serial interval, i.e. the time between symptom onset in the primary patient (the infector) and symptom onset in the patient who received the infection from that infector (the infectee).

    We frequently read that it follows a gamma distribution.
    In Puglia, a region of Italy, we have data in which, for several cases, we can identify both the infector and the infectee. We would therefore like to estimate the appropriate distribution of the serial interval in our region.
    In practice, we have the symptom onset date of the infector and the symptom onset date of the infectee(s).

    Can someone suggest how to identify the distribution starting from our data?

    Thanks.
    Enzo

  • #2
    Hi Vincenzo,

    Welcome to Statalist. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    Anyway, why don't you just calculate the difference between the two dates and then plot it using kdensity?
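
A minimal sketch of this suggestion, assuming the two onset dates are stored as strings in year-month-day order. The variable names (infector_str, infectee_str, onset_infector, onset_infectee, serial_interval) and the five date pairs are made up for illustration and are not taken from the poster's data.

```stata
* Made-up onset dates for illustration only; replace with your own data
clear
input str10 infector_str str10 infectee_str
"2020-03-01" "2020-03-05"
"2020-03-02" "2020-03-09"
"2020-03-04" "2020-03-07"
"2020-03-06" "2020-03-12"
"2020-03-10" "2020-03-12"
end

* Convert the strings to Stata daily dates
generate onset_infector = date(infector_str, "YMD")
generate onset_infectee = date(infectee_str, "YMD")
format onset_infector onset_infectee %td

* Serial interval in days: infectee onset minus infector onset
generate serial_interval = onset_infectee - onset_infector
label variable serial_interval "Serial interval (days)"

* Numerical summary, then the kernel density plot
summarize serial_interval, detail
kdensity serial_interval
```

If the dates are already numeric Stata dates, the two date() lines can be skipped and the subtraction applied directly.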

    Mattia


    Notwithstanding that I am unable to see your data, I imagine that you can plot a kernel density of the difference (in days) between the two dates.

    • #3
      Thanks Mattia,

      for this first tip about using a kernel density plot.

      The Anderson-Darling test is used to test whether the data in a variable come from a particular distribution, such as the normal, uniform, lognormal, logistic, exponential, Weibull or gamma.
      For the serial intervals of COVID-19 cases, I believe that the gamma and lognormal distributions should be among the best candidates.
      It seems that this test has only a limited implementation in Stata, but the resources and hints from the Stata community are sometimes surprising.
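
One way to get a feel for the two candidates is to compute the Anderson-Darling statistic by hand. The sketch below is only an illustration under stated assumptions: the eight serial intervals are made up, the gamma parameters come from the method of moments (chosen for brevity, not maximum likelihood), and the lognormal parameters are the mean and SD of the logged intervals. All scalar and variable names are hypothetical. Because the parameters are estimated from the same data, standard critical values do not apply directly, so the statistics are best read as a descriptive comparison (smaller means a closer fit), not as a formal test.

```stata
* Made-up serial intervals (days), for illustration only
clear
input serial_interval
4
7
3
6
2
5
9
4
end

* Gamma and lognormal are defined only for strictly positive values
drop if serial_interval <= 0

* Gamma by method of moments: shape = mean^2/var, scale = var/mean
quietly summarize serial_interval
scalar g_shape = r(mean)^2 / r(Var)
scalar g_scale = r(Var) / r(mean)

* Lognormal: mean and SD of the log intervals
generate double ln_si = ln(serial_interval)
quietly summarize ln_si
scalar ln_mu = r(mean)
scalar ln_sd = r(sd)

* Fitted CDFs evaluated at the sorted data
sort serial_interval
generate double F_gamma = gammap(scalar(g_shape), serial_interval / scalar(g_scale))
generate double F_lnorm = normal((ln_si - scalar(ln_mu)) / scalar(ln_sd))

* A2 = -n - (1/n) * sum_i (2i - 1) * [ln F(x_i) + ln(1 - F(x_(n+1-i)))]
foreach d in gamma lnorm {
    generate double t_`d' = (2*_n - 1) * (ln(F_`d') + ln(1 - F_`d'[_N - _n + 1]))
    quietly summarize t_`d'
    scalar A2_`d' = -r(N) - r(sum) / r(N)
    display "Anderson-Darling A2 (`d') = " %6.3f scalar(A2_`d')
}
```

Note that observed serial intervals can be zero or negative when the infectee develops symptoms before the infector; such pairs are dropped here, which is a simplification and not a recommendation.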

      Furthermore, a graph for checking whether the selected distribution fits our data would also be useful.
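
A simple graphical check is to overlay the fitted densities on a histogram of the observed intervals. This is a sketch under the same kind of assumptions: made-up data, hypothetical names, gamma parameters from the method of moments, and lognormal parameters from the mean and SD of the logs. The plotting range 0.1 to 15 is arbitrary and should be adapted to the data.

```stata
* Made-up serial intervals (days), for illustration only
clear
input serial_interval
4
7
3
6
2
5
9
4
end

* Gamma parameters by method of moments
quietly summarize serial_interval
local shape = r(mean)^2 / r(Var)
local scale = r(Var) / r(mean)

* Lognormal parameters from the logged intervals
generate double ln_si = ln(serial_interval)
quietly summarize ln_si
local mu = r(mean)
local sd = r(sd)

* Histogram of the data with both fitted densities overlaid
* (lognormal density = normal density of ln(x), divided by x)
twoway (histogram serial_interval, density width(1))                       ///
       (function y = gammaden(`shape', `scale', 0, x), range(0.1 15))      ///
       (function y = normalden(ln(x), `mu', `sd') / x, range(0.1 15)),     ///
       legend(order(1 "Observed" 2 "Gamma" 3 "Lognormal"))                 ///
       xtitle("Serial interval (days)") ytitle("Density")
```

A curve that follows the histogram closely, especially in the right tail, suggests a reasonable fit; with real data, a larger sample than this toy example is needed for the comparison to be informative.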

      Has any other Stata user faced the same task?

      Best wishes
      Enzo





