  • Collapse/Summarize Question

    I have daily US confirmed cases and deaths from the coronavirus pandemic that I need to collapse for use in another dataset. Johns Hopkins collected these data at the county level within each state, but I only need the statewide totals. See the snippet below:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float date2 str41 admin2 str24 provincestate long confirmed int deaths
    21936 "Autauga" "Alabama" 0 0
    21937 "Autauga" "Alabama" 0 0
    21938 "Autauga" "Alabama" 0 0
    21939 "Autauga" "Alabama" 0 0
    21940 "Autauga" "Alabama" 0 0
    21941 "Autauga" "Alabama" 0 0
    21942 "Autauga" "Alabama" 0 0
    21943 "Autauga" "Alabama" 0 0
    21944 "Autauga" "Alabama" 0 0
    21945 "Autauga" "Alabama" 0 0
    21946 "Autauga" "Alabama" 0 0
    21947 "Autauga" "Alabama" 0 0
    21948 "Autauga" "Alabama" 0 0
    21949 "Autauga" "Alabama" 0 0
    21950 "Autauga" "Alabama" 0 0
    21951 "Autauga" "Alabama" 0 0
    21952 "Autauga" "Alabama" 0 0
    21953 "Autauga" "Alabama" 0 0
    21954 "Autauga" "Alabama" 0 0
    21955 "Autauga" "Alabama" 0 0
    21956 "Autauga" "Alabama" 0 0
    21957 "Autauga" "Alabama" 0 0
    21958 "Autauga" "Alabama" 0 0
    21959 "Autauga" "Alabama" 0 0
    21960 "Autauga" "Alabama" 0 0
    end
    format %td date2
    I believe the correct collapse code would be something like collapse v1 v2...by(provincestate). However, I need the actual counts, not a mean. Is there a command that will do this? Maybe sum? Please advise.

  • #2
    See -help collapse-. The default statistic is the mean, but you can request sums instead:

    Code:
    collapse (sum) confirmed deaths, by(provincestate date2)
    Last edited by Ali Atia; 03 Jun 2021, 13:42.

    • #3
      What I need is to keep the daily numbers intact, but for the entire state rather than the counties. Will -collapse- allow for that if I use (sum) as part of the call?

      • #4
        The command in #2 adds up the values of all counties within a state on each date. Because date2 is included in by(), the daily numbers are kept separate: the resulting dataset contains a single observation per state per date, holding the statewide totals for that day.
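
        To see this on a small scale, here is a minimal sketch using made-up data. The county names ("County A", "County B") and all counts are hypothetical, chosen only for illustration; they are not from the Johns Hopkins file. The variable names match the -dataex- example in #1.

        ```stata
        * Toy data (hypothetical): two counties in one state over two days
        clear
        input float date2 str41 admin2 str24 provincestate long confirmed int deaths
        22000 "County A" "Alabama" 5 1
        22000 "County B" "Alabama" 3 0
        22001 "County A" "Alabama" 7 1
        22001 "County B" "Alabama" 4 2
        end
        format %td date2

        * Sum the county values within each state-date pair
        collapse (sum) confirmed deaths, by(provincestate date2)

        * Check that state and date now uniquely identify the observations
        isid provincestate date2

        * Inspect the statewide daily totals
        list, sepby(provincestate)
        ```

        With these toy values, the collapsed dataset should hold two observations for Alabama, one per date, each containing the sum over the two counties. -isid- exits with an error if any state-date pair appears more than once, so it is a quick way to confirm the structure after the collapse.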
