
  • How to create a table that displays a variable's summary stats by year and status

    Hi!

    As indicated in the title, I wonder how I can create a table that displays a variable's summary stats by year and status. Specifically, the variable of interest is city, and status takes four values: 0, 1, 2, and 3. My sample covers two years, 2019 and 2020.

    I use this code:
    Code:
    bysort year status: summarize city
    It gives me what I want, but I would like everything to be in one table, perhaps something like the layout below. Thanks in advance for any ideas on how I can achieve this.
                      Status
    Year    = 0       = 1       = 2       = 3
    2019    Obs:      Obs:      Obs:      Obs:
            Mean:     Mean:     Mean:     Mean:
    2020    Obs:      Obs:      Obs:      Obs:
            Mean:     Mean:     Mean:     Mean:
    Last edited by Lucy Garcia; 12 May 2023, 21:39.

  • #2
    Code:
    table (year) (status), stat(count city) stat(mean city)
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Hi Maarten,

      Thanks a lot for your help. However, when I tried it I got the error message -option stat() not allowed-. Why is this?



      • #4
        Best guess is that you are not using an up-to-date version of Stata, perhaps Stata 16. See https://www.statalist.org/forums/help#version for the importance of indicating out-of-date versions. If you tell us the version you are using, some other command may be possible.



        • #5
          Hi Nick,

          Yes, I'm using Stata 16. Since that is an out-of-date version, is there any workaround?



          • #6
            you don't supply an easy-to-use data example with -dataex-, but the following has been available since long before version 16:
            Code:
            sysuse auto
            tabulate rep78 foreign, summarize(mpg)
            note that you can suppress the sd's if you want; see
            Code:
            help tabulate_summarize
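            As a rough sketch of the suppression mentioned above: -tabulate, summarize()- accepts -nostandard- (alongside -nomeans-, -nofreq- and -noobs-), so the call below should leave only the mean and the frequency in each cell. The commented-out line is an assumption about how this would carry over to the variables named in post #1 (year, status, city); it has not been run on the poster's data.

            ```stata
            * Load the shipped example dataset (replaces any data in memory)
            sysuse auto, clear

            * Two-way table of mpg by rep78 and foreign:
            * nostandard drops the standard deviations, keeping means and frequencies
            tabulate rep78 foreign, summarize(mpg) nostandard

            * Assumed equivalent for the data in post #1 (year, status, city in memory):
            * tabulate year status, summarize(city) nostandard
            ```

            This works in Stata 16 and earlier, since it does not rely on the -table- syntax introduced later.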
            also, see
            Code:
            h dataex



            • #7
              Hi again,

              Sorry for not providing example data in the first place. Here are some data:
              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float(year status district city province)
              2019 3  23  23   9
              2020 3  26  26  14
              2019 0  26  20   5
              2020 2  59  47  23
              2019 3   3   3   1
              2020 1  49  49  41
              2019 1   8   7   4
              2020 0   8   8   4
              2019 3   8   8   6
              2020 3  13  13  12
              2019 0  13   8   6
              2020 3  40   6   5
              2019 3  27  27  14
              2020 2  31  31  22
              2019 1 139 139  43
              2020 1 135 135  29
              2019 3   2   2   2
              2019 2   7   7   2
              2020 0   5   5   4
              2019 0  15  15   7
              2020 0  82  82  51
              2019 0   8   0   0
              2020 3  12   5   1
              2019 3  57  57  57
              2020 3 147 147 147
              2019 1   1   1   1
              2020 3  27  27  27
              2019 2   9   9   6
              2020 2  26  26  22
              2019 1   8   8   4
              2020 2   6   6   1
              2019 0  20  20  19
              2020 3  96  96  96
              2019 3  10  10   9
              2020 3   9   9   9
              2019 3   2   2   2
              2020 3   9   9   9
              end
              Apart from the city variable I mentioned in my first post, I also have two other variables, district and province, and I want summary statistics for these by year and status as well (especially n and mean). I tried another way of doing this:
              Code:
              table year status, c(n district mean district mean city mean province) format(%9.2f)
              It gives me a table that looks like this. (The number of observations is the same for district, city, and province, so I only put -n district- here. In each cell the rows are n district, mean district, mean city, and mean province. When I tried more than 5 items, Stata said -too many stats()-.)
              Code:
              --------------------------------------
                        |              status      
                   Year |     0      1      2      3
              ----------+---------------------------
                   2019 | 1,042     84    244  1,842
                        | 15.56  18.70  16.53  19.34
                        | 13.96  16.24  15.32  17.85
                        | 10.55  10.54  11.71  13.74
                        | 
                   2020 | 1,184     97    294  2,115
                        | 33.56  43.25  31.63  36.67
                        | 30.66  36.95  29.29  33.81
                        | 23.67  28.74  23.40  26.90
              --------------------------------------
              The table looks much better than the output from post #1, but one thing remains: how can I put a variable's statistics (such as city's) for the two years in adjacent rows, so that comparison is easier? Thanks for any suggestions!
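
              One possible way to get that layout, offered only as a sketch and not as an answer given in this thread: run the same -table- call once per variable, so that each table has just the 2019 and 2020 rows for a single variable, directly under each other. It assumes Stata 16 or earlier (the -contents()- option used here was replaced in later versions) and uses a few rows from the example above so that it runs on its own. With only two statistics per call it also stays under the five-statistic limit behind the -too many stats()- message.

              ```stata
              * A few observations taken from the -dataex- example above
              clear
              input float(year status district city province)
              2019 3 23 23  9
              2020 3 26 26 14
              2019 0 26 20  5
              2020 2 59 47 23
              2019 1  8  7  4
              2020 1 49 49 41
              2019 2  7  7  2
              2020 0  8  8  4
              end

              * One table per variable: rows are years, columns are status,
              * each cell shows n and the mean of that variable
              foreach v of varlist district city province {
                  display as text _newline "Variable: `v'"
                  table year status, contents(n `v' mean `v') format(%9.2f)
              }
              ```

              Each table then shows the two years for one variable next to each other, at the cost of producing three tables instead of one.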

