  • tabulate three variables

    Hello everyone,

    Is there any way to tabulate three variables in a single table? I am comparing prisoner data for two different years (2011 and 2015). The variables are prisonerID (the identifier), PTDordered (whether pre-trial detention was ordered or not), year, and crimetype (the type of crime committed).


    If I go with

    tabulate crimetype PTDordered year, column missing rowsort

    Stata tells me that too many variables are specified.


    tabulate crimetype PTDordered, column missing rowsort

    gives me the results, but lacks the distinction between the two years being compared. Is there any way to integrate the year into the table, or do I need to produce two separate tables (one for each year) and compare them, like this:



    tabulate crimetype PTDordered if year==2011, column missing rowsort


    Any help will be greatly appreciated


    Cheers
    John

    PS: Data example:
    prisonerID PTDordered year crimetype
    001 1 2011 1
    002 0 2011 2
    003 0 2015 2
    004 1 2015 5
    005 1 2015 1
    006 1 2011 5
    007 0 2011 6
    008 1 2015 1
    009 0 2011 3
    010 1 2015 2
    011 0 2015 6
    Last edited by John Asherman; 04 Apr 2022, 08:58.

  • #2
    Please use dataex to show examples (https://www.statalist.org/forums/help#stata).

    See groups from the Stata Journal for one possibility. https://www.statalist.org/forums/for...updated-on-ssc

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 prisonerid byte ptdordered int year byte crimetype
    "001" 1 2011 1
    "002" 0 2011 2
    "003" 0 2015 2
    "004" 1 2015 5
    "005" 1 2015 1
    "006" 1 2011 5
    "007" 0 2011 6
    "008" 1 2015 1
    "009" 0 2011 3
    "010" 1 2015 2
    "011" 0 2015 6
    end
    
    . groups year crimetype ptd, percent(year crimetype) sepby(year crimetype)
    
      +----------------------------------------------+
      | year   crimet~e   ptdord~d   Freq.   Percent |
      |----------------------------------------------|
      | 2011          1          1       1    100.00 |
      |----------------------------------------------|
      | 2011          2          0       1    100.00 |
      |----------------------------------------------|
      | 2011          3          0       1    100.00 |
      |----------------------------------------------|
      | 2011          5          1       1    100.00 |
      |----------------------------------------------|
      | 2011          6          0       1    100.00 |
      |----------------------------------------------|
      | 2015          1          1       2    100.00 |
      |----------------------------------------------|
      | 2015          2          0       1     50.00 |
      | 2015          2          1       1     50.00 |
      |----------------------------------------------|
      | 2015          5          1       1    100.00 |
      |----------------------------------------------|
      | 2015          6          0       1    100.00 |
      +----------------------------------------------+


    Code:
    bysort year : tabulate crimetype ptdordered, row missing rowsort
    is another.
    Last edited by Nick Cox; 04 Apr 2022, 09:08.


    • #3
      An alternative:
      Code:
      help table
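
      As a rough illustration of what that could look like, here is a sketch assuming Stata 17 or later, where table was rewritten. Older versions use a different syntax, so check help table for your version. The variable names are those of the dataex example in #2.

      ```stata
      * Assumption: Stata 17+ and the dataex example from #2 in memory.
      * Rows: crime type; columns: detention outcome nested within year.
      * Percentages are taken across detention outcomes within each
      * crime type and year.
      table (crimetype) (year ptdordered), ///
          statistic(frequency) ///
          statistic(percent, across(ptdordered))
      ```

      This keeps both years side by side in one table, which is the comparison asked for in #1.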


      • #4
        Hi Nick,

        Thanks for showing me the groups add-on!

        I changed it a little so that it gives me a table sorted by crimetype:

        groups crimetype year ptd, percent(crimetype year) sepby(year crimetype)

        Now the tables are almost as I imagined them. One suggestion to make the output even more reader-friendly: a layout like the one below would make finding and comparing the frequencies and percentages easier, because you can stay in the same row instead of switching between rows.

        Example:
        crimetype | year  ptd  freq.  percent | year  ptd  freq.  percent
                2 | 2011    1      1     9.09 | 2015    1      1     4.76
                2 | 2011    0     10    90.91 | 2015    0     20    95.24

        Also, while typing in the example I realised that groups gives me 9.09% / 90.91% for a 1 : 10 frequency ratio. How can that be?

        Cheers
        John


        • #5
          Sorry, but the layout you sketch is not on offer from groups. Saving its results as data may make it easier to get that layout by other means.

          More trivially,

          Code:
          . di 1/11
          .09090909
          so I don't see what is wrong with 9.09 and 90.91%.
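
          One way the side-by-side layout might be built from saved frequencies is sketched below. Note that it swaps in the official commands contract and reshape rather than using groups itself, and it assumes the dataex example from #2 is in memory. The names freq, total and percent are made up for the illustration.

          ```stata
          * Assumption: the dataex example from #2 is in memory.
          * Swapped-in technique: contract (official Stata), not groups.
          * Collapse to one observation per combination, with its frequency.
          contract crimetype year ptdordered, freq(freq)

          * Percent of each detention outcome within crime type and year
          bysort crimetype year: egen total = total(freq)
          generate percent = 100 * freq / total
          format percent %6.2f
          drop total

          * One row per crime type and outcome, one set of columns per year
          reshape wide freq percent, i(crimetype ptdordered) j(year)
          list, sepby(crimetype) noobs
          ```

          Combinations that occur in only one year show up as missing in the other year's columns.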


          • #6
            Right, sorry about the question regarding the percentages. I made a stupid mistake.

            (Hopefully) last question: since an if option is not allowed, how do I select only certain groups (crime types)? Say I want a table that contains only crimetypes 1, 5 and 8.

            Cheers


            • #7
              if would be a qualifier, not an option. It's certainly allowed with groups.
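
              A minimal sketch of what that might look like, assuming groups is installed from SSC and using the variable names from the example in #2. The qualifier goes before the comma and the options go after it.

              ```stata
              * Assumption: groups is installed (ssc install groups) and the
              * dataex example from #2 is in memory.
              * The if qualifier sits before the comma; options follow it.
              groups crimetype year ptdordered if inlist(crimetype, 1, 5, 8), ///
                  percent(crimetype year) sepby(crimetype year)
              ```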


              • #8
                Thanks, it works now
