  • Counting number of times an observation appears within a group

    Hi all, I am new to Stata and am working with a dataset of ~9000 observations and two main variables of interest: ir_no and ir_no_1. I am trying to count the number of times each value of ir_no_1 appears within each ir_no. Here is a mock table of what the data look like, including the variable I am trying to create ("new_var").
    ir_no  ir_no_1  new_var
    111    abc22    2
    111    abc22    2
    111    abc11    1
    222    abc33    2
    222    abc22    1
    222    abc33    2
    I tried the following code, but something must be incorrect since it gives me the total number of observations per group:
    Code:
    gen calc = .
    foreach i in ir_no_1 {
        by ir_no: egen calc2 = count(ir_no_1) if ir_no_1 == `i'
        replace calc = calc2 if ir_no_1 == `i'
        drop calc2
    }
    Is there an easy way to do this? I looked around the forums for a quick solution, but have been unsuccessful. Thanks in advance for any advice.

  • #2
    I think you want
    Code:
    by ir_no ir_no_1, sort: gen new_var = _N
    Last edited by Clyde Schechter; 06 Nov 2015, 15:52. Reason: To put the code into a proper code block.
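
    For anyone who wants to check this, here is a minimal sketch that retypes the mock table from #1 and applies the command above. The -input- block is only an illustration built from those six example rows, not the original data. As for why the loop in #1 returns the group total: -foreach i in ir_no_1- iterates once over the literal word ir_no_1, so the condition becomes ir_no_1 == ir_no_1, which is true for every observation.
    Code:
    clear
    * retype the six example rows from #1 (illustrative data only)
    input int ir_no str5 ir_no_1
    111 "abc22"
    111 "abc22"
    111 "abc11"
    222 "abc33"
    222 "abc22"
    222 "abc33"
    end
    * under by:, _N is the number of observations in each ir_no x ir_no_1 cell
    by ir_no ir_no_1, sort: gen new_var = _N
    list, sepby(ir_no)
    Note that the sort option reorders the rows, so the listing will be grouped by ir_no and ir_no_1 rather than in the order typed, but each row should carry the new_var value shown in #1.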

    • #3
      That worked perfectly. Thanks!


      • #4
        Hi Clyde!

        I had a similar issue and tried the code you provided. It seemed to work, but on closer inspection the numbers it produced were incorrect.

        I have a dataset with locations at 2 levels and various characteristics per location. I want to count the occurrences of a specific characteristic within each subdistrict, and I would like every observation in that subdistrict to carry that same value.

        SubDistrictE Village group1 group2 group3

        group1 - group3 are binary dummies.

        I tried:
        Code:
        by SubDistrictE group1, sort: gen ngroup1 = _N

        It seemed to work for group1, then it gave the exact same values for the other two.

        Appreciate any recommendations!


        • #5
          I don't understand what you want to do, nor do I see what your data look like. Please post back with an example of your data, created using the -dataex- command. Then illustrate for one or two of your subdistricts from the example data what results you are hoping to calculate.

          If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
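
          As a rough illustration of what that looks like in practice, the sketch below uses the auto dataset that ships with Stata as a stand-in. The variable names and the number of observations are placeholders for whatever is relevant to your own question.
          Code:
          sysuse auto, clear
          * display the first 10 observations of three variables in copy-and-paste form
          dataex make price mpg in 1/10
          Copy everything that -dataex- displays in the Results window, including the opening and closing code delimiters, and paste it into your post.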



          When asking for help with code, always show example data. When showing example data, always use -dataex-.
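
          Pending that example, here is a tentative sketch based on one possible reading of #4: that each observation is a village and the goal is the number of villages with a dummy equal to 1 in each subdistrict. That reading is an assumption, not something confirmed in the thread. Under it, the command in #4 would not give the intended result, because -by SubDistrictE group1- counts the observations in each subdistrict-by-dummy cell, so villages with group1 == 0 receive the count of zeros rather than the count of ones.
          Code:
          * ASSUMPTION: one observation per village; group1-group3 are 0/1 dummies
          * total() sums the dummy within each subdistrict, so every observation
          * in a subdistrict receives the same count of 1s
          foreach v of varlist group1 group2 group3 {
              bysort SubDistrictE: egen n`v' = total(`v')
          }
          This would create ngroup1, ngroup2 and ngroup3. If the data are structured differently, for example with repeated rows per village, the counts would need a different approach.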
