Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Number of members in a particular age group when observations are repeated

    Hello,

    I want to create 4 variables indicating whether girls and boys in the age group 0-5 and 6-14 are present in the household and then another 4 for how many are there. I'm using the below codes:

    Code:
    by common_ID, sort: egen girls_0_5 = max((relationship==5|relationship==6) & gender==2 & age>=0 & age<=5)
    by common_ID, sort: egen boys_0_5 = max((relationship==5|relationship==6) & gender==1 & age>=0 & age<=5)
    by common_ID, sort: egen girls_6_14 = max((relationship==5|relationship==6) & gender==2 & age>=6 & age<=14)
    by common_ID, sort: egen boys_6_14 = max((relationship==5|relationship==6) & gender==1 & age>=6 & age<=14)
    I'm not able to get the correct codes for constructing the rest four variables indicating the numbers in each group. Please help.

    An example dataset:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str32 common_ID byte(relationship gender) int age str35 ind_id
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 1 1 67 "TUS10001106201913310301382332001001"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 2 2 61 "TUS10001106201913310301382332001002"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 3 1 33 "TUS10001106201913310301382332001003"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 4 2 26 "TUS10001106201913310301382332001004"
    "TUS10001106201913310301382332001" 6 1  5 "TUS10001106201913310301382332001005"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    "TUS10001106201913310301382332004" 1 2 52 "TUS10001106201913310301382332004001"
    end
    Thanks


  • #2
    What relationship means and why it bears on your calculation are both unclear to me.

    If gender and age are the qualifiers, it should be sufficient as well as necessary to know both.

    This may help:
    Code:
    by common_ID, sort: egen girls_0_5 = total(gender==2 & inrange(age, 0, 5))
    Here using inrange() is just a slightly neater way of avoiding expressions such as age >= 0 & age <= 5 Otherwise max() just records whether any such girls are present. It doesn't count how many there are. For that, score 1 if criteria are satisfied and 0 otherwise; then the total score is the count needed.
    Last edited by Nick Cox; 18 Jan 2024, 06:21.

    Comment


    • #3
      Hello, Nick. Thank you for pointing out the relationship thing. In the dataset, the observations are repeating for individuals aged 6 or above. Hence, I'm not getting the right count for such cases.

      An example dataset:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str32 common_ID str35 ind_id byte gender int age
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03001" 2 85
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03002" 1 47
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03004" 1 10
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03005" 2  7
      "TUS10002106201911420102221MAN03" "TUS10002106201911420102221MAN 03003" 2 42
      end
      Thanks

      Comment


      • #4
        Indeed. My code does not solve any problem of duplicate observations, which you need to address otherwise, say with the duplicates command.

        Comment


        • #5
          Thanks, Nick.

          Comment


          • #6
            Hello, Nick. I tried it using the below codes, but the number just displays in the cell where new_ind_id==1. I want it to be displayed across all the observations for a household.

            Code:
            sort ind_id
            by ind_id: gen new_ind_id = 1 if _n==1
            by common_ID, sort: egen no_girls_0_5 = total(gender==2 & inrange(age, 0, 5)) if new_ind_id==1

            Comment


            • #7
              Your code selects precisely one observation for each distinct ind_id. So any counts within that framework will return either 1 or 0.

              I guess what you want is more like

              .
              Code:
               egen long tag = tag(common_ID ind_id) 
              
              . by common_ID, sort: egen girls_0_5 = total(tag & gender==2 & inrange(age, 0, 5))
              
              . tabdisp common_ID, c(girls_0_5)
              
              --------------------------------------------
                                    common_ID |  girls_0_5
              --------------------------------+-----------
              TUS10002106201911420102221MAN03 |          0
              --------------------------------------------
              The result is 0, but that appears right. There are 5 distinct observations out of 100 in your data example and none is a girl between 0 and 5.

              As implied,

              Code:
              help duplicates

              Comment


              • #8
                Thank you, Nick. It helps.

                Comment


                • #9
                  Sorry, the tag variable doesn't have to be long; it could be byte. That doesn't undermine the rest of the suggestion.

                  Comment

                  Working...
                  X