  • Long data format: How to count individuals and not observations?

    Dear all

    I have a problem with my data set that I hope you can help me solve.

    My dataset is in long format and includes 24,752 unique individuals, 314,128 observations, and 25 variables. I work in Stata version 15.

    The purpose of the dataset is to validate a screening tool, hence the long format. The tool includes 8 items (adbb_item) that each receive a score from 0 to 4 (item_score_adbb), giving a total score from 0 to 32 (totalscore_adbb). A child can have been screened more than once.

    My problem is that my tabulations count observations and not individuals, and I cannot figure out how to change that.

    How the data are displayed now when I tabulate:
    Child sex      Freq.   Percent     Cum.
    Boy          160,200     51.00    51.00
    Girl         153,928     49.00   100.00
    Total        314,128
    How I would like the data to be displayed when I tabulate:
    Child sex      Freq.   Percent     Cum.
    Boy           12,628     51.02    51.02
    Girl          12,124     48.98   100.00
    Total         24,752
    I have tried:

    egen tag = tag(id)

    tab var if tag

    but that does not work for variables with missing values, or for the variables that describe the screening tool.

    This is how my data look in Stata. I have not included all 25 variables in the example below.

    Code:
    input float id byte(item_score_adbb totalscore_adbb) long adbb_item float age_days long koen_b float f_vaegt_b
    1 0 0 1 251 1  2.98
    1 0 0 2 251 1  2.98
    1 0 0 3 251 1  2.98
    1 0 0 4 251 1  2.98
    1 0 0 5 251 1  2.98
    1 0 0 6 251 1  2.98
    1 0 0 7 251 1  2.98
    1 0 0 8 251 1  2.98
    2 0 0 1 267 0  4.57
    2 0 0 2 267 0  4.57
    2 0 0 3 267 0  4.57
    2 0 0 4 267 0  4.57
    2 0 0 5 267 0  4.57
    2 0 0 6 267 0  4.57
    2 0 0 7 267 0  4.57
    2 0 0 8 267 0  4.57
    3 0 2 1 246 1 2.666
    3 0 2 2 246 1 2.666
    3 1 2 3 246 1 2.666
    3 0 2 4 246 1 2.666
    3 1 2 5 246 1 2.666
    3 0 2 6 246 1 2.666
    3 0 2 7 246 1 2.666
    3 0 2 8 246 1 2.666
    4 0 0 1  60 0 4.084
    4 0 0 2  60 0 4.084
    4 0 0 3  60 0 4.084
    4 0 0 4  60 0 4.084
    4 0 0 5  60 0 4.084
    4 0 0 6  60 0 4.084
    4 0 0 7  60 0 4.084
    4 0 0 8  60 0 4.084
    5 2 5 1  61 1 2.828
    5 0 0 1 252 1 2.828
    5 1 3 1 122 1 2.828
    5 0 3 2 122 1 2.828
    5 0 5 2  61 1 2.828
    5 0 0 2 252 1 2.828
    5 0 5 3  61 1 2.828
    5 0 0 3 252 1 2.828
    5 0 3 3 122 1 2.828
    5 0 0 4 252 1 2.828
    5 0 5 4  61 1 2.828
    5 0 3 4 122 1 2.828
    5 1 5 5  61 1 2.828
    5 0 0 5 252 1 2.828
    5 1 3 5 122 1 2.828
    5 0 0 6 252 1 2.828
    5 0 5 6  61 1 2.828
    5 0 3 6 122 1 2.828
    5 1 5 7  61 1 2.828
    5 0 0 7 252 1 2.828
    5 1 3 7 122 1 2.828
    5 1 5 8  61 1 2.828
    5 0 0 8 252 1 2.828
    5 0 3 8 122 1 2.828
    6 0 0 1 271 1 3.946
    6 0 0 2 271 1 3.946
    6 0 0 3 271 1 3.946
    6 0 0 4 271 1 3.946
    6 0 0 5 271 1 3.946
    6 0 0 6 271 1 3.946
    6 0 0 7 271 1 3.946
    6 0 0 8 271 1 3.946
    7 0 0 1  60 1   3.2
    7 0 3 1 131 1   3.2
    7 0 4 1 279 1   3.2
    7 0 0 2  60 1   3.2
    7 0 4 2 279 1   3.2
    7 0 3 2 131 1   3.2
    7 0 0 3  60 1   3.2
    7 1 3 3 131 1   3.2
    7 1 4 3 279 1   3.2
    7 0 3 4 131 1   3.2
    7 0 4 4 279 1   3.2
    7 0 0 4  60 1   3.2
    7 2 3 5 131 1   3.2
    7 1 4 5 279 1   3.2
    7 0 0 5  60 1   3.2
    7 0 0 6  60 1   3.2
    7 0 4 6 279 1   3.2
    7 0 3 6 131 1   3.2
    7 0 0 7  60 1   3.2
    7 0 3 7 131 1   3.2
    7 1 4 7 279 1   3.2
    7 1 4 8 279 1   3.2
    7 0 0 8  60 1   3.2
    7 0 3 8 131 1   3.2
    8 0 1 1  60 0   3.3
    8 0 0 1 250 0   3.3
    8 0 1 2  60 0   3.3
    8 0 0 2 250 0   3.3
    8 0 1 3  60 0   3.3
    8 0 0 3 250 0   3.3
    8 0 1 4  60 0   3.3
    8 0 0 4 250 0   3.3
    8 0 0 5 250 0   3.3
    8 1 1 5  60 0   3.3
    8 0 0 6 250 0   3.3
    8 0 1 6  60 0   3.3
    8 0 1 7  60 0   3.3
    8 0 0 7 250 0   3.3
    8 0 0 8 250 0   3.3
    8 0 1 8  60 0   3.3
    9 0 1 1 298 1     .
    9 0 0 1 123 1     .
    9 0 0 2 123 1     .
    9 0 1 2 298 1     .
    9 0 0 3 123 1     .
    9 0 1 3 298 1     .
    9 0 0 4 123 1     .
    9 0 1 4 298 1     .
    9 1 1 5 298 1     .
    9 0 0 5 123 1     .
    9 0 1 6 298 1     .
    9 0 0 6 123 1     .
    9 0 1 7 298 1     .
    9 0 0 7 123 1     .
    9 0 1 8 298 1     .
    9 0 0 8 123 1     .
    end
    label values adbb_item adbb_itemlbl
    label def adbb_itemlbl 1 "Ansigtsudtryk", modify
    label def adbb_itemlbl 2 "Øjenkontakt", modify
    label def adbb_itemlbl 3 "Generelt aktivitetsniveau", modify
    label def adbb_itemlbl 4 "Selvstimulerende adfærd", modify
    label def adbb_itemlbl 5 "Vokalisering", modify
    label def adbb_itemlbl 6 "Reaktionstid i forhold til stimulation", modify
    label def adbb_itemlbl 7 "Relation", modify
    label def adbb_itemlbl 8 "Opmærksomhedsinitiering og -fastholdelse", modify
    label values koen_b sex_blbl
    label def sex_blbl 0 "Dreng", modify
    label def sex_blbl 1 "Pige", modify
    I hope you can help me solve my problem, and sorry for any mistakes I may have made in the way I have posted my question.

    Thanks in advance

    Kind regards
    Maria

  • #2
    See

    Code:
    help egen
    for its tag() function, which is intended for this purpose.
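
    A minimal sketch of that approach, using a few rows copied from the example data in #1. It assumes id is never missing; the name tag_id is only illustrative. The missing option of tabulate shows missing values as a category of their own, so children with missing values still appear in the table.

```stata
clear
input float id byte(item_score_adbb totalscore_adbb) long adbb_item float age_days long koen_b float f_vaegt_b
1 0 0 1 251 1 2.98
1 0 0 2 251 1 2.98
2 0 0 1 267 0 4.57
2 0 0 2 267 0 4.57
9 0 1 1 298 1 .
9 0 0 1 123 1 .
end

* tag exactly one observation per child (assumes id is never missing)
egen byte tag_id = tag(id)

* number of children rather than number of observations
count if tag_id

* child-level frequencies; missing values get a row of their own
tabulate koen_b if tag_id, missing

* child-level summary of a continuous variable (missing values are skipped)
summarize f_vaegt_b if tag_id
```

    This only gives a sensible answer for variables that are constant within a child, such as sex or birth weight.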


    • #3
      For respondent's sex that is fine, and your method should work. I suspect your real problem is conceptual rather than a Stata problem: How would you create that table if the values change within a person over time?
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------
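
      For variables that do change within a child, such as totalscore_adbb, one possible answer is to decide on the unit of counting first. The sketch below shows two options on rows copied from #1: counting each screening once, or counting each child once using the earliest screening. It assumes that id and age_days together identify a screening (that is, no child was screened twice on the same day); tag_screen, first_screen and n_screen are illustrative names.

```stata
clear
input float id byte(item_score_adbb totalscore_adbb) long adbb_item float age_days
5 2 5 1  61
5 0 5 2  61
5 1 3 1 122
5 0 3 2 122
5 0 0 1 252
5 0 0 2 252
8 0 1 1  60
8 0 1 2  60
8 0 0 1 250
8 0 0 2 250
end

* option 1: one tagged row per screening (child x age at screening)
egen byte tag_screen = tag(id age_days)
tabulate totalscore_adbb if tag_screen

* option 2: one row per child, taken from the earliest screening
bysort id (age_days adbb_item): generate byte first_screen = _n == 1
tabulate totalscore_adbb if first_screen

* number of screenings per child, tabulated once per child
egen n_screen = total(tag_screen), by(id)
tabulate n_screen if first_screen
```

      Which option is appropriate depends on the research question, not on Stata.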
