

  • something

    I am an R user trying (and failing) to do something simple in Stata.

    I have a long-form toy data set (see below) to which I want to add a numeric variable identifying each unique combination of `row` and `OPCSDATE_`. However, I would like the numbering to restart from 1 within each unique `row` value.

    * Example generated by -dataex-. For more info, type help dataex
    input float row str2 n str9 OPCS_ int OPCSDATE_
     1 "01" "fix1"      23376
     1 "02" "left"      23376
     1 "03" "revise"    23376
     1 "04" "right"     23376
     2 "01" "fix1"      23385
     2 "02" "right"     23385
     3 "01" "cut"       23376
     3 "02" "catheter"  23376
     4 "01" "rev1"      23378
     4 "02" "fix1"      23378
     4 "03" "left"      23378
     4 "04" "fix"       23379
     4 "05" "right"     23379
     5 "01" "abx1"      23379
     6 "01" "decomp1"   23380
     6 "02" "left"      23380
     7 "01" "abx1"      23381
     8 "01" "fix1"      23382
     8 "02" "rev1"      23382
     8 "03" "right"     23383
    10 "01" "med1"      23384
    10 "02" "hemi"      23384
    10 "03" "cut"       23379
    10 "04" "bilateral" 23379
    11 "01" "cast1"     23385
    11 "02" "left"      23385
    end
    format %tdnn/dd/CCYY OPCSDATE_

    I have used

    egen float OPCS_dateGrp = group(row OPCSDATE_)
    to generate a new variable, but the numeric count does not reset to 1 for each new row value.

    This is probably quite basic but I have had no luck searching for a solution and would appreciate any help.
    Last edited by Josh Lamb; 13 Jun 2024, 23:50.

  • #2
    Solution found: Workaround for Lack of Egen Group, by() - Statalist
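
    For readers who cannot follow the link, here is a minimal sketch of one common way to get a within-`row` group number. It is not necessarily the approach in the linked thread. It assumes that dates should be numbered in chronological order within each `row` (not in order of appearance), and the helper variable `first` is a hypothetical name used only for illustration.

    ```stata
    * Flag the first observation of each distinct date within each row
    bysort row OPCSDATE_: generate byte first = (_n == 1)

    * Running sum of the flags within row: 1 for the earliest date, 2 for the next, ...
    by row: generate OPCS_dateGrp = sum(first)
    drop first

    * Optional: restore the original ordering within row
    sort row n
    ```

    Unlike `egen ... group()`, which numbers the combinations across the whole data set, the running sum restarts whenever `row` changes. If the numbering should instead follow the order in which dates first appear (for example, `row` 10 lists a later date before an earlier one), a different sort key would be needed.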

