
  • sttocc matching for both analysis time and calendar time - nested case control study

    Hi folks,

    Apologies in advance for the rookie question!
    I am designing a nested case-control study using electronic health data and would like some help with matching using the stset and sttocc commands.
    The cohort is dynamic, with individuals entering and exiting the study on different dates (so they have different start dates).
    My failure event is the development of a disease (inflammatory bowel disease).

    I have worked out how to match for "calendar time" and separately for "analysis time" using stset and sttocc.

    For calendar time:
    stset end, failure(event) enter(start) origin(time mdy(1,1,2000)) id(patid) scale(365.25) // origin is a fixed calendar date: 1 Jan 2000 in my study
    set seed 9123456
    sttocc, n(6) nodots

    For analysis time:
    stset end, failure(event) enter(start) origin(start) id(patid) scale(365.25)
    set seed 9123456
    sttocc, n(6) nodots

    However, I would like my controls to be matched not only on analysis time but also on calendar time (i.e. each control has the same (or greater) follow-up as their matched case over the same calendar period). Is there a way to do this?

    Many thanks,

    Tommy.

  • #2
    In the many threads that have appeared on Statalist about case-control matching, I don't recall people using -stset- or -sttocc-, although those might work. For you to get better advice here, I think answers to the following questions would be helpful:

    1) How many cases and potential controls do you have, and how many controls do you want to match to each case? (This will affect the computational feasibility of some solutions.)

    2) Exactly what kind of sampling of controls do you want:
    a) Do you want to sample controls from persons at risk at the time a case became a case (incidence density sampling)?
    b) Do you want to sample controls with or without replacement?

    3) Can you explain your matching on "analysis time" a bit more? In particular, what do you mean by "...over the same calendar period?" To my way of thinking, presuming incidence density sampling, a case and control might only share a "calendar period" at one point in time. I wonder if you have in mind some other kind of risk set matching than I might be thinking of.

    My apologies if my questions are off-base--I know enough here to not be dangerous, but not a lot more <grin>.

    And, a search on /site:statalist.org match case control/ might reveal some things that would help you.


    • #3
      Wouldn't simply matching on calendar year of entry, after you have stset on analysis time, be sufficient? You could add a range (+/- 1 year) if you like. That is, use sttocc after stset to create a pool of controls for each case, and then select from that pool the controls with the same calendar year.

      stset end, failure(event) enter(start) origin(start) id(patid) scale(365.25)
      set seed 9123456
      gen calyr = year(start) // calendar year of entry; assumes start is a Stata daily date

      sttocc, n(9999) nodots // creates pool of eligible controls (I put a large number here; you want all eligible controls)
      bysort _set (_case): gen calyrdiff = abs(calyr[_N] - calyr) // difference from the case's year; note the case is the last observation
      bysort _set (_case): gen eligible = 1 if inrange(calyrdiff, 0, 1) // + or - 1 year difference
      keep if _case==1 | eligible==1

      // now select 6 controls (and keep the case) from the pool of eligible controls:
      gen u = runiform()
      replace u = 0 if _case==1
      bysort _set (u): keep if _n<=7 // the case has u = 0, so it sorts first and is always kept
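
      One way to check the selection is to count how many controls remain in each matched set. This is only a minimal sketch: it assumes the _set and _case variables created by sttocc are still in memory, and ncontrols is a name made up here for illustration.

      ```stata
      * count the controls retained in each matched set (ncontrols is a hypothetical name)
      bysort _set: egen ncontrols = total(_case == 0)
      * one row per matched set; sets with fewer than 6 controls had a thin calendar-year pool
      tabulate ncontrols if _case == 1
      ```

      If many sets come out short of 6 controls, widening the calendar-year range is one option.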


      • #4
        Thanks Raoul Reulen

        I have done a similar thing but the other way around. I generated a cohort entry year variable like this:

        gen entry_year = (start/365.25) + 1960
        egen year = cut(entry_year), at (2000 (1) 2019)

        then matched on year:
        sttocc, match(year) n(6) nodots

        Seemed to work fine!
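
        A quick way to confirm that the matching behaved as intended, followed by the usual analysis of matched sets, might look like the sketch below. It assumes the year variable created above is still in the sampled data, and exposure is a hypothetical covariate that does not appear in this thread.

        ```stata
        * every member of a matched set should share the same entry year
        bysort _set (year): assert year[1] == year[_N]
        * conditional logistic regression on the matched sets (exposure is hypothetical)
        clogit _case exposure, group(_set) or
        ```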
