Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • "missed doctor appointment" - count distinct IDs if variable 1 is equal to something, but the following line does not exist

    Hi,

    I am looking for some help as I am not an expert. I have a patient db of 100000 observations, each one representing a Doctors visit. There are different kind of Doctors in the list (GP, cardiologist, urologist etc), which refer the patients to each other. Each patients has his/her own ID and usually has more than one consultation.

    I am very lost on trying to figure out how to calculate how many distinct patients a Doctor refers to a colleague but the patient does not show up (= the patient is referred, but the specialist never sees the patient, so there is no line for that ID regarding the specialist). I would love it if I could sort this by year and see how many GP referrals actually worked efficiently

    For example:
    patient ID Date (year) Doctor type Doctor referrel
    1 2023 GP cardiologist
    1 2023 cardiologist
    1 2024 GP
    2 2023 GP urologist
    3 2021 GP urologist
    3 2022 GP urologist
    3 2022 urologist
    In this table, I can see that patient 1 was correctly referred to the cardiologist (= the patient has seen the cardiologist). Patient 2 was referred but did not show up at the appointment with the urologist. Patient 3 did not show up with the urologist in 2021 but did show up in 2022.

    So this should give me something like this :
    2021
    patients referred from GP to specialist: 1
    patients that showed up with the specialist: 0

    2022
    patients referred from GP to specialist: 1
    patients that showed up with the specialist: 1

    2023
    patients referred from GP to specialist: 2
    patients that showed up with the specialist: 1

    2024
    patients referred from GP to specialist: 0

    Is there any way this can be done in a simple manner ?

    Thanks, any help is appreciated...

    C

  • #2
    There may be a simpler way, but this is the simplest approach that occurs to me.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear*
    input byte patient_id int year str13 visited str12 referred_to
    1 2023 "gp"           "cardiologist"
    1 2023 "cardiologist" ""            
    1 2024 "gp"           ""            
    2 2023 "gp"           "urologist"   
    3 2021 "gp"           "urologist"   
    3 2022 "gp"           "urologist"   
    3 2022 "urologist"    ""            
    end
    
    frame put patient_id year referred_to if !missing(referred_to), into(referrals)
    frame change referrals
    frlink 1:1 patient_id year referred_to, frame(default patient_id year visited)
    gen byte completed_referral = !missing(default)
    
    by year completed_referral (patient_id), sort: gen patients ///
        = sum(patient_id != patient_id[_n-1])
    by year completed_referral (patient_id): keep if _n == _N
    by year (completed_referral): egen n_referrals = total(patients)
    by year (completed_referral): egen n_completed_referrals ///
        = total(cond(completed_referral, patients, .))
    by year (completed_referral): keep if _n == _N
    keep year n_*
    
    frame change default
    keep year
    duplicates drop
    frlink 1:1 year, frame(referrals)
    frget _all, from(referrals)
    foreach v of varlist n_* {
        replace `v' = 0 if missing(`v')
    }
    drop referrals
    isid year, sort
    Note: I have renamed your variables so that they remind me of what they represent. You can go back over this code and replace all of these with the names you originally chose if that is more convenient for you. Also, for this code to work correctly, the values of the variables visited (your "Doctor type") and referred_to (your Doctor referrel [sic]) must be carefully typed to assure correct matching. In particular, "gp " will not match with "gp", nor with "gp ", nor with "GP". Make sure that you are very consistent about the use of leading, trailing or embedded blank spaces and upper/lower case with those two variables. The Stata functions -trim()-, -itrim()-, and -lower()- or -upper()- may be helpful in this regard.

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.




    Comment


    • #3
      What if the same patient is referred to two or more specialists in the same year? Do you double count?

      Comment


      • #4
        Andrew Musau raises a good question. I interpreted the problem as not double counting, based on "count distinct IDs" in the title--and that is how the code in #2 handles it. But the text of the post as a whole also says "I would love it if I could sort this by year and see how many GP referrals actually worked efficiently," which implies that referrals should double count patients.

        Comment


        • #5
          Originally posted by Andrew Musau View Post
          What if the same patient is referred to two or more specialists in the same year? Do you double count?
          This is a good point but I think right now even if I do not double count it's fine as it will allready give me an idea of what is going on ! We have lots of patients that are lost and no idea when this happens, so it's mainly to get an itial idea but I can then think of something if there are more specialists !

          Comment


          • #6
            Originally posted by Clyde Schechter View Post
            There may be a simpler way, but this is the simplest approach that occurs to me.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear*
            input byte patient_id int year str13 visited str12 referred_to
            1 2023 "gp" "cardiologist"
            1 2023 "cardiologist" ""
            1 2024 "gp" ""
            2 2023 "gp" "urologist"
            3 2021 "gp" "urologist"
            3 2022 "gp" "urologist"
            3 2022 "urologist" ""
            end
            
            frame put patient_id year referred_to if !missing(referred_to), into(referrals)
            frame change referrals
            frlink 1:1 patient_id year referred_to, frame(default patient_id year visited)
            gen byte completed_referral = !missing(default)
            
            by year completed_referral (patient_id), sort: gen patients ///
            = sum(patient_id != patient_id[_n-1])
            by year completed_referral (patient_id): keep if _n == _N
            by year (completed_referral): egen n_referrals = total(patients)
            by year (completed_referral): egen n_completed_referrals ///
            = total(cond(completed_referral, patients, .))
            by year (completed_referral): keep if _n == _N
            keep year n_*
            
            frame change default
            keep year
            duplicates drop
            frlink 1:1 year, frame(referrals)
            frget _all, from(referrals)
            foreach v of varlist n_* {
            replace `v' = 0 if missing(`v')
            }
            drop referrals
            isid year, sort
            Note: I have renamed your variables so that they remind me of what they represent. You can go back over this code and replace all of these with the names you originally chose if that is more convenient for you. Also, for this code to work correctly, the values of the variables visited (your "Doctor type") and referred_to (your Doctor referrel [sic]) must be carefully typed to assure correct matching. In particular, "gp " will not match with "gp", nor with "gp ", nor with "GP". Make sure that you are very consistent about the use of leading, trailing or embedded blank spaces and upper/lower case with those two variables. The Stata functions -trim()-, -itrim()-, and -lower()- or -upper()- may be helpful in this regard.

            In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



            Thank you very much !

            I tried testing the code but actually got stuck at the very beginning as it does not seem to recognise any frame command. I am not sure if this is my STATA version (15) and cannot seem to find anything to download it.... you can tell I'm really a beginner !
            Last edited by Chiara De Pascalis; 08 Jun 2024, 16:19.

            Comment


            • #7
              Yes, Stata introduced frames in version 16. Please read the Forum FAQ for guidance about effective posting in this Forum. You will see there that it says that if you are not using the current version of Stata (which, as of this writing, is 18) you should specify what version you are using.

              So that code will not run in version 15, and there is nothing you can do by way of updating your Stata for that--you would have to upgrade to version 16 or later.

              Here's a different version of the code that does not require frames:
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear*
              input byte patient_id int year str13 visited str12 referred_to
              1 2023 "gp"           "cardiologist"
              1 2023 "cardiologist" ""            
              1 2024 "gp"           ""            
              2 2023 "gp"           "urologist"   
              3 2021 "gp"           "urologist"   
              3 2022 "gp"           "urologist"   
              3 2022 "urologist"    ""            
              end
              
              tempfile original
              save `original'
              
              keep patient_id year referred_to
              keep if !missing(referred_to)
              rename referred_to visited
              merge 1:1 patient_id year visited using `original', keepusing(visited) keep(match master)
              gen byte completed_referral = _merge == 3
              drop _merge
              
              by year completed_referral (patient_id), sort: gen patients ///
                  = sum(patient_id != patient_id[_n-1])
              by year completed_referral (patient_id): keep if _n == _N
              by year (completed_referral): egen n_referrals = total(patients)
              by year (completed_referral): egen n_completed_referrals ///
                  = total(cond(completed_referral, patients, .))
              by year (completed_referral): keep if _n == _N
              keep year n_*
              tempfile results
              save `results'
              
              use `original', clear
              keep year
              duplicates drop
              merge 1:1 year using `results'
              mvencode n_*, mv(0) override
              isid year, sort

              Comment


              • #8
                Chiara:
                as an aside to other excellent advice, I would delve into the missingness mechanism (that is, is it ignorable or not?).
                Is it reasonable to guess that patients simply forgot to show up?
                Or, if your data refer to the Italian setting, did the patient, annoyed by the impicit rationing applied by the Italian National Health Service via endless waiting lists, decide to go privately out-of-pocket, as it too frequently happens in Italy as far as outpatient visits and tests are concerned?
                Kind regards,
                Carlo
                (StataNow 18.5)

                Comment

                Working...
                X