
  • Help with finding max value across variables

    Hi everyone, I am looking for some help comparing variable values across DONOR_ID.

    I have 9 variables (turndownlogistics, turndowndonororgan, turndowndonorproblems, turndownmismatch, turndownmisc, turndownrecipientproblems, turndowndisease, turndownunknown, turndownpreservation).
    Each variable holds the number of times an organ was turned down for that reason.

    For each DONOR_ID, I would like to know the most common reason for discard.

    I was trying this plan:
    1. bysort DONOR_ID: egen mostcommon = max(turndownlogistics, turndowndonororgan, turndowndonorproblems, turndownmismatch, turndownmisc, turndownrecipientproblems, turndowndisease, turndownunknown, turndownpreservation)
    2. gen mostcommon
    3. replace mostcommondiscard ="logistics" if mostcommon==turndownlogistics
    replace mostcommondiscard ="donororgan" if mostcommon==turndowndonororgan
    etc.

    I keep getting an error that says "invalid name" when I enter the code in step 1.

    I was also wondering if anybody has any idea how to deal with ties in this scenario?

    Thank you so much

  • #2
    Code:
    bysort DONOR_ID: egen mostcommon = rowmax(turndownlogistics turndowndonororgan turndowndonorproblems turndownmismatch turndownmisc ///
        turndownrecipientproblems turndowndisease turndownunknown turndownpreservation) // N.B. No commas!
    Also, if the variables mentioned here are the only variables in your dataset whose names begin with turndown, you can simplify this to
    Code:
    bysort DONOR_ID: egen mostcommon = rowmax(turndown*)
    On further thought, that will give you the number of times that the most frequent reason for turndown occurred, but will not tell you which reason it was. To do that, a different approach is needed. Here I will assume that these variables are the only ones whose names begin with turndown and that there is at most one observation for any DONOR_ID:

    Code:
    reshape long turndown, i(DONOR_ID) j(reason) string
    drop if missing(turndown)
    sort DONOR_ID turndown
    replace reason = reason + " "
    
    gen commonest_reasons = ""
    by DONOR_ID: replace commonest_reasons = commonest_reasons[_n-1] ///
        + reason if turndown == turndown[_N]
    by DONOR_ID: replace commonest_reasons = commonest_reasons[_N]
    
    replace reason = trim(reason)
    replace commonest_reasons = trim(commonest_reasons)
    
    reshape wide
    Notes:
    1. As no example data was provided, this code is untested. Beware of typos or other errors.
    2. You do not say what you want to do if two or more reasons for being turned down are tied for most frequent. This code provides a variable naming all of the tied reasons.
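
    If a flag for ties is wanted, one possibility is to count the words in commonest_reasons. The following is only a minimal sketch: the toy data and the variable names n_tied and has_tie are assumptions made for illustration, not part of the original code. It relies on the reason names containing no spaces, which holds here because they are variable-name suffixes.

    ```stata
    * Toy data (hypothetical values) standing in for the result of the code above
    clear
    input int DONOR_ID str40 commonest_reasons
    1 "mismatch unknown"
    2 "preservation"
    end

    * Count the reasons listed; more than one means a tie for most frequent
    gen byte n_tied = wordcount(commonest_reasons)
    gen byte has_tie = n_tied > 1
    list, noobs
    ```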
    Last edited by Clyde Schechter; 20 Jun 2024, 14:26. Reason: +



    • #3
      Hi, thank you so much for your help!

      Stata does not seem to like this combination:

      bysort DONOR_ID: egen tester6 = rowmax(turndown*)
      egen ... rowmax() may not be combined with by
      r(190);



      • #4
        Sorry, you are right. Remove -bysort DONOR_ID- from that command.

        I think when you wrote that, you may not have seen my edit to my response in #2. If you are going to use the code I provided after "On further thought,..." do not remove the -by DONOR_ID:- prefixes there.



        • #5
          Note that lack of support for by: is not a problem here.

          The maximum over variables within each observation is independent of which group the observation is in on some other variable.

          No one is missing any functionality -- unless what you want is the maximum done two ways, over variables and then over observations in a group, which is two egen function calls with official Stata, or one egen function call if you write such a function to do it.
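
          As an illustration of the two-call route with official Stata, here is a minimal sketch. The toy data and the names row_max and group_max are assumptions made for the example; here DONOR_ID is deliberately repeated so that the group step does something.

          ```stata
          * Toy data (hypothetical values): donor 1 has two observations
          clear
          input int DONOR_ID turndownlogistics turndownmismatch
          1 1 5
          1 3 2
          2 4 2
          end

          * Step 1: maximum over variables within each observation (no by: needed)
          egen row_max = rowmax(turndown*)

          * Step 2: maximum of those row maxima over observations in each group
          bysort DONOR_ID: egen group_max = max(row_max)

          list, noobs sepby(DONOR_ID)
          ```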



          • #6
            When I try to run the code:
            reshape long turndown, i(DONOR_ID) j(reason) string
            drop if missing(turndown)
            sort DONOR_ID turndown
            replace reason = reason + " "

            gen commonest_reasons = ""
            by DONOR_ID: replace commonest_reasons = commonest_reasons[_n-1] ///
            + reason if turndown == turndown[_N]
            by DONOR_ID: replace commonest_reasons = commonest_reasons[_N]

            replace reason = trim(reason)
            replace commonest_reasons = trim(commonest_reasons)

            reshape wide

            I keep getting this error: I/O error writing .dta file
            Usually such I/O errors are caused by the disk or file system being
            full.

            Have you ever encountered this? I'm not sure how to work around it.



            • #7
              If you are running this code on a very large data set, it may be that the intermediate files that the -reshape- command uses are filling up the available disk space. The first thing I would do is to precede the commands with -keep DONOR_ID turndown*-. This will reduce the data set to just those variables that are required for the computation. If you need to have the results in the same data file with the other variables in the original data, you can -merge- the results back with the original data.
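
              A minimal sketch of that keep-then-merge workflow follows. The toy data, the variable donor_age (standing in for the other variables in the original data) and the tempfile name results are assumptions made for illustration. To keep the intermediate data small, this sketch keeps one row per DONOR_ID instead of reshaping back to wide.

              ```stata
              * Toy data (hypothetical values)
              clear
              input int DONOR_ID byte donor_age turndownlogistics turndownmismatch
              1 40 1 5
              2 55 3 3
              end

              preserve
              * Reduce to the variables needed for the computation
              keep DONOR_ID turndown*

              reshape long turndown, i(DONOR_ID) j(reason) string
              drop if missing(turndown)
              sort DONOR_ID turndown
              replace reason = reason + " "

              gen commonest_reasons = ""
              by DONOR_ID: replace commonest_reasons = commonest_reasons[_n-1] ///
                  + reason if turndown == turndown[_N]
              by DONOR_ID: replace commonest_reasons = commonest_reasons[_N]
              replace commonest_reasons = trim(commonest_reasons)

              * One row per donor is enough to carry the result
              keep DONOR_ID commonest_reasons
              bysort DONOR_ID: keep if _n == 1
              tempfile results
              save `results'
              restore

              * Bring the result back alongside the original variables
              merge 1:1 DONOR_ID using `results', nogenerate
              list, noobs
              ```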

              If that isn't sufficient to solve the problem, you can probably succeed by just processing one DONOR_ID at a time and putting those results all together. This is readily automated with the -runby- command, by Robert Picard and me, available from SSC.

              Code:
              /* AS NO EXAMPLE DATA WAS PROVIDED, CREATE A TOY DATA SET
                 TO DEMONSTRATE THE CODE
              */
              
              clear*
              set obs 100
              set seed 1234
              gen int DONOR_ID = _n
              foreach x in turndownlogistics turndowndonororgan turndowndonorproblems turndownmismatch turndownmisc ///
                  turndownrecipientproblems turndowndisease turndownunknown turndownpreservation {
                      gen `x' = runiformint(1,5)
              }
              
              
              //  SOLUTION TO PROBLEM BEGINS HERE
              capture program drop one_donor
              program define one_donor
                  reshape long turndown, i(DONOR_ID) j(reason) string
                  drop if missing(turndown)
                  sort DONOR_ID turndown
                  replace reason = reason + " "
              
                  gen commonest_reasons = ""
                  by DONOR_ID: replace commonest_reasons = commonest_reasons[_n-1] ///
                      + reason if turndown == turndown[_N]
                  by DONOR_ID: replace commonest_reasons = commonest_reasons[_N]
              
                  replace reason = trim(reason)
                  replace commonest_reasons = trim(commonest_reasons)
              
                  reshape wide
                  exit
              end
              
              keep DONOR_ID turndown*
              runby one_donor, by(DONOR_ID) status
              If the original data set is not large, however, then intermediate files filling up the disk will not be the cause, and this code will not help you. In that case, most likely you are either working with a full, or nearly full, file system, or you lack write permissions for it. You cannot solve those problems from within Stata: you will have to clear out disk space by removing unneeded files, or get your system administrator to give you write permissions or allocate more disk space to you.
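
              Although the fix lies outside Stata, it can help to see where Stata is trying to write. The following is only a rough diagnostic sketch, best run in a fresh session because it clears the data in memory; the tempfile name probe is an assumption made for the example.

              ```stata
              * Where Stata puts temporary files, and the current working directory
              display c(tmpdir)
              display c(pwd)

              * Try to write a tiny temporary data set there
              clear
              set obs 1
              gen byte x = 1
              tempfile probe
              capture noisily save `probe'
              display "save return code: " _rc
              ```

              A nonzero return code on a one-observation file would suggest a full file system or a permissions problem in the temporary directory, not a data set that is too large.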



              • #8
                Here is a way to do it without reshaping your data. I am borrowing Clyde's code for creating a toy dataset.

                Note that the code lists multiple reasons (separated by spaces) when there are ties for the highest frequency.

                Code:
                clear*
                set obs 100
                set seed 1234
                gen int DONOR_ID = _n
                foreach x in turndownlogistics turndowndonororgan turndowndonorproblems turndownmismatch turndownmisc ///
                    turndownrecipientproblems turndowndisease turndownunknown turndownpreservation {
                        gen `x' = runiformint(1,5)
                }
                
                * SOLUTION BEGINS HERE
                
                local reasons logistics donororgan donorproblems mismatch misc recipientproblems disease unknown preservation
                local num_reasons: word count `reasons'
                
                gen most_common_reasons = "logistics"
                gen frequency = turndownlogistics
                
                gen byte is_new_max = 0
                gen byte equals_max = 0
                
                forval i = 2/`num_reasons' {
                    local reason: word `i' of `reasons'
                    replace is_new_max = (turndown`reason' > frequency)
                    replace equals_max = (turndown`reason' == frequency)
                    replace most_common_reasons = "`reason'" if is_new_max
                    replace frequency = turndown`reason' if is_new_max
                    
                    replace most_common_reasons = most_common_reasons + " " + "`reason'" if equals_max
                }
                
                drop is_new_max equals_max
                This produces (listing the first 20 observations as an example):

                Code:
                . list in 1/20, noobs sep(0) abbrev(11)
                
                  +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
                  | DONOR_ID   turndownl~s   turndownd~n   turndownd~s   turndownm~h   turndownm~c   turndownr~s   turndownd~e   turndownu~n   turndownp~n                                           most_common_reasons   frequency |
                  |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
                  |        1             1             3             1             5             1             1             2             5             3                                              mismatch unknown           5 |
                  |        2             2             4             1             1             2             2             4             1             5                                                  preservation           5 |
                  |        3             1             3             5             3             1             2             5             2             3                                         donorproblems disease           5 |
                  |        4             1             1             1             1             4             5             4             4             3                                             recipientproblems           5 |
                  |        5             4             1             3             5             1             1             5             5             2                                      mismatch disease unknown           5 |
                  |        6             2             1             3             4             4             4             5             4             4                                                       disease           5 |
                  |        7             1             1             1             4             3             1             5             2             4                                                       disease           5 |
                  |        8             1             1             2             3             5             3             4             5             3                                                  misc unknown           5 |
                  |        9             2             5             1             2             1             2             1             5             2                                            donororgan unknown           5 |
                  |       10             2             4             5             3             5             1             1             3             3                                            donorproblems misc           5 |
                  |       11             3             3             2             5             1             1             2             4             3                                                      mismatch           5 |
                  |       12             1             5             1             1             2             4             2             1             5                                       donororgan preservation           5 |
                  |       13             1             1             1             1             3             5             3             2             4                                             recipientproblems           5 |
                  |       14             2             4             3             5             5             4             1             5             5                            mismatch misc unknown preservation           5 |
                  |       15             1             2             2             4             4             1             2             4             5                                                  preservation           5 |
                  |       16             5             1             2             2             5             3             1             2             1                                                logistics misc           5 |
                  |       17             5             5             3             2             1             5             5             4             5   logistics donororgan recipientproblems disease preservation           5 |
                  |       18             4             2             4             4             3             1             2             1             1                              logistics donorproblems mismatch           4 |
                  |       19             3             4             5             4             1             5             3             3             4                               donorproblems recipientproblems           5 |
                  |       20             5             4             5             2             1             4             2             5             1                               logistics donorproblems unknown           5 |
                  +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
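
                One caveat for real data: Stata treats numeric missing values as larger than any number, so the > comparison in the loop above would pick up a missing count as a new maximum. The toy data has no missing values, so the listing is unaffected. A variant that sidesteps this, sketched here on hypothetical toy data, takes the maximum from rowmax() (which ignores missing values) and then collects every reason equal to it. It is a different technique from the loop above, not a correction of it.

                ```stata
                * Toy data (hypothetical values), with one missing count
                clear
                input int DONOR_ID turndownlogistics turndownmismatch turndownmisc
                1 1 5 5
                2 . 2 1
                end

                * Row maximum, ignoring missing values
                egen frequency = rowmax(turndown*)

                * Append the name of every reason whose count equals the maximum
                gen most_common_reasons = ""
                foreach v of varlist turndown* {
                    local reason = subinstr("`v'", "turndown", "", 1)
                    replace most_common_reasons = trim(most_common_reasons + " `reason'") ///
                        if `v' == frequency & !missing(`v')
                }

                list, noobs
                ```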



                • #9
                  Hi everyone, thank you so much for your help! I was able to use your code to work through my problem!
