Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Roommates' max expenditure with parents' educations

    Dear All, I was asked this question here. The data set is
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(id exp) long roomnumber byte(feduc meduc)
    53 1700 105111 4 5
    43 1800 105111 8 3
    57 1500 105211 5 6
    56 2000 105211 5 6
    60 2100 105211 3 3
    58 1321 105211 4 4
    63 2500 105211 7 7
    59  900 105212 6 5
    62 1200 105212 6 3
    72 1200 105212 5 7
    end
    Firstly, for each room (`roomnumber'), there are alternative numbers of roommates (with different `id'). We want to obtain a new variable for each roommate, say `max_exp', which is the maximum of `exp' in each `roomnumber', excluding himself. I have done this by ( ssc install asrol)
    Code:
    bys roomnumber: asrol exp, stat(max) xf(focal)
    with result as
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(id exp) long roomnumber byte(feduc meduc) double max_exp
    53 1700 105111 4 5 1800
    43 1800 105111 8 3 1700
    57 1500 105211 5 6 2500
    56 2000 105211 5 6 2500
    60 2100 105211 3 3 2500
    58 1321 105211 4 4 2500
    63 2500 105211 7 7 2100
    59  900 105212 6 5 1200
    62 1200 105212 6 3 1200
    72 1200 105212 5 7 1200
    end
    Secondly, I want to generate two additional variable, say `feduc1' and`meduc1', which is the `feduc' and`meduc' from the one with maximum expenditure. Taking the average of `feduc' and`meduc' if there are ties in the maximum expenditures.Any suggestions? Thanks.
    Ho-Chuan (River) Huang
    Stata 17.0, MP(4)

  • #2
    Excellent data example that tests the tricky things like ties. Starting from the results with max_exp included, I think this does what you seek.
    Code:
    bysort roomnumber (max_exp): generate double maxmax = max_exp[_N]
    bysort roomnumber: egen feduc1 = mean(cond(max_exp==maxmax,feduc,.))
    bysort roomnumber: egen meduc1 = mean(cond(max_exp==maxmax,meduc,.))
    list, noobs sepby(roomnumber) abbreviate(12)
    Code:
    . list, noobs sepby(roomnumber) abbreviate(12)
    
      +-------------------------------------------------------------------------------+
      | id    exp   roomnumber   feduc   meduc   max_exp   maxmax     feduc1   meduc1 |
      |-------------------------------------------------------------------------------|
      | 43   1800       105111       8       3      1700     1800          4        5 |
      | 53   1700       105111       4       5      1800     1800          4        5 |
      |-------------------------------------------------------------------------------|
      | 63   2500       105211       7       7      2100     2500       4.25     4.75 |
      | 58   1321       105211       4       4      2500     2500       4.25     4.75 |
      | 56   2000       105211       5       6      2500     2500       4.25     4.75 |
      | 57   1500       105211       5       6      2500     2500       4.25     4.75 |
      | 60   2100       105211       3       3      2500     2500       4.25     4.75 |
      |-------------------------------------------------------------------------------|
      | 72   1200       105212       5       7      1200     1200   5.666667        5 |
      | 59    900       105212       6       5      1200     1200   5.666667        5 |
      | 62   1200       105212       6       3      1200     1200   5.666667        5 |
      +-------------------------------------------------------------------------------+

    Comment


    • #3
      Dear William, Thanks for the reply. This is close but not exactly what I had in mind.
      1. In fact, I have obtained the maximum expenditure in max_exp (excluding himself in the same `roomnumber'. Thus, there is no need for your `maxmax'.
      2. The expected result, say for id = 63, the max_exp=2100. Thus, the `feduc1' should be equal to 3 and `meduc1' is 3 as well.
      Any further comments?
      Ho-Chuan (River) Huang
      Stata 17.0, MP(4)

      Comment


      • #4
        Originally posted by River Huang View Post
        Dear William, Thanks for the reply. This is close but not exactly what I had in mind.
        1. In fact, I have obtained the maximum expenditure in max_exp (excluding himself in the same `roomnumber'. Thus, there is no need for your `maxmax'.
        2. The expected result, say for id = 63, the max_exp=2100. Thus, the `feduc1' should be equal to 3 and `meduc1' is 3 as well.
        Any further comments?
        Dear River, I tried to solve this question with not that elegant codes. Is this what you want?[ I strongly believe there would be a smarter way to deal with this]
        Code:
        clear
        input int(id exp) long roomnumber byte(feduc meduc)
        53 1700 105111 4 5
        80 1700 105111 1 2
        43 1800 105111 8 3
        58 1321 105211 4 4
        57 1500 105211 5 6
        56 2000 105211 5 6
        60 2100 105211 3 3
        63 2500 105211 7 7
        59  900 105212 6 5
        62 1200 105212 6 3
        72 1200 105212 5 7
        end
        bys roomnumber: asrol exp, stat(max) xf(focal)
        bysort roomnumber (max_exp): generate double maxmax = max_exp[_N]
        gen max1=1 if maxmax==exp
        bysort roomnumber (max_exp): generate double max2max = max_exp[1]
        gen max2=1 if max2max==exp
        
        bys roomnumber max1: egen feduc_mean1=mean(feduc)
        replace feduc_mean1=. if missing(max1)
        bys roomnumber: replace feduc_mean1=feduc_mean1[1]
        bys roomnumber max1: egen meduc_mean1=mean(meduc)
        replace meduc_mean1=. if missing(max1)
        bys roomnumber: replace meduc_mean1=meduc_mean1[1]
        
        bys roomnumber max2: egen feduc_mean2=mean(feduc)
        replace feduc_mean2=. if missing(max2)
        bys roomnumber: replace feduc_mean2=feduc_mean2[1]
        bys roomnumber max2: egen meduc_mean2=mean(meduc)
        replace meduc_mean2=. if missing(max2)
        bys roomnumber: replace meduc_mean2=meduc_mean2[1]
        
        gen fedc_final=feduc_mean1
        replace fedc_final=feduc_mean2 if max1==1
        gen medc_final=meduc_mean1
        replace medc_final=meduc_mean2 if max1==1
        drop maxmax-meduc_mean2
        2B or not 2B, that's a question!

        Comment


        • #5
          -rangestat- gives out an 1-line solution.
          Code:
          *rangestat (max) max_exp = exp, interval(id . .) by(roomnumber) excludeself
          rangestat (mean) feduc1 = feduc meduc1 = meduc, interval(exp max_exp max_exp) by(roomnumber)

          Comment


          • #6
            Although I definitely prefer Romalpa's solution, it can also be done in this way.
            Code:
            gen feduc1=.
            levelsof roomnumber, local(levels)
            foreach l of local levels{
                qui forval i = 1/`=_N' {
                    su feduc if exp == max_exp[`i'] & roomnumber==`l'
                    replace feduc1 = r(mean) in `i' if roomnumber==`l'
                    }
            }
            
            gen meduc1=.
            levelsof roomnumber, local(levels)
            foreach l of local levels{
                qui forval i = 1/`=_N' {
                    su meduc if exp == max_exp[`i'] & roomnumber==`l'
                    replace meduc1 = r(mean) in `i' if roomnumber==`l'
                    }
            }
            Last edited by Wouter Wakker; 30 Jun 2019, 07:53.

            Comment


            • #7
              Dear Romalpa, Many thanks for this amazing solution.
              Ho-Chuan (River) Huang
              Stata 17.0, MP(4)

              Comment


              • #8
                Dear Wouter, Thanks for this helpful suggestion.
                Ho-Chuan (River) Huang
                Stata 17.0, MP(4)

                Comment

                Working...
                X