  • Fraction By Group

    Dear Stata Users,
    I have cross-sectional data constructed by combining two survey waves. The educational attainment variable is coded from 1 to 5, where 4 corresponds to completion of secondary schooling and 5 to attainment of higher education. I am trying to find the fraction of people who completed at least secondary schooling (Educ >= 4) for each birth date. Briefly, I have an Educ variable and date-of-birth data, and I want to find the fraction of people with Educ >= 4 at each birth date, where birth dates are coded according to CMC.

  • #2
    No data example here -- and I don't understand what CMC means -- but it seems easy enough to work out the Stata code that what you want implies. The fraction who did or are something is just the mean of an indicator that is 1 for those qualifying and 0 for those not qualifying.

    The main thing to watch out for is missing values, which should be ignored.

    Here is some technique:

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . * caution: this first attempt counts missing rep78 as 0, giving 18/22 for Foreign
    . egen naive = mean(inrange(rep78, 4, 5) & rep78 < .) , by(foreign)
    
    . count if inrange(rep78, 4, 5)  & foreign == 1
      18
    
    . count if rep78 < .  & foreign == 1
      21
    
    . di 18/21
    .85714286
    
    . egen wanted = mean(cond(rep78 < ., inrange(rep78, 4, 5), .)) , by(foreign)
    
    . tabdisp foreign, c(wanted)
    
    ----------------------
     Car type |     wanted
    ----------+-----------
     Domestic |   .2291667
      Foreign |   .8571429
    ----------------------


    • #3
      Dear Nick, Thank you for your response. The CMC code for a month-based birth date is 12*(Year - 1900) + Month. I have thousands of CMC-based birth dates, so in practice it is almost impossible to code each of them as 1 and 0 and follow the technique you kindly set out in your message. I am wondering whether there is a shortcut that does not involve dummies.


      • #4
        Still no data example and you give no code either. But CMC is evidently just Stata monthly date + 721: Stata monthly dates are 0 in January 1960, for which CMC is 12*60 + 1 = 721. I don't see what's "practically impossible" about using something like

        Code:
        egen wanted = mean(cond(Educ < ., Educ >= 4, .)) , by(CMC)
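
        If it helps to see the CMC coding as ordinary Stata dates, here is a minimal sketch. It assumes the formula quoted in #3, CMC = 12*(Year - 1900) + Month; the three CMC values and the names mdate, byear and bmonth are made up for illustration.

        ```stata
        clear
        input CMC
        721
        732
        1023
        end

        * Stata monthly dates are 0 in 1960m1, where CMC = 12*60 + 1 = 721
        gen mdate = CMC - 721
        format mdate %tm

        * recover birth year and birth month from CMC
        gen byear  = 1900 + floor((CMC - 1)/12)
        gen bmonth = CMC - 12*(byear - 1900)

        list
        ```

        Either CMC or mdate can serve as the by() variable, since one is just the other shifted by a constant; mdate has the advantage that axis labels and listings show as year and month.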


        • #5
          How would you plot this? wanted has already been created using the by() variable.

          So, say, for example, I need a twoway graph with the fraction of individuals who completed at least secondary schooling (Educ >= 4) on the vertical axis and CMC, birth year, or birth month, whichever is required, on the horizontal axis.

          Then do we do:

          twoway connected wanted CMC

          Is that right? I'm confused because the wanted variable is already grouped. When plotting, how do we refer to the horizontal axis?
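
          One possible approach, offered as a sketch rather than a definitive answer: a variable created by egen with by() is constant within each group, so every observation in a group would plot the same point. Tagging one observation per group and sorting on the horizontal variable avoids overplotting and lines that zigzag. The toy data below are invented, tag is just an illustrative variable name, and "at least secondary" is taken as Educ of 4 or 5, following the coding described in #1.

          ```stata
          clear
          input CMC Educ
          721 3
          721 5
          722 4
          722 .
          722 2
          723 5
          end

          * fraction with Educ 4 or 5 among nonmissing Educ, by birth date
          egen wanted = mean(cond(Educ < ., inrange(Educ, 4, 5), .)), by(CMC)

          * mark exactly one observation in each CMC group
          egen tag = tag(CMC)

          * one point per birth date, connected in order of CMC
          twoway connected wanted CMC if tag, sort
          ```

          The horizontal axis is simply the second variable named, here CMC. A birth year or monthly date variable would work the same way, provided it is also the variable used in by() and in tag().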
