  • How to generate years of education in this case?

    Dear all,

    I am trying to generate a years-of-education variable. Although I have seen some related topics on this forum, I think my case is quite different, so I am starting a new thread.

    In my data context, primary school takes 5 years, secondary school takes 4 years, high school takes 3 years, college takes 2-3 years (2 years on average), university normally takes 4 years, a master's degree takes 2 years, and a PhD takes 3-4 years (3 years on average). The school starting age is 6, so an individual completes primary school at age 11 (6+5), ignoring grade repetition and late entry, which I assume away here. Similarly, an individual completes secondary school at age 15 (6+5+4), and so on.

    My dataset contains 3 variables related to an individual's education as follows:
    wb3 - Ever attended school
    wb4 - highest level of school attended
    wb5 - highest grade completed at that level

    I would like to provide a data example using dataex, as I usually do when posting; however, since a few people never attended school, I have attached an Excel file of my data to show the whole picture of my data structure.

    The following are simple descriptions of the three variables:
    Code:
    -> tabulation of wb3  
    
           ever |
       attended |
         school |      Freq.     Percent        Cum.
    ------------+-----------------------------------
            yes |      9,254       94.17       94.17
             no |        573        5.83      100.00
    ------------+-----------------------------------
          Total |      9,827      100.00
    
    -> tabulation of wb4  
    
       highest level of school |
                      attended |      Freq.     Percent        Cum.
    ---------------------------+-----------------------------------
                     preschool |          8        0.09        0.09
                       primary |      1,459       15.77       15.85
               lower secondary |      3,370       36.42       52.27
               upper secondary |      2,412       26.06       78.33
           professional school |        574        6.20       84.54
    college/university & above |      1,431       15.46      100.00
    ---------------------------+-----------------------------------
                         Total |      9,254      100.00
    
    -> tabulation of wb5  
    
         highest |
           grade |
    completed at |
      that level |      Freq.     Percent        Cum.
    -------------+-----------------------------------
               0 |        462        6.38        6.38
               1 |         76        1.05        7.43
               2 |        198        2.73       10.16
               3 |        267        3.69       13.85
               4 |        302        4.17       18.02
               5 |        597        8.24       26.27
               6 |        435        6.01       32.27
               7 |        548        7.57       39.84
               8 |        472        6.52       46.36
               9 |      1,774       24.50       70.86
              10 |        453        6.26       77.12
              11 |        407        5.62       82.74
              12 |      1,249       17.25       99.99
         missing |          1        0.01      100.00
    -------------+-----------------------------------
           Total |      7,241      100.00
    Any help is highly appreciated.

    Thank you.


  • #2
    I would like to provide a data example using dataex, as I usually do when posting; however, since a few people never attended school, I have attached an Excel file of my data to show the whole picture of my data structure.
    I don't know why this would be a reason for not using -dataex-, but even if I could imagine the reasoning, an Excel file is pretty much useless for this purpose. Even if someone is willing to download it and put their computer at risk, it won't contain the metadata needed to write the code. Just seeing how it looks to human eyes is not helpful. Please post back using -dataex-. If the problem is that the first 100 observations (which is what -dataex- outputs by default) are just people who never attended school, you can use -if- and -in- conditions on -dataex- to select a more helpful example than that.
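
    A minimal sketch of what such -if- and -in- conditions might look like. It assumes that wb4 is missing for people who never attended school, which is only inferred from the tabulation totals (the wb4 total of 9,254 equals the number of "yes" answers on wb3). The counts and observation ranges are arbitrary illustrations.

    ```stata
    * First 20 observations in the current sort order
    dataex wb3 wb4 wb5 in 1/20

    * Up to 20 people with a recorded level of schooling
    * (assumption: wb4 is missing for those who never attended)
    dataex wb3 wb4 wb5 if !missing(wb4), count(20)

    * A few people who never attended school, for contrast
    dataex wb3 wb4 wb5 if missing(wb4), count(5)
    ```

    Posting the output of two such calls would show both kinds of cases while keeping the storage types and value labels that an Excel file loses.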

    In addition, you need to give a clearer explanation of how this is supposed to work. How are we supposed to know if a person has undertaken a master's degree or PhD? Nothing in the variables that you have shown tabulations of gives that information. And what is the relevance of the information you provided about typical ages at certain milestones if there is no age variable? Or even if there were an age variable, how would it be relevant?



    • #3
      Originally posted by Clyde Schechter View Post
      In addition, you need to give a clearer explanation of how this is supposed to work. How are we supposed to know if a person has undertaken a master's degree or PhD? Nothing in the variables that you have shown tabulations of gives that information.
      I think that is the primary problem Matthew has: if I interpret his question correctly, he does not have a Stata problem, but does not know exactly how the variables he has relate to the variable he wants. That is a very common puzzle we all have to deal with. Almost always the answer is that we need an imperfect approximation. Which one depends on the finer details of the data, the exact questions asked, the finer details of the educational system, and finally the finer details of how you want to use that variable. This means that we cannot help you with that. Once you have solved that problem, you are 95% done. We can help you with the remaining 5%, if you need us to.

      Originally posted by Clyde Schechter View Post
      And what is the relevance of the information you provided about typical ages at certain milestones if there is no age variable? Or even if there were an age variable, how would it be relevant?
      This is a very common mistake for people who are dealing with education variables from their own country for the first time: they think of level of education as "typical age" instead of "years of education". There are special circumstances when typical age may matter, but I suspect that they don't apply here.

      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------



      • #4
        Dear Prof. Clyde and Dr. Maarten,

        Thank you for your useful comments and questions. I have been thinking about this issue for a while, which is the reason for my delayed response. I think I have figured out how to generate a years-of-schooling variable. For example, individuals who have never attended school will have 0 years of education. Those who have attended primary school (wb4==primary school) will have 6 years of schooling if the highest grade they completed at this level is 6 (wb5==6), or 5 years of schooling if wb5==5. A similar logic applies to the other educational levels.

        @Prof. Clyde: 1) Thanks for your suggestion to use dataex and to avoid attaching an Excel file. I will keep it in mind for future posts. 2) You are right, age seems unrelated in this context; I provided that information only to give a complete picture of the education system in my study context.
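
        The rule described here could be sketched roughly as follows. This is only an illustration under several assumptions: wb3 and wb4 are numeric variables with value labels whose text matches the tabulations, wb5 is the cumulative grade (0-12) and is recorded only for primary through upper secondary (the wb5 total of 7,241 equals 1,459 + 3,370 + 2,412), and the code behind the "missing" label on wb5 lies outside 0-12. The variable names wb3_txt, wb4_txt and years_edu are hypothetical, and the years assigned to the two post-secondary levels are placeholders, not values given in the thread.

        ```stata
        * Work with the label text so the underlying numeric codes need not be known
        decode wb3, generate(wb3_txt)
        decode wb4, generate(wb4_txt)

        generate byte years_edu = .

        * Never attended school, or preschool only: 0 years
        replace years_edu = 0 if wb3_txt == "no"
        replace years_edu = 0 if wb4_txt == "preschool"

        * Primary through upper secondary: use the highest grade completed,
        * skipping the labelled "missing" code (assumed to lie outside 0-12)
        replace years_edu = wb5 ///
            if inlist(wb4_txt, "primary", "lower secondary", "upper secondary") ///
            & inrange(wb5, 0, 12)

        * Post-secondary levels have no grade recorded; these are placeholders only
        local prof_years = 12 + 2
        local univ_years = 12 + 4
        replace years_edu = `prof_years' if wb4_txt == "professional school"
        replace years_edu = `univ_years' if wb4_txt == "college/university & above"

        * Check the mapping, including anyone left without a value
        tabulate wb4 years_edu, missing
        ```

        Because "college/university & above" lumps together everything from a 2-year college to a PhD, any single value assigned to it is a crude approximation, and which value is acceptable depends on how the variable will be used.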



        • #5
          I am also interested in solving the problems. Thank you for your inquiry.



          • #6
            Originally posted by Barc View Post

            I am also interested in solving the problems. Thank you for your inquiry.
            Are you asking a question? If so, can you tell us what you did not understand from the answers?
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------
