
  • Two arguments in the "rank" function of egen

    Hi friends,

    I have data about parents and their children. It is built like this:

    id_parent - parent identifier
    birth_year - the birth year of each of the parent's children
    id_child - child identifier

    So the number of observations with a given value of "id_parent" is the number of children that parent has. Every value of id_child appears twice: once linked to the father's ID and once linked to the mother's ID.

    Now I want to rank the children by their year of birth, so I wrote the following code:
    Code:
    bysort id_parent: egen sibiling_order = rank(birth_year), unique
    The trouble is that with twins, one twin might be ranked 1 for the father and 2 for the mother, and vice versa. That is a problem for me when I later want to keep only couples who had their first child together.
    So I tried the following code:
    Code:
    bysort id_parent: egen sibiling_order = rank(birth_year id_child), unique
    I thought this would solve it, since it is supposed to rank them first by birth year and, in case of ties, by id_child. But I got an error message that says "birth_yearid_child invalid name", which made me think that the rank function may accept only one variable.

    Is that true? And can someone suggest a way to solve my problem?

    Thank you in advance,

    FitzGerald

  • #2

    Would this work?
    Code:
    bysort id_parent (id_child): egen sibiling_order = rank(birth_year), unique


    • #3
      Originally posted by George Ford
      would this work? bysort id_parent (id_child): egen sibiling_order = rank(birth_year), unique
      Thank you George, I think it does.

      I'm checking it thoroughly now and will update you.

      Fitz


      • #4
        Originally posted by George Ford
        would this work? bysort id_parent (id_child): egen sibiling_order = rank(birth_year), unique
        Hi George,

        I'm sorry, but it doesn't work. The first pair of twins I found was ranked in the opposite order for each parent.

        Do you have another idea?

        Fitz


        • #5
          Data example please!


          • #6
            Originally posted by Nick Cox
            Data example please!
            I'm sorry, but I work with restricted administrative data on a restricted computer, so I can't attach anything from there.
            I can only describe it in general terms from my PC.

            Sorry,

            Fitz


            • #7
              It is fine by me that you can't (shouldn't) post confidential data.

              All we want to see are faked realistic data or even faked silly data sufficient to show the problem, as explained in our FAQ Advice.

              Otherwise you are being optimistic about our capacity to read a word description and understand it (I am pretty poor at that myself)

              -- or you are expecting that we read your story and then invent a data example to show you some code (I am quite good at Stata code but I often make very silly mistakes without having a data example to play with).

              If you can type posts on Statalist you can type out data examples (and may even be able to copy and paste into Statalist).
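
              As an illustration of what such a faked example might look like, here is a minimal sketch with invented IDs and birth years, following the variable names described in #1 (the name sibling_order is made up for this sketch). It also shows one possible workaround, not necessarily the right one for the real data: as far as I know, the unique option of egen's rank() breaks ties arbitrarily, so sorting within parent by birth_year and id_child and numbering the rows with _n makes the tie-break explicit.

```stata
clear
* Faked data: parents 1 and 2 share three children; 101 and 102 are twins
input id_parent id_child birth_year
1 101 2001
1 102 2001
1 103 2004
2 102 2001
2 101 2001
2 103 2004
end

* Sort within parent by birth year, then by child ID to break ties,
* and number the observations in that order
bysort id_parent (birth_year id_child): gen sibling_order = _n

list, sepby(id_parent)
```

              With this sort order, child 101 is numbered 1 and child 102 is numbered 2 under both parents, so the twins no longer swap places between the father's and the mother's rows.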


              • #8
                I think you are framing this discussion in a problematic way, but maybe I haven't fully understood the problem.

                There are some important aspects that seem to have been omitted in your description.
                • There is no mention of scenarios with single-parent adoptions, same-sex couples, or single mothers, so I assume that you will not encounter them.
                • You mention twins in your dataset, but if all you have is birth year, how would you separately identify and rank true (monozygotic or dizygotic) twins as opposed to two or more siblings born in the same year? And what about triplets, or other multiple births?
                • What about divorce and remarriage resulting in half-siblings or blended families?
                Based on your description, you have either two separate datasets, one for the women and the other for the men, or perhaps one dataset in a long layout with one row per parent-child dyad. Do you have information linking parents together into dyads? In any case, this seems like an odd data representation.

                What precisely is the goal of this new rank variable? Are you trying to identify birth order within parent? Or birth order within couple?

                I would consider whether you could first arrange the data into triads, with one observation per child, parent 1 and parent 2. I can't offer further help without some example data to work with and some clarity relating to the issues above.
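
                To make the triad idea concrete, here is a rough sketch on invented data. It assumes that every child appears in exactly two rows (one per parent) and that the variable names match those in #1; the names parent1, parent2 and order_in_couple are hypothetical. It ignores the complications listed above, so treat it as a starting point only.

```stata
clear
* Faked long-layout data: one row per parent-child dyad
input id_parent id_child birth_year
1 101 2001
1 102 2001
1 103 2004
2 101 2001
2 102 2001
2 103 2004
end

* One row per child carrying both parent IDs
* (assumes exactly two rows per child)
bysort id_child (id_parent): gen parent1 = id_parent[1]
by id_child: gen parent2 = id_parent[_N]
by id_child: keep if _n == 1
drop id_parent

* Birth order within couple, ties in birth year broken by id_child
bysort parent1 parent2 (birth_year id_child): gen order_in_couple = _n

list, sepby(parent1 parent2)
```

                Because each child now has a single row, the order is defined once per couple and cannot differ between the two parents.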
