  • Problem with the reshape command when working with household database

    . reshape wide dag lhw dgn dwt dru deh drgn1 drgn2 yem yse boa_d boa les_r les, i(idhh) j(relationship) string

    (j = .. Conyuge Hijo1 Hijo10 Hijo2 Hijo3 Hijo4 Hijo5 Hijo6 Hijo7 Hijo8 Hijo9 Jefe(a) Nieto1 Nieto10 Nieto11 Nieto12 Nieto13 Nieto14 Nieto15 Nieto2 Nieto3 Nieto4 Nieto5 Nieto6 Nieto7 Nieto8 Nieto9 Nuera/Yerno Padres)
    values of variable relationship not unique within idhh
    Your data are currently long. You are performing a reshape wide. You specified i(idhh) and
    j(relationship). There are observations within i(idhh) with the same value of j(relationship). In the
    long data, variables i() and j() together must uniquely identify the observations.
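    long                                wide
    +---------------+                   +------------------+
    | i   j   a   b |                   | i   a1 a2  b1 b2 |
    |---------------|   <--reshape-->   |------------------|
    | 1   1   1   2 |                   | 1   1   3   2  4 |
    | 1   2   3   4 |                   | 2   5   7   6  8 |
    | 2   1   5   6 |                   +------------------+
    | 2   2   7   8 |
    +---------------+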
    Last edited by enrique labrada; 05 Feb 2024, 17:35.

  • #2
    It's not surprising you encountered this problem. When you do a -reshape wide-, the variable(s) specified in -i()- combined with the single variable specified in -j()- must always uniquely identify observations in your data set. In your specific situation, this means that any given combination of idhh and relationship must appear only once in the entire data set. Otherwise put, there can be at most one instance of any particular relationship in a single idhh. I'm guessing that idhh refers to a household identifier. While it seems that the purveyors of the survey anticipated that there can be many sons in a household, there is no numbering scheme to distinguish sons/daughters-in-law. There is no reason a household can't have more than one son-in-law or more than one daughter-in-law. And that will break the -reshape-. Similar considerations apply to Padres.

    Now, of course, it is also possible that your problem is not due to multiple in-laws. Perhaps you have a true error, such as two sons both listed as relationship Hijo1 in the same idhh.
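To make the uniqueness requirement concrete, here is a minimal sketch on made-up data. Only the variable names idhh, relationship and dag come from this thread; the values and the simplified labels (Jefe, Yerno) are invented for illustration. Household 1 has two members with the same relationship label, so both -isid- and -reshape wide- should refuse.

```stata
* Toy data, invented for illustration: household 1 has two members labeled Yerno
clear
input idhh str10 relationship dag
1 "Jefe"  52
1 "Yerno" 30
1 "Yerno" 27
2 "Jefe"  45
2 "Hijo1" 12
end

* isid exits with an error unless idhh and relationship jointly identify observations
capture noisily isid idhh relationship
display "isid return code: " _rc

* reshape wide stops for the same reason and leaves the data long
capture noisily reshape wide dag, i(idhh) j(relationship) string
display "reshape return code: " _rc
```

A nonzero return code from -isid- is a quick warning that -reshape wide- with the same i() and j() will fail.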

    So the first thing you need to do is find out where the problems are. For that you can do:
    Code:
    duplicates tag idhh relationship, gen(flag)
    browse if flag
    That way you will see them. You will then have to fix these problems. If you have two sons both designated Hijo3 in the same household, then one of them needs to be re-designated as Hijo4 (assuming Hijo4 is not already taken; if it is, use Hijo# for some larger # not already taken in that idhh). If you have two daughters-in-law, you will need to change the designation from Nuera to Nuera1 and Nuera2. (And, if you have to do this, then you should replace Nuera by Nuera1 in the entire data set, and similarly for Yerno and Padres.) If you find more than one Jefe(a) in a single idhh, that is almost surely a data error and you will have to figure out which one is the real Jefe(a) and what the real relationship of the other is.
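For the categories that legitimately repeat (in-laws, parents), the renumbering can be done in code rather than by hand. What follows is only a sketch on made-up data: the values, the simplified labels (Jefe, Yerno, Padres), the helper variable seq, and the choice to number members youngest first by dag are all assumptions for illustration. It does not fix true errors such as a duplicated Jefe(a) or Hijo#, which still need to be resolved case by case.

```stata
* Toy data, invented for illustration
clear
input idhh str20 relationship dag
1 "Jefe"   52
1 "Yerno"  30
1 "Yerno"  27
1 "Padres" 80
2 "Jefe"   45
2 "Padres" 70
2 "Padres" 72
end

* Number members within each household-by-relationship cell, youngest first
* (ordering by dag is an arbitrary choice made for this sketch)
bysort idhh relationship (dag): gen seq = _n

* Append the number everywhere these categories occur, so that the
* resulting names are consistent across all households
replace relationship = relationship + string(seq) if inlist(relationship, "Yerno", "Padres")
drop seq

* Confirm that idhh and relationship now identify observations, then reshape
isid idhh relationship
reshape wide dag, i(idhh) j(relationship) string
```

In your real data you would list all the variables in the -reshape wide- command, as in #1, and use the relationship labels exactly as they appear there.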

    If you have difficulty applying this advice, please post back with example data. Choose a subset of the data that demonstrates this problem, and use the -dataex- command to show that subset. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
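As a rough illustration of what such a post could be built from, the sketch below flags the duplicated observations and passes just those, with a few variables, to -dataex-. The data are invented for illustration, and the variable list (idhh relationship dag) is only an assumption about what would be relevant to show.

```stata
* Toy data, invented for illustration
clear
input idhh str20 relationship dag
1 "Jefe"  52
1 "Yerno" 30
1 "Yerno" 27
2 "Jefe"  45
2 "Hijo1" 12
end

* flag is 0 for unique idhh-relationship pairs and positive for repeated ones
duplicates tag idhh relationship, gen(flag)

* Show only the offending observations; copy the resulting output into your post
dataex idhh relationship dag if flag
```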



    • #3
      Thank you very much Clyde!!
