
  • xtset problem: repeated time values within panel

    Hi there!
    I am having trouble trying to xtset my .dta file in Stata. My data look something like this:
    Panel_id web_id year other_variables,,,
    4 123 2006 5
    4 245 2010 6
    6 354 2006 9
    6 685 2010 10
    . 657 2010 10
    . 654 2006 9

    I have been trying to find a solution to this for the past 3 days and cannot seem to get it. What am I doing wrong? I know that panel_id is duplicated and that this is where the problem lies. But how do I eliminate the duplicate panel_id values, and what do I do with the observations that have a missing panel_id but a different web_id? For clarification, panel_id tells me whether the same firm was surveyed in both 2006 and 2010. The web_id does not tell me that.
    Thanks for your help!!

  • #2
    Hi,
    First of all, please remember to sign up with your real name. It's part of the protocol here at the Statalist forum (see the FAQ).
    Second, to decide what to do with the observations that have a missing panel_id, you need to know why they are missing. What will happen if you simply drop them? What information does web_id provide?
    Also look into the duplicates command.
    HTH
    Fernando



    • #3
      Hi Fernando. I was not aware of the need to sign up with my name. I will try to fix it.
      The panel ID exists only for those firms that were surveyed in both 2006 and 2010. That is the only way I can tell whether the same firm was surveyed in both years. The web_id is a unique ID for every firm, but it differs between 2006 and 2010. In other words, the web_id for Firm A is 1112 in 2006 and 6577 in 2010. So web_id alone cannot tell me whether it is the same firm or not. How can I xtset the data now?
      Thanks.



      • #4
        You don't show your xtset command, but I assume you are doing xtset Panel_id, when you should probably be doing xtset Panel_id year. I assume you don't really want to actually delete observations with duplicate Panel IDs.

        As for the missing Panel IDs, there are certainly ways to provide Panel IDs that can be used as place holders, but you would need to provide more details about how many missing values you have, the current range of Panel IDs, and how you plan to tell whether two observations with missing Panel IDs are from the same firm.

        Also, I don't usually mention this, but "stata_question" is not a very useful user ID. As mentioned in the FAQ, here at statalist.org, we traditionally prefer that users provide a real name, which provides for a more friendly and professional atmosphere. See if you can get yours changed.



        • #5
          Yes, I apologize about the name. I was not aware that this was a strict policy here. I will need to create a new account to change it and will do so for the next post.
          I am trying to use xtset panel_id year, but I get the same error message.
          In this particular dataset, 600 of the 2100 observations have a missing panel_id. A missing panel ID indicates that the firm was surveyed only once. Hence, all of those observations are unique firms.



          • #6
            In that case, I would simply go with:
            drop if panel_id==.
            sort panel_id year
            duplicates list panel_id year
            If there are no duplicates, then:
            xtset panel_id year



            • #7
              I agree with Fernando if you want to get rid of firms that only have one year's worth of data. However, if you want to keep them in your analysis, I would suggest something like the following:

              Code:
              bysort panel_id: replace panel_id=3000+_n if mi(panel_id)
              This will give a unique Panel ID for each observation that has a missing Panel ID. You can change the "3000" in the above code to whatever you want such that the number is greater than or equal to the highest Panel ID in your data set. If you want the numbers to be consecutive, change the "3000" to the highest Panel ID; the next assigned ID will be the highest plus 1.
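              If you would rather not hard-code the offset, something like the following might work. It is only a sketch under assumptions: panel_id is numeric, year is the time variable, and the names maxid and seq are hypothetical helpers introduced here. It reads the highest existing ID from the data and numbers the missing ones consecutively above it, in the current sort order.

              ```stata
              * Sketch only: assumes panel_id is numeric with at least one nonmissing value
              summarize panel_id, meanonly
              local maxid = r(max)                        // highest existing Panel ID

              * Running count of observations with a missing panel_id (1, 2, 3, ...)
              generate long seq = sum(missing(panel_id))

              * Placeholder IDs start at the highest existing ID plus 1
              replace panel_id = `maxid' + seq if missing(panel_id)
              drop seq

              xtset panel_id year
              ```

              Because each placeholder ID is used for exactly one observation, those firms enter the panel as single-year units.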



              • #8
                This does work. Thanks. I have another question, though: if I wanted to randomly assign numbers to each of the blank panel_ids, how could I do that?



                • #9
                  Okay. I got the answer! Thanks Joe.



                  • #10
                    Just use the "Contact us" button on the home page and send a full real name to the administrators.



                    • #11
                      Hi, I have been working with NSSO household-level data (India), and on trying to set up a panel I get a similar error message: 'repeated time values within panel'.
                      The variables in my dataset are:
                      state sector itemcode totalquantityconsumed householdid personserialnumber personid monthlypercapitaconsumptionexpenditure year weight

                      The hhid (household ID) is a string variable, and I have converted it to numeric using the command 'gen hhid_code = real(hhid)'. Since it was then displayed in scientific notation, I used the command
                      'format hhid %9.0f'
                      I have checked for duplicates and there are none now. Yet when I type 'xtset state year', I repeatedly get the same error message. Can someone please suggest a solution?



                      • #12
                        https://www.stata.com/support/faqs/d...d-time-values/
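                        As a rough sketch of the kind of check that usually helps with this error: xtset complains when the same combination of panel variable and time variable occurs more than once, even if no observation is an exact duplicate across all variables. The commands below use the variable names from post #11 (state, year) and assume the data are already in memory.

                        ```stata
                        * Sketch only: state and year are the variables named in post #11
                        * Count how many observations share each state-year combination
                        duplicates report state year

                        * Show one example observation from each duplicated state-year group
                        duplicates examples state year
                        ```

                        If the report shows surplus observations, state and year together do not identify a single observation, so xtset state year will keep failing until a finer panel identifier is chosen or the data are aggregated.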



                        • #13
                          Meghna:
                          welcome to this forum.
                          If you do not plan to use time series-related commands, you can simply type:
                          Code:
                          xtset state
                          Kind regards,
                          Carlo
                          (Stata 19.0)



                          • #14
                            Thanks!!
                            However, I have tried every solution here and nothing is working. I get an error message when I type the assertion command, and when I check for duplicates, it shows that there are none!



                            • #15
                              Originally posted by Carlo Lazzaro
                              Meghna:
                              welcome to this forum.
                              If you do not plan to use time series-related commands, you can simply type:
                              Code:
                              xtset state
                              Thanks Carlo!!!

                              But will this be correct if I want to estimate the change in the effect of taxes on consumption between the two years, 2004 and 2011? I think I do need the time variable.

