  • Generate new variable as difference between two observations

    Hello,

    I am working with panel data and wish to estimate regressions on changes over the 1995-2015 period (long-run trends). I have tried to generate a new variable holding the 1995-2015 change in (log) hours worked (HEMPE) using the following command:

    gen diffHEMPE = ln_HEMPE - ln_HEMPE[_n-20]

    which does display the difference between 1995 and 2015, but it also displays values for 1995, even when I exclude the intermediate years. I want it to display the difference only once, in either 1995 or 2015. Any recommendations?

    My panel is in long format.


    Elena
    Last edited by Elena Cag; 23 Nov 2017, 03:19.

  • #2
    Welcome to Statalist.

    Presumably your data have a panel identifier, let us pretend it is called PanelID, and a year identifier, let us call it Year. You then want to tell the generate command to treat each panel separately; otherwise it will subtract data from the previous panel from observations in the current panel, which is why you are getting values for 1995. So something like the following might solve your problem.
    Code:
    by PanelID (Year), sort: gen diffHEMPE = ln_HEMPE - ln_HEMPE[_n-20]
    This will fail if you have missing years, because it simply goes back 20 observations. Since you have panel data, something like the following, which takes advantage of the lag operator, will handle missing years.
    Code:
    xtset PanelID Year
    by PanelID (Year), sort: gen diffHEMPE = ln_HEMPE - L20.ln_HEMPE
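    To make the contrast concrete, here is a minimal sketch on made-up data. The values, the variable y, and the lag of 2 (instead of 20, to keep the example short) are assumptions for illustration only. Panel 1 is missing 1997, so counting back observations and counting back years give different answers.

    ```stata
    clear
    input PanelID Year y
    1 1995 10
    1 1996 11
    1 1998 13
    2 1995 20
    2 1996 22
    2 1997 24
    end

    sort PanelID Year

    * No by-group: the first rows of panel 2 reach back into panel 1
    gen d_naive = y - y[_n-2]

    * By-group subscripting: stays inside each panel, but counts observations
    by PanelID (Year), sort: gen d_by = y - y[_n-2]

    * Lag operator: counts years and respects panels once the data are xtset
    xtset PanelID Year
    gen d_lag = y - L2.y

    list, sepby(PanelID)
    ```

    In this toy data d_naive is filled in for panel 2 in 1995 using a value from panel 1, d_by for panel 1 in 1998 is really a three-year change (1998 minus 1995), and d_lag for the same row is the two-year change (1998 minus 1996).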
    Now, a word of advice to a new member. Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    I am concerned that I did not correctly understand your data, in particular how many years there are. Nor do I understand why you say you want to display the difference in 1995: do you have data going back to 1975?

    I will also comment that if you are unfamiliar with the by prefix, you are likely new to Stata and would benefit from a review of the most important parts of the documentation. When I began using Stata in a serious way, I started as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. All of these manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through Stata's Help menu. The objective in doing this was not so much to master Stata as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and manual.

    Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.



    • #3
      Excellent advice from William as always. Note further that once you have tsset or xtset the data, then


      Code:
       
       gen diffHEMPE = ln_HEMPE - L20.ln_HEMPE
      is all that you need. In fact you could also use D.
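      A caveat on the difference operators, offered as a sketch rather than as part of the original reply: D. on its own is the one-period difference, and D20. would be a 20th-order difference, so the operator that matches ln_HEMPE - L20.ln_HEMPE should be the "seasonal" difference S20. The toy data, the variable y, and the lag of 2 below are assumptions chosen to keep the example small.

      ```stata
      clear
      input PanelID Year y
      1 1995 1
      1 1996 3
      1 1997 6
      1 1998 10
      2 1995 2
      2 1996 4
      2 1997 8
      2 1998 16
      end

      xtset PanelID Year

      gen d_long   = y - L2.y   // change over two periods
      gen d_season = S2.y       // same quantity: y - L2.y
      gen d_second = D2.y       // second difference: y - 2*L.y + L2.y

      * The two-period change and the S2. operator agree row by row
      assert d_long == d_season

      list, sepby(PanelID)
      ```

      With 20 in place of 2, the same pattern would give gen diffHEMPE = S20.ln_HEMPE.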



      • #4
        Thank you Nick, my confusion about the statement of the problem led me to add the lag operator approach at the last minute, without thinking carefully.

        To Elena I would especially suggest looking at the full documentation for the xtset command, which has examples of the good things that xtset makes possible. This is found in the Stata Longitudinal-Data/Panel-Data PDF reference manual.



        • #5
          Thank you so much for all the answers. You are correct in saying that I am fairly new to Stata, and especially to this forum. I have tsset the data using a panelid and year, and tried to review some of the previous FAQs, but I guess I did not fully understand the logic behind the command I posted above until now. Thank you again for the advice; I will look up the documentation suggested.



          • #6
            Hi, I am using panel data and am trying to generate a variable that is simply the first difference of another variable. My panel variables are "country" and "year". I have tried everything I saw in this forum but I keep getting "60 missing values generated".

            I am using:
            xtset country year
            gen DD = D.variable1

            I have also tried: gen DD = variable1-L1.variable1

            I get the same result: only missing values are generated. Does anyone know what the problem is?

            Thank you very much!

            Joan





            • #7
              I presume that you have 60 countries. When you first difference you automatically lose one year for each country since for the first year you cannot calculate the first difference.
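              As a small illustration of that point, the sketch below uses made-up numbers for two countries over three consecutive years (the values are assumptions; only the names country, year, variable1, and DD come from the thread). The count of missing differences equals the number of panels.

              ```stata
              clear
              input country year variable1
              1 2001 5
              1 2002 7
              1 2003 8
              2 2001 1
              2 2002 4
              2 2003 9
              end

              xtset country year
              gen DD = D.variable1

              * One missing value per country: the first year has no predecessor
              count if missing(DD)
              list, sepby(country)
              ```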



              • #8
                Thank you Eric. Actually I have 12 countries in 5 time periods. The weird thing is that a new variable is created with only missing values. Do you know why this could be?



                • #9
                  Are you sure that your data have been saved as numeric and not as text (strings)? You should post the data for a variable for at least one country.
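                  One way to run that check, sketched on made-up data in which both the country name and the variable were read in as strings (the values and the name country_id are assumptions for illustration):

                  ```stata
                  clear
                  input str3 country year str5 variable1
                  "AAA" 2000 "1.5"
                  "AAA" 2001 "2.0"
                  "BBB" 2000 "3.0"
                  "BBB" 2001 "4.5"
                  end

                  * Show storage types; str# means the variable is text
                  describe

                  * Convert variable1 only if it is not already numeric
                  capture confirm numeric variable variable1
                  if _rc != 0 {
                      destring variable1, replace
                  }

                  * xtset needs a numeric panel identifier
                  encode country, gen(country_id)
                  xtset country_id year

                  gen DD = D.variable1
                  list, sepby(country_id)
                  ```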



                  • #10
                    I managed to solve this by using:

                    bysort country: gen DD=variable1-variable1[_n-1]

                    Thank you!
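                    The thread never establishes why D. returned only missing values, so the following is a guess rather than a diagnosis. One pattern that produces exactly this symptom is a year variable spaced more than one unit apart (for example, five-year waves): D. then looks for year - 1, which is never observed, while [_n-1] simply takes the previous row. Under that assumption, and with made-up data, declaring the spacing through the delta() option of xtset lets the operators work:

                    ```stata
                    clear
                    input country year variable1
                    1 1990 1.0
                    1 1995 1.5
                    1 2000 2.5
                    2 1990 3.0
                    2 1995 3.5
                    2 2000 5.0
                    end

                    * Default spacing is 1, so D. looks for year - 1 and finds nothing
                    xtset country year
                    gen DD_gap = D.variable1
                    count if missing(DD_gap)

                    * Declare five-year spacing; D. now compares adjacent waves
                    xtset country year, delta(5)
                    gen DD_ok = D.variable1

                    * Observation-based difference, sorted by year within country
                    bysort country (year): gen DD_obs = variable1 - variable1[_n-1]

                    assert DD_ok == DD_obs
                    list, sepby(country)
                    ```

                    Writing bysort country (year): rather than bysort country: makes the sort order within each country explicit, which the [_n-1] approach depends on.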
