
  • Panel data Analysis - Breaks into 5 and 10 years period, random/fixed effects

    Hi,

    It is my first time on this forum, so I will try to be as concise and precise as possible.

    I am trying to estimate the relationship between inequality and growth in a panel of 22 countries over the period 1985 to 2010, using fixed-effects and random-effects estimators. I want to use 5-year and 10-year periods for the estimation (to avoid endogeneity problems). The model is specified as follows:

    (y_{i,t+j} - y_{i,t}) / a = B X_{i,t-1} + u_{i,t}

    Where:

    y_{i,t} is the logarithm of GNP per capita in country i at time t (the left-hand side is the growth rate);
    a is the length of each period (for example, if I choose 5-year periods, I will have 5 periods of 5 years: 1986-1990, 1991-1995, 1996-2000, 2001-2005 and 2006-2010);
    X_{i,t-1} is the set of control variables, measured in the previous period and as close as possible to the start of the current one (for instance, to estimate the influence on GNP growth in the second period, 1991-1995, my independent variables would need to be measured in 1986-1990, as close as possible to 1990);
    u_{i,t} is the time-varying error term.

    I would like to know how to estimate such a model in Stata and how my data should be organized beforehand. More precisely, I am aware of the panel data commands for fixed effects and random effects, but I would like to know how to build the 5-year and 10-year breaks into the data.

    Thank you in advance for your assistance. Let me know if this isn't clear enough,

    ​Catherine

  • #2
    Catherine:
    I'm not clear about the endogeneity problem you came across (is it nothing that you can manage with -xtivreg-?).
    Anyway, you may want to take a look at the -[U] 11.1.3 if exp- entry in the Stata 13.1 .pdf manual.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      For the record, this was cross-posted on http://stackoverflow.com/questions/2...10-years-break My advice there was to post to Statalist. Please note our policy on cross-posting, which is that you should tell us about it.



      • #4
        I have some ideas that may help, but the structure of the data is unclear to me.

        GDP being what it is, I assume you have complete GDP data (no missing years). But your wording regarding the control variables ("whose values belong to the previous time period with the value chosen as close as possible to the year at the beginning of the period") leads me to believe that some of your control variables are missing for some years. In that case, your intent is that if, for example, you are looking at the period 1991-1995, you want the control variables as measured in 1994, or in 1993 for those for which 1994 is not available, and so on backwards in time.

        Is that correct? Or have I made it needlessly complex?



        • #5
          Hi Carlo, thank you very much for the answer. I will look at the if qualifier, which I believe could be of great help if I add a period variable to align my data. I could then run the regression using lagged periods.



          • #6
            Thanks for the advice, Nick, and sorry again: it is my first time here and I am still learning how these forums work. I have taken note of your policy for next time.



            • #7
              Hi William,

              If this helps, here is the structure of my data. You are exactly right: for most variables I have complete data (a value for each year), except for the Gini, which is more sparse. My regression should have the GDP growth rate as the regressand and several independent variables (GDP per capita, urbanization rate, Gini, education enrollment rates, etc.). The regressions should be specified such that the average GDP growth rate in period 2 is regressed on the average values of the independent variables in the previous period (period 1). I want to apply different estimation methods, such as FE, RE and GMM, to the same data set. I am currently trying simple panel regressions with if qualifiers, but I am not sure this will give exactly the results I expect.
              Country Year Period GDP growth rate Income (GDP per capita) Urbanization GINI
              Argentina 1985
              Argentina 1986 1
              Argentina 1987 1
              Argentina 1988 1
              Argentina 1989 1
              Argentina 1990 1
              Argentina 1991 2
              Argentina 1992 2
              Argentina 1993 2
              Argentina 1994 2
              Argentina 1995 2
              Argentina 1996 3
              Argentina 1997 3
              Argentina 1998 3
              Argentina 1999 3
              Argentina 2000 3
              Argentina 2001 4
              Argentina 2002 4
              Argentina 2003 4
              Argentina 2004 4
              Argentina 2005 4
              Argentina 2006 5
              Argentina 2007 5
              Argentina 2008 5
              Argentina 2009 5
              Argentina 2010 5
              Brasil 1985
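One way the period-average design described above could be set up is sketched here. It is only an illustration under stated assumptions: the data are randomly generated, and the variable names `growth`, `gini` and `period` are made up to mirror the layout shown. The idea is to collapse the yearly rows to one row per country and period, and then declare the period as the time variable so that `L.` refers to the previous period.

```stata
clear
set seed 7
set obs 22                                   // 22 made-up countries
generate country = _n
expand 26                                    // 26 yearly rows per country
bysort country : generate year = 1984 + _n   // 1985-2010

// made-up yearly series; only the structure matters here
generate growth = 0.02 + 0.01*rnormal()
generate gini   = 0.50 + 0.03*rnormal()
replace  gini   = . if runiform() < 0.3      // gini is sparse

// period 1 = 1986-1990, ..., period 5 = 2006-2010; 1985 gets no period
generate period = ceil((year - 1985)/5) if year > 1985
drop if missing(period)

// one row per country and period, holding the period means
// (collapse averages over the non-missing years only)
collapse (mean) growth gini, by(country period)

// with period as the time variable, L.gini is the previous period's mean
xtset country period
xtreg growth L.gini, fe
```

Because the lag is taken at the period level, the first period of each country drops out of the estimation sample.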



              • #8
                Following this message.



                • #9
                  The code block at the bottom demonstrates how I would organize your data for analysis; the output it produces is displayed in the initial code block. Copy and paste the code into the do-file editor and run it to see the complete process. Some specific comments follow.
                  1. I started with the layout you showed and made up data to demonstrate two countries over the period 1985-2010. I show complete data for country, year, gdp, and income, and some missing values for gini.
                  2. Stata wants numeric data, so the country is stored as numeric codes, with a value label to display each code as the country name. If your data comes to you with country names, Stata can automate this translation to numeric with the encode command.
                  3. I use the xtset command to define the data as panel data, which helps with both the variable creation and the subsequent analysis you will do.
                  4. I fill in missing gini values using the value from the previous year, which in turn may have been copied from the year before that, making the point that Stata works its way through the data one observation at a time, first to last. The lag and forward operators are discussed in help tsvarlist .
                  5. I calculate separate growth rates over the forward 5-year and 10-year periods. The use of xtset automatically prevents the forward calculation for country 1 from using the initial values from country 2. As a new user of Stata coming from a background in other languages, I love this.
                  6. Finally, the two list commands demonstrate how I would select the data to be used in building the model. Replace list with the modeling command of your choice.
                  That's where I would start, and these are the tools I would use to get there. Hope you find this helpful.
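For point 2, a minimal sketch of the encode step, assuming the raw file holds country names in a string variable. The variable names `country_name` and `country` and the four toy rows are made up for illustration.

```stata
clear
input str12 country_name int year double gdp
"Eriador" 1985 211.99
"Eriador" 1986 213.16
"Mordor"  1985 379.70
"Mordor"  1986 380.49
end

// encode creates a numeric variable and a value label holding the names;
// codes are assigned in alphabetical order of the names
encode country_name, generate(country)

// the numeric variable can now serve as the panel identifier
xtset country year
list country year gdp, noobs sepby(country)
```

The listing still shows the country names, because the value label is displayed in place of the numeric codes.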
                  Code:
                    +-----------------------------------------------+
                    | country   year        gr5   income   gini_f~d |
                    |-----------------------------------------------|
                    | Eriador   1985   .1464738     5.05       .524 |
                    | Eriador   1990   .2879671      5.1       .504 |
                    | Eriador   1995   .3549908      5.2       .485 |
                    | Eriador   2000    .072929     5.33       .511 |
                    | Eriador   2005   .3854327     5.35       .506 |
                    |-----------------------------------------------|
                    |  Mordor   1985   .3270213     5.05       .513 |
                    |  Mordor   1990   .4268336      5.1       .491 |
                    |  Mordor   1995   .4273061     5.22       .489 |
                    |  Mordor   2000   .4636911     5.31       .482 |
                    |  Mordor   2005   .4200941     5.43       .503 |
                    +-----------------------------------------------+
                  
                    +-----------------------------------------------+
                    | country   year       gr10   income   gini_f~d |
                    |-----------------------------------------------|
                    | Eriador   1985   .2024192     5.05       .524 |
                    | Eriador   1995   .2060514      5.2       .485 |
                    |-----------------------------------------------|
                    |  Mordor   1985   .2797283     5.05       .513 |
                    |  Mordor   1995    .296424     5.22       .489 |
                    +-----------------------------------------------+
                  Code:
                  clear
                  input country    year    gdp    income    gini
                  1    1985    211.99    5.05    0.524
                  1    1986    213.16    5.07    0.499
                  1    1987    214.18    5.09    0.503
                  1    1988    214.88    5.10    .
                  1    1989    215.04    5.10    0.487
                  1    1990    215.24    5.10    0.504
                  1    1991    216.51    5.12    0.522
                  1    1992    216.64    5.12    .
                  1    1993    218.33    5.16    0.516
                  1    1994    218.98    5.17    0.518
                  1    1995    220.73    5.20    0.485
                  1    1996    221.54    5.22    .
                  1    1997    223.70    5.26    0.498
                  1    1998    225.68    5.30    0.499
                  1    1999    226.41    5.32    0.511
                  1    2000    227.44    5.33    .
                  1    2001    227.95    5.34    0.513
                  1    2002    228.78    5.36    0.478
                  1    2003    228.89    5.35    0.485
                  1    2004    228.96    5.35    .
                  1    2005    229.39    5.35    0.506
                  1    2006    231.46    5.40    0.486
                  1    2007    232.88    5.42    0.516
                  1    2008    234.59    5.46    .
                  1    2009    236.62    5.50    0.490
                  1    2010    238.33    5.53    0.504
                  2    1985    379.70    5.05    0.513
                  2    1986    380.49    5.05    0.514
                  2    1987    380.60    5.05    .
                  2    1988    381.83    5.06    .
                  2    1989    384.25    5.09    .
                  2    1990    385.62    5.10    0.491
                  2    1991    388.44    5.13    .
                  2    1992    392.17    5.18    0.494
                  2    1993    392.56    5.18    .
                  2    1994    393.24    5.18    0.489
                  2    1995    396.89    5.22    .
                  2    1996    397.17    5.22    0.524
                  2    1997    399.86    5.25    .
                  2    1998    401.80    5.27    0.505
                  2    1999    403.69    5.29    .
                  2    2000    405.64    5.31    0.482
                  2    2001    406.39    5.32    .
                  2    2002    409.75    5.36    0.517
                  2    2003    411.59    5.37    .
                  2    2004    414.88    5.41    0.503
                  2    2005    416.55    5.43    .
                  2    2006    418.93    5.45    0.507
                  2    2007    421.53    5.48    .
                  2    2008    425.60    5.53    0.517
                  2    2009    426.37    5.53    .
                  2    2010    427.10    5.54    0.507
                  end
                  label define cname 1 "Eriador" 2 "Mordor"
                  label values country cname
                  sort country year
                  xtset country year
                  
                  // copy forward to fill in missing gini
                  
                  clonevar gini_filled = gini
                  replace gini_filled = L.gini_filled if missing(gini_filled)
                  
                  // compute 5- and 10-year forward growth rates
                  
                  by country : generate gr5  = ln(F5.gdp-F.gdp)/5
                  by country : generate gr10 = ln(F10.gdp-F.gdp)/10
                  
                  // list all the observations
                  
                  list , noobs clean
                  
                  // list the data used for 5-year model
                  
                  list country year gr5 income gini_filled ///
                      if inlist(year, 1985, 1990, 1995, 2000, 2005), ///
                      noobs sepby(country)
                  
                  // list the data used for 10-year model
                  
                  list country year gr10 income gini_filled ///
                      if inlist(year, 1985, 1995), ///
                      noobs sepby(country)
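For point 6, replacing list with a modeling command might look like the sketch below. Two countries are too few to fit panel models sensibly, so the sketch generates 22 made-up countries. All numbers are random, the variable `start5` is a hypothetical helper, and computing growth as the change in log GDP is an assumption taken from the model in #1.

```stata
clear
set seed 2015
set obs 22                                   // 22 made-up countries
generate country = _n
generate alpha = rnormal()                   // made-up country effect
expand 26                                    // 26 yearly rows per country
bysort country : generate year = 1984 + _n   // 1985-2010
xtset country year

// made-up series; only the structure matters here
generate gdp    = exp(5 + 0.1*alpha + 0.02*(year - 1985) + 0.05*rnormal())
generate income = 5 + 0.1*alpha + 0.02*(year - 1985) + 0.05*rnormal()
generate gini   = 0.50 + 0.02*alpha + 0.02*rnormal()

// 5-year forward growth as the annualized change in log GDP (assumption)
generate gr5 = (ln(F5.gdp) - ln(gdp))/5

// flag one row per 5-year window, as in the first list command
generate byte start5 = inlist(year, 1985, 1990, 1995, 2000, 2005)

xtreg gr5 income gini if start5, fe          // fixed effects
xtreg gr5 income gini if start5, re          // random effects
```

Note that the panel is still declared yearly, so a lag operator used inside these models would refer to the previous year, not to the previous 5-year window.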



                  • #10
                    Thank you a lot William, it does indeed make sense. I will rearrange my data, and look into the suggested commands. I really appreciate your help!

                    Best,

                    Catherine




                    • #11
                      On reflection, my calculation of the forward growth rates was in error. Growth during 1986-1990 is the difference between the 1990 value and the 1985 value, not the 1986 value as I had calculated. The corrected code is below.
                      Code:
                      // compute 5- and 10-year forward growth rates
                      
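                      // y in the model of #1 is the log of the level, so take logs first
                      // and then difference: (ln(gdp[t+a]) - ln(gdp[t]))/a
                      by country : generate gr5 = (ln(F5.gdp) - ln(gdp))/5
                      by country : generate gr10 = (ln(F10.gdp) - ln(gdp))/10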
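A quick way to check the window alignment is to compare one generated value with the same quantity computed by hand. The sketch below uses the first six rows of the example data for country 1 and assumes the growth rate is meant as the annualized change in log GDP, as in the model in #1; that is not the same quantity as the log of the change in levels.

```stata
clear
input country year gdp
1 1985 211.99
1 1986 213.16
1 1987 214.18
1 1988 214.88
1 1989 215.04
1 1990 215.24
end
xtset country year

// annualized change in log GDP from year t to year t+5
generate gr5 = (ln(F5.gdp) - ln(gdp))/5

// the same quantity by hand, from the 1985 and 1990 levels
display (ln(215.24) - ln(211.99))/5

// only 1985 has a value five years ahead, so the other rows are missing
list year gdp gr5, noobs
```

If the value listed for 1985 agrees with the displayed number, the window runs from the 1985 level to the 1990 level, as intended.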



                      • #12
                        Originally posted by Nick Cox View Post
                        For the record, this was cross-posted on http://stackoverflow.com/questions/2...10-years-break My advice there was to post to Statalist. Please note our policy on cross-posting, which is that you should tell us about it.
                        This is exactly what I was looking for, but the page does not open.



                        • #13
                          #12 is a copy of #3 written by me in 2015. The link is indeed dead to you, but with enough reputation (points total) on Stack Exchange it is possible to see the remains of the thread. I can tell you that you are missing nothing, as there was no reply to the question there, which was closed. So, 7 years later, cross-posting is still wasting (a little) time.
