  • Generating a mean with time-series operators

    Hi everyone,
    I am a beginner in Stata and am stuck on a problem. I am working with World Bank data that contain Gini coefficients for different country-years.
    country year gini
    Albania 2008 30
    Albania 2009 .
    Albania 2010 .
    Albania 2011 .
    Albania 2012 29
    Albania 2013 .
    Albania 2014 34.6
    Albania 2015 32.8
    Albania 2016 33.7
    Albania 2017 33.1
    Albania 2018 30.1
    Albania 2019 30.1
    Albania 2020 29.4


    For each year, I would like to generate the mean of all the values in the timeframe between the previous five years and the following five years. For example, for 2013, I would like to generate the mean of the Gini coefficients for Albania between 2008 and 2018.

    I tried the following code, but the mean is not calculated whenever there is a missing value anywhere in the timeframe. I would like missing values simply to be ignored.

    Code:
    by country: gen sum_gini = gini[_n-5] + gini[_n-4] + gini[_n-3] + gini[_n-2] + gini[_n-1] + gini + gini[_n+1] + gini[_n+2] + gini[_n+3] + gini[_n+4] + gini[_n+5]
    by country: gen count_nonmissing = !missing(gini[_n-5]) + !missing(gini[_n-4]) + !missing(gini[_n-3]) + !missing(gini[_n-2]) + !missing(gini[_n-1]) + !missing(gini) + !missing(gini[_n+1]) + !missing(gini[_n+2]) + !missing(gini[_n+3]) + !missing(gini[_n+4]) + !missing(gini[_n+5])
    by country: gen gini2 = sum_gini / count_nonmissing
    Thank you
    Anna

  • #2
    One way to approach this is to use rangestat from SSC, which you need to install first. I rewrote your data example with dataex, which gave the following.

    Code:
    clear
    input str7 country int year double gini
    "Albania" 2008   30
    "Albania" 2009    .
    "Albania" 2010    .
    "Albania" 2011    .
    "Albania" 2012   29
    "Albania" 2013    .
    "Albania" 2014 34.6
    "Albania" 2015 32.8
    "Albania" 2016 33.7
    "Albania" 2017 33.1
    "Albania" 2018 30.1
    "Albania" 2019 30.1
    "Albania" 2020 29.4
    end
    
    ssc install rangestat 
    
    rangestat (count) c_gini=gini (mean) mean_gini=gini, int(year -5 5) by(country) 
    
    list 
    
         +--------------------------------------------+
         | country   year   gini   c_gini   mean_gini |
         |--------------------------------------------|
      1. | Albania   2008     30        2        29.5 |
      2. | Albania   2009      .        3        31.2 |
      3. | Albania   2010      .        4        31.6 |
      4. | Albania   2011      .        5       32.02 |
      5. | Albania   2012     29        6        32.2 |
         |--------------------------------------------|
      6. | Albania   2013      .        7        31.9 |
      7. | Albania   2014   34.6        7   31.914286 |
      8. | Albania   2015   32.8        8        31.6 |
      9. | Albania   2016   33.7        8        31.6 |
     10. | Albania   2017   33.1        8        31.6 |
         |--------------------------------------------|
     11. | Albania   2018   30.1        7   31.971429 |
     12. | Albania   2019   30.1        7   31.971429 |
     13. | Albania   2020   29.4        6   31.533333 |
         +--------------------------------------------+
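
    The attempt in #1 fails because in Stata any sum that involves a missing value is itself missing, and subscripts that point outside the data (such as gini[_n-5] in the first five rows) also return missing. If installing rangestat is not an option, one possible workaround is sketched below using time-series operators and egen's rowmean() and rownonmiss() functions, both of which skip missing values. The names country_id, gini_lag*, gini_lead*, mean_manual and n_manual are made up for this illustration, and the sketch assumes year uniquely identifies observations within country.

```stata
clear
input str7 country int year double gini
"Albania" 2008   30
"Albania" 2009    .
"Albania" 2010    .
"Albania" 2011    .
"Albania" 2012   29
"Albania" 2013    .
"Albania" 2014 34.6
"Albania" 2015 32.8
"Albania" 2016 33.7
"Albania" 2017 33.1
"Albania" 2018 30.1
"Albania" 2019 30.1
"Albania" 2020 29.4
end

* Time-series operators need a numeric panel identifier.
encode country, generate(country_id)
xtset country_id year

* Copy each lag and lead of gini into its own variable.
* L#. and F#. return missing outside the observed years.
forvalues k = 1/5 {
    generate double gini_lag`k'  = L`k'.gini
    generate double gini_lead`k' = F`k'.gini
}

* rowmean() and rownonmiss() ignore missing values.
egen double mean_manual = rowmean(gini gini_lag* gini_lead*)
egen byte   n_manual    = rownonmiss(gini gini_lag* gini_lead*)

* Remove the helper variables and inspect the result.
drop gini_lag* gini_lead*
list country year gini n_manual mean_manual, sepby(country)
```

    On this example the results should agree with c_gini and mean_gini in the listing above. For 2013, for instance, the window 2008 to 2018 holds seven non-missing values with a mean of 31.9. Because L. and F. work on the year value rather than the row position, this version also copes with countries that have no row at all for some years, which the subscript approach in #1 would not.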

    • #3
      It worked. A thousand thanks for the quick help!
