Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Need help calculating an "isomorphism index"

    Dear Stata users,

    I am currently working on a minor research paper about isomorphic behavior within municipality expenditure policies. More specifically, I have a panel dataset consisting of 94 municipalities across a 13 year time period (2007-2020). However, I have some questions on the practical implementation in Stata 17. I apologize beforehand if this seems like an odd question or request but here goes:

    1) In order to better assess the strength of the isomorphic trend in each municipality in each year I would like to calculate a specific isomorphic index. The purpose of this index is to assess more precisely whether the development of a municipality's expenditure is linked to its location in relation to the mean of the expenditure area. Therefore, I would like to create a large "matrix" in which each municipality's expenditure in one expenditure area in one year constitutes one observation. In each of the years, the deviation of each municipality from the mean should be calculated.
    The index itself should be calculated by subtracting the numerical deviation from the average in one year from the numerical deviation in the following year as follows:

    Isomorphic index = [Percentage deviation from average in T1] - [Percentage deviation from average in T0])
    Example of the how I wish to calculate the isomorphism index:
    Percentage wise deviation from the mean in year T0 Percentage wise deviation from the mean in year T1 Isomorphic Index
    5 2 3
    5 -2 3
    -5 -2 3
    2 6 -4
    2 -2 0
    This should give me a measure of how many percentage points the municipality has moved closer to the average from one year to the next, e.g. the higher the value of the index, the higher the degree of isomorphism. The isomorphism index should be dynamic in the sense that a municipality's movement in relation to the average depends not only on what the municipality itself does, but also on what other municipalities do. To move closer to the average, a municipality must therefore move more than the other municipalities. I use this as an expression of convergence, if a municipality moves towards the mean in a given year. I have a total of 6 different expenditure areas (accounting figures for school expenditure, childcare, elderly care etc.) for all of the municipalities in all of the different years and my panel dataset is balanced. I want to calculate a mean index where all the expenditure areas have equal weights and a weighted index to compare the two.

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(cpr aar) double school_expenditure float(ctry_mean pdeviation)
    101 2007 66509.69752440725 64274.83 -.03360216
    101 2008 70869.81830633253 67353.82 -.04961206
    101 2009 72345.74299312533 72259.81 -.0011877754
    101 2010 70194.77140639097 73764.68 .05085718
    101 2011 70324.28533151961 72146.484 .025911376
    101 2012 75360.49502744527 72941.266 -.03210209
    101 2013 75669.62569599938 72252.625 -.04515683
    101 2014 76987.9445914878 75476.22 -.019635877
    101 2015 77502.98311444653 76465.36 -.013388178
    101 2016 81778.32377333674 76183.74 -.068411544
    101 2017 85640.68849259161 77485.125 -.09523
    101 2018 86419.13629573012 79127.68 -.08437317
    101 2019 86198.27086388595 80207.6 -.06949872
    101 2020 89199.04343286347 82237.66 -.0780432
    147 2007 69528.28230409963 64274.83 -.07555852
    147 2008 70319.89449886963 67353.82 -.04217973
    147 2009 74460.10020774778 72259.81 -.02954989
    147 2010 75566.59060886274 73764.68 -.02384534
    147 2011 74335.24254821743 72146.484 -.02944442
    147 2012 75695.24247299471 72941.266 -.03638243
    147 2013 75170.87053571429 72252.625 -.0388215
    147 2014 77392.20133938215 75476.22 -.02475679
    147 2015 78291.52294345145 76465.36 -.023325177
    147 2016 77744.02592142568 76183.74 -.0200695
    147 2017 79440.29924205138 77485.125 -.02461187
    147 2018 82041.75651832974 79127.68 -.035519436
    147 2019 81575.88805166846 80207.6 -.016773174
    147 2020 83152.56059421424 82237.66 -.011002626
    151 2007 77513.27048585932 64274.83 -.17078936
    151 2008 83693.55075657411 67353.82 -.19523285
    151 2009 90862.8914252268 72259.81 -.2047379
    151 2010 94229.14528661486 73764.68 -.21717766
    151 2011 88590.13479484522 72146.484 -.1856149
    151 2012 85640.03577284246 72941.266 -.14828077
    151 2013 83876.9551616267 72252.625 -.13858789
    151 2014 88541.3870246085 75476.22 -.14756002
    151 2015 91246.69734303102 76465.36 -.16199313
    151 2016 90822.03262984584 76183.74 -.16117555
    151 2017 93704.96518316682 77485.125 -.1730948
    151 2018 96919.20590951062 79127.68 -.1835707
    151 2019 99772.82508453734 80207.6 -.1960977
    151 2020 104436.32987776575 82237.66 -.21255693
    153 2007 76729.36457673633 64274.83 -.16231774
    153 2008 83501.28976784178 67353.82 -.1933799
    153 2009 88241.62102957284 72259.81 -.1811142
    153 2010 88695.66179725542 73764.68 -.1683395
    153 2011 84577.22308892357 72146.484 -.14697501
    153 2012 86508.51305334846 72941.266 -.15683135
    153 2013 88206.21856424326 72252.625 -.180867
    153 2014 89856.29965947787 75476.22 -.1600342
    153 2015 89868.63747499444 76465.36 -.149143
    153 2016 89976.84257403697 76183.74 -.1532961
    153 2017 92804.07980273482 77485.125 -.1650677
    153 2018 94258.44707988364 79127.68 -.16052426
    153 2019 96452.9188101096 80207.6 -.16842744
    153 2020 100122.84241201525 82237.66 -.17863235
    155 2007 73625.67634038367 64274.83 -.12700526
    155 2008 75256.14552605702 67353.82 -.10500571
    155 2009 78180.42511122096 72259.81 -.07573012
    155 2010 75700.19342359768 73764.68 -.02556815
    155 2011 74312.77533039647 72146.484 -.02915099
    155 2012 76741.68297455968 72941.266 -.04952221
    155 2013 76529.64042759962 72252.625 -.05588704
    155 2014 82145.01216545013 75476.22 -.08118318
    155 2015 80828.53025936599 76465.36 -.05398058
    155 2016 79997.63816721775 76183.74 -.04767511
    155 2017 79608.10174281677 77485.125 -.02666785
    155 2018 83616.76082862524 79127.68 -.05368638
    155 2019 83438.7706855792 80207.6 -.03872503
    155 2020 81056.68202764977 82237.66 .01456983
    157 2007 59573.841319717205 64274.83 .07891025
    157 2008 61475.20701412567 67353.82 .09562576
    157 2009 64435.72860005726 72259.81 .12142462
    157 2010 65110.76487252124 73764.68 .13291065
    157 2011 64553.4761860204 72146.484 .11762354
    157 2012 65631.52507676561 72941.266 .11137545
    157 2013 66142.8968327468 72252.625 .09237164
    157 2014 67200.72562358277 75476.22 .12314589
    157 2015 67218.22110190854 76465.36 .13756892
    157 2016 68613.7358220952 76183.74 .11032785
    157 2017 68416.58536585367 77485.125 .13254885
    157 2018 70296.50597080938 79127.68 .12562749
    157 2019 71512.20374955784 80207.6 .1215932
    157 2020 74838.83078573343 82237.66 .09886356
    159 2007 78532.49714937285 64274.83 -.1815512
    159 2008 79686.43010996486 67353.82 -.15476424
    159 2009 85927.95421389294 72259.81 -.15906514
    159 2010 82930.81550802139 73764.68 -.1105275
    159 2011 80190.62986868463 72146.484 -.10031278
    159 2012 81733.48214285714 72941.266 -.10757178
    159 2013 78079.19030141253 72252.625 -.0746238
    159 2014 81431.21693121694 75476.22 -.07312918
    159 2015 82279.77929243752 76465.36 -.07066645
    159 2016 80459.12689920822 76183.74 -.05313735
    159 2017 81564.30612892982 77485.125 -.05001184
    159 2018 84010.8603964572 79127.68 -.05812559
    159 2019 88752.98804780877 80207.6 -.0962828
    159 2020 89278.94408646852 82237.66 -.07886831
    161 2007 80300.8641005499 64274.83 -.1995749
    161 2008 86252.47328848437 67353.82 -.21910854
    end
    [/CODE]


    My question is: Is there any sophisticated way to calculate such an index using Stata?

    I can provide more examples if needed. I appreciate any help and feedback that I can get.

    So far I have just tried calculating the deviation from the mean for one area (public schools) by doing the following:

    Code:
    bys year id: egen ctry_mean = mean(school_expenditure)
    
    gen pdeviation = (ctry_mean - school_expenditure)/school_expenditure
    However, I am unsure how to proceed from here in order to calculate my index as mentioned above. I can provide more examples if needed. I appreciate any help and feedback that I can get.

    Thanks in advance,

    /Keith
    Last edited by Keith Richardson; 16 Apr 2022, 21:07.

  • #2
    There are some unclear points in the question:

    1) In your code you have year and id, but they are not in the sample data. And it does not look like cpr is id, as that would not agree with the egen results.
    2) I'm very confused by the index table:
    Percentage wise deviation from the mean in year T0 Percentage wise deviation from the mean in year T1 Isomorphic Index
    5 2 3
    5 -2 3
    -5 -2 3
    2 6 -4
    2 -2 0
    First the question's formula says T1 - T0. But from the first line it seems you used T0 - T1. Assuming that is T0 - T1. It's unclear how lines 2 can be 3, given 5 - (-2) should be 7. Similarly, line 3 should be -3, and line 5 should be 4. Or did I miss something?

    Regardless, computing that running difference can be achieved like the followings. I'm using T1 - T0, and moving the window on row by row. If it should be done differently then you'd need to clearly state how T1 and T0 is projected to a longer time series.

    Code:
    bysort cpr (aar): gen index = pdeviation - pdeviation[_n-1]

    Comment


    • #3
      Thanks for your comment! I have realized that I forgot to change the variable names in the data example hence the confusion. But "cpr" is indeed id and "aar" is year as you correctly assume. I made the index table to illustrate how I wish to calculate the index but indeed the first column should have been T1 and the second T0. Regarding the index, my thought was that the index values will have a positive sign regardless of whether the municipality is moving towards the mean from above or below if that makes any sense. I have based the calculation of the index on an old study from which I am trying to replicate the method to make the results comparable.

      Nevertheless, thanks for your help and once again I apologize for the confusion! I will remember to read my post thoroughly before posting next time. For now, I will try to use your suggested code and see if I can manage to calculate the index for all my expenditure variables.

      Your help is much appreciated.

      /Keith

      Comment

      Working...
      X