  • Gini coefficient by group and year

    Greetings,

    Hope everyone is keeping safe.

    I would like to compute the Gini coefficient for a large number of regions by year.

    I would like to store the result in a new variable and have tried the following loop:

    Code:
    foreach yr of numlist 1970/1972 {
        gen gini = . 
        qui ineqdeco income if year==`yr', by(group)  
        replace gini = $S_gini if year==`yr' 
    }
    However, I get the following error:

    Code:
    variable gini already defined
    Any help will be highly appreciated. Thank you and keep safe!

    Best,

    Chiara

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte group int(year income)
    1 1970 1200
    1 1970  720
    1 1970 2160
    2 1970 2400
    2 1970 1440
    2 1970 4800
    3 1970 3480
    3 1970 2616
    3 1970    0
    1 1971  360
    1 1971 1200
    1 1971  960
    2 1971 2760
    2 1971 3600
    2 1971    0
    3 1971 6000
    3 1971  960
    3 1971 3600
    1 1972  600
    1 1972 2472
    1 1972  600
    2 1972 1200
    2 1972 4944
    2 1972 7944
    3 1972 6000
    3 1972 2400
    3 1972    0
    end

  • #2
    Put the generate command before the loop. Inside the loop it is executed on every iteration, and the second iteration fails because gini already exists.

    Code:
    gen gini = .
    foreach yr of numlist 1970/1972 {
        qui ineqdeco income if year==`yr', by(group)
        replace gini = $S_gini if year==`yr'
    }



    • #3
      Note that you can refer to "r(gini)" rather than "$S_gini". (The global is retained only for the convenience of users of very old versions of Stata.) To see the full list of saved results, type -return list- after running -ineqdeco-. (Make sure you have the latest version of -ineqdeco- and of its sibling -ineqdec0-; both are on SSC.)
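
      A minimal sketch of the point above, assuming the example data from #1 are in memory and -ineqdeco- is installed from SSC. The per-group name r(gini_1) follows the "return scalar gini_`k'" pattern in the -ineqdeco- code; check the output of -return list- on your own installation before relying on it.

      Code:
      quietly ineqdeco income if year == 1970, by(group)
      * show everything -ineqdeco- left behind in r()
      return list
      * Gini for the whole 1970 subsample
      display r(gini)
      * Gini for group 1 within 1970
      display r(gini_1)

      With by(group), r(gini) refers to the whole estimation sample, so assigning it to every observation in a year does not give group-specific coefficients.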



      • #4
        Thank you, Andrew Musau! My apologies for the mistake.

        Dear Professor Stephen Jenkins, thank you for the suggestion and of course for your brilliant code.

        The code now runs, but I get the following error:

        Code:
        too many values
        r(134);
        Is there anything that can be done without slicing the dataset? Thank you again.

        Best,

        Chiara



        • #5
          You have not given us enough information to diagnose the problem. How many distinct values does the "group" variable take? If the number is "extremely large", you may hit limits with -levelsof-, which my programs use. According to -help levelsof-:

          levelsof may hit the limits imposed by your Stata. However, it is typically used when the
          number of distinct values of varname is not extremely large.
          The limit might also depend on the version of Stata you have. I don't know, but I raise it as a possibility because there were some changes to -levelsof- during the lifetime of Stata 15.

          You could -set trace on- and run your code, and see where the error arises. There will be a huge amount of output so, if you post it, please post only a relevant extract.
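
          Both checks can be sketched as follows, assuming the variable names used in this thread (group, year, income). The names n_tag and trace_extract.log are placeholders, and the tracedepth value is only a guess at a useful level of detail: it should restrict the trace to -ineqdeco- itself rather than the programs it calls.

          Code:
          * count the distinct values of group
          egen byte n_tag = tag(group)
          count if n_tag
          drop n_tag

          * trace a single year and keep the output in a text log
          log using trace_extract.log, text replace
          set tracedepth 1
          set trace on
          capture noisily ineqdeco income if year == 1970, by(group)
          set trace off
          log close

          Using -capture noisily- lets the do-file carry on after an error, so that tracing is switched off and the log is closed.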



          • #6
            Thank you for your reply. The group variable has 6,500 distinct values.

            I think the following is the relevant extract from the trace:

            Edit: I am using Stata 15MP

            Apologies if it is still too long. Thank you again.

            Best,

            Chiara

            Code:
              - noi {
              - di "  "
              - tabdisp `bygroup' if `first' , c(`vk' `meanyk' `lambdak' `thetak' `lgmeank') f(%15.5f)
             = tabdisp group if __00001O , c(__00000B __00000D __00000F __00000I __00000H) f(%15.5f)
            too many values
                di "  "
                di as txt "Subgroup indices: GE_k(a) and Gini_k "
                tabdisp `bygroup' if `first' , c(`im1k' `i0k' `i1k' `i2k' `ginik') f(%9.5f)
                }
                capture levelsof `bygroup' if `touse' , local(group)
                qui if _rc levels `bygroup' if `touse' , local(group)
                return local levels "`group'"
                gsort -`first' `bygroup'
                local i = 1
                foreach k of local group {
                return scalar gem1_`k' = `im1k'[`i']
                return scalar ge0_`k' = `i0k'[`i']
                return scalar ge1_`k' = `i1k'[`i']
                return scalar ge2_`k' = `i2k'[`i']
                return scalar gini_`k' = `ginik'[`i']
                return scalar mean_`k' = `meanyk'[`i']
                return scalar lgmean_`k' = `lgmeank'[`i']
                return scalar theta_`k' = `thetak'[`i']
                return scalar lambda_`k' = `lambdak'[`i']
                return scalar v_`k' = `vk'[`i']
                return scalar sumw_`k' = `nk'[`i']
                local ++i
                }
                drop `lgmeank' `ginik' `thetak' `nk' `pyk'
                egen double `withm1' = sum(`fi' * `im1k' / `lambdak') if `touse'
                egen double `with0' = sum(`fi' * `i0k') if `touse'
                egen double `with1' = sum(`fi' * `i1k' * `lambdak') if `touse'
                egen double `with2' = sum(`fi' * `i2k' * `lambdak'^2) if `touse'
                lab var `withm1' "GE(-1)"
                lab var `with0' "GE(0)"
                lab var `with1' "GE(1)"
                lab var `with2' "GE(2)"
                noi {
                di "  "
                di as txt "Within-group inequality, GE_W(a)"
                tabdisp `touse' in 1 if `touse', c(`withm1' `with0' `with1' `with2') f(%9.5f)
                }
                return scalar within_gem1 = `withm1'[1]
                return scalar within_ge0 = `with0'[1]
                return scalar within_ge1 = `with1'[1]
                return scalar within_ge2 = `with2'[1]
                drop `im1k' `i0k' `i1k' `i2k' `withm1' `with0' `with1' `with2'
                egen double `im1b' = sum(`fi' * ((`meany' / `meanyk') - 1) / 2 ) if `touse'
                egen double `i0b' = sum(`fi' * log(`meany' / `meanyk')) if `touse'
                egen double `i1b' = sum(`fi' * (`meanyk' / `meany') * log(`meanyk' / `meany')) if `touse'
                egen double `i2b' = sum(`fi' * (((`meanyk' / `meany')^2) - 1) / 2) if `touse'
                lab var `im1b' "GE(-1)"
                lab var `i0b' "GE(0)"
                lab var `i1b' "GE(1)"
                lab var `i2b' "GE(2)"
                noi {
                di "              "
                di as txt "Between-group inequality, GE_B(a):"
                tabdisp `touse' in 1 if `touse' , c(`im1b' `i0b' `i1b' `i2b') f(%9.5f)
                }
                return scalar between_gem1 = `im1b'[1]
                return scalar between_ge0 = `i0b'[1]
                return scalar between_ge1 = `i1b'[1]
                return scalar between_ge2 = `i2b'[1]
                drop `im1b' `i0b' `i1b' `i2b'
                sort `notuse' `bygroup'
                by `notuse' `bygroup': egen double `edehalfk' = sum(`fik' * sqrt(`inc')) if `touse'
                replace `edehalfk' = (`edehalfk')^2 if `touse'
                gen double `ahalfk' = 1 - `edehalfk' / `meanyk' if `touse'
                by `notuse' `bygroup': egen double `ede1k' = sum(`fik' * log(`inc')) if `touse'
                replace `ede1k' = exp(`ede1k') if `touse'
                gen `a1k' = 1 - `ede1k' / `meanyk' if `touse'
                by `notuse' `bygroup': egen double `ede2k' = sum(`fik' / `inc') if `touse'
                replace `ede2k' = 1 / `ede2k' if `touse'
                gen double `a2k' = 1 - `ede2k' / `meanyk' if `touse'
                lab var `ahalfk' "A(0.5)"
                lab var `a1k' "A(1)"
                lab var `a2k' "A(2)"
                noi {
                di "              "
                di as txt "Subgroup Atkinson indices, A_k(e)"
                tabdisp `bygroup' if `first' , c(`ahalfk' `a1k' `a2k') f(%9.5f)
                }
                egen double `awithh' = sum(`fi' * `lambdak' * `ahalfk') if `touse'
                egen double `awith1' = sum(`fi' * `lambdak' * `a1k') if `touse'
                egen double `awith2' = sum(`fi' * `lambdak' * `a2k') if `touse'
                lab var `awithh' "A(0.5)"
                lab var `awith1' "A(1)"
                lab var `awith2' "A(2)"
                noi {
                di "  "
                di as txt "Within-group inequality, A_W(e)"
                tabdisp `touse' if `touse' , c(`awithh' `awith1' `awith2') f(%9.5f)
                }
                gsort -`first' `bygroup'
                local i = 1
                foreach k of local group {
                return scalar ahalf_`k' = `ahalfk'[`i']
                return scalar a1_`k' = `a1k'[`i']
                return scalar a2_`k' = `a2k'[`i']
                local ++i
                }
                return scalar within_ahalf = `awithh'[1]
                return scalar within_a1 = `awith1'[1]
                return scalar within_a2 = `awith2'[1]
                drop `ahalfk' `a1k' `a2k' `awithh' `awith1' `awith2' `lambdak'
                egen double `ahalfb' = sum(`fi' * `edehalfk' ) if `touse'
                replace `ahalfb' = 1 - `edehalf' / `ahalfb' if `touse'
                egen double `a1b' = sum(`fi' * `ede1k' ) if `touse'
                replace `a1b' = 1 - `ede1' / `a1b' if `touse'
                egen double `a2b' = sum(`fi' * `ede2k' ) if `touse'
                replace `a2b' = 1 - `ede2' / `a2b' if `touse'
                lab var `ahalfb' "A(0.5)"
                lab var `a1b' "A(1)"
                lab var `a2b' "A(2)"
                noi {
                di " "
                di as txt "Between-group inequality, A_B(e)"
                tabdisp `touse' in 1 if `touse' , c(`ahalfb' `a1b' `a2b') f(%9.5f)
                }
                return scalar between_ahalf = `ahalfb'[1]
                return scalar between_a1 = `a1b'[1]
                return scalar between_a2 = `a2b'[1]
                return scalar edehalf = `edehalf'[1]
                return scalar ede1 = `ede1'[1]
                return scalar ede2 = `ede2'[1]
                drop `ahalfb' `a1b' `a2b' `edehalf' `ede1' `ede2'
                if "`w'" == "w" {
                lab var `edehalfk' "Yede(0.5)"
                lab var `ede1k' "Yede(1)"
                lab var `ede2k' "Yede(2)"
                noi {
                di "              "
                di as txt "Subgroup equally-distributed-equivalent income, Yede_k(e)"
                tabdisp `bygroup' if `first', c(`edehalfk' `ede1k' `ede2k') f(%15.5f)
                }
                gsort -`first' `bygroup'
                local i = 1
                foreach k of local group {
                return scalar edehalf_`k' = `edehalfk'[`i']
                return scalar ede1_`k' = `ede1k'[`i']
                return scalar ede2_`k' = `ede2k'[`i']
                local ++i
                }
                drop `edehalfk' `ede1k' `ede2k'
                sort `notuse' `bygroup'
                by `notuse' `bygroup': egen double `whalfk' = sum(`fik' * sqrt(`inc') * 2) if `touse'
                by `notuse' `bygroup': egen double `w1k' = sum(`fik' * log(`inc') ) if `touse'
                by `notuse' `bygroup': egen double `w2k' = sum(-`fik' / `inc') if `touse'
                lab var `whalfk' "W(0.5)"
                lab var `w1k' "W(1)"
                lab var `w2k' "W(2)"
                noi {
                di "              "
                di as txt "Subgroup welfare indices: W_k(e) and Sen's index"
                tabdisp `bygroup' if `first', c(`whalfk' `w1k' `w2k' `wginik') f(%15.5f)
                }
                gsort -`first' `bygroup'
                local i = 1
                foreach k of local group {
                return scalar whalf_`k' = `whalfk'[`i']
                return scalar w1_`k' = `w1k'[`i']
                return scalar w2_`k' = `w2k'[`i']
                return scalar wgini_`k' = `wginik'[`i']
                local ++i
                }
                drop `whalfk' `w1k' `w2k'
                }
                drop `wginik' `fi'
                }
                }
              -------------------------------------------------------------------------------------------------------------------- end ineqdeco ---
              replace gini = r(gini) if year==`yr'
              }
            r(134);
             
            end of do-file
             
            r(134);
             
            .
            Last edited by Chiara Piazzo; 04 Apr 2020, 12:34.



            • #7
              This is the solution I came across, using -runby- by Robert Picard. Is this correct?

              Code:
              gen gini = .
              * drop any earlier definition so the do-file can be rerun
              capture program drop do_it
              program do_it
                  qui ineqdeco income
                  replace gini = r(gini)
              end

              runby do_it, by(group year) verbose
              Thanks and regards!

              Chiara



              • #8
                If all you want to calculate is the Gini coefficient for each group-year combination, then using -runby- is a fine solution. Things are trickier if you want to decompose the Gini coefficient into within-group, between-group and overlapping components (cf. -ineqdecgini- on SSC).

                Notice that -ineqdeco- fails in your case because -tabdisp- has a limit on the number of rows it will produce, much as -tabulate- cannot tabulate a variable with 'too many' categories. (See the discussion of limits in the manual entry for -tabdisp-.) 6,500 is a very large number of categories for tabulation programmes!
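
                If the subgroup tables are not needed, the coefficient can also be computed without any tabulation. The lines below are a sketch only, not what -ineqdeco- does internally: they apply the standard rank formula G = 2*sum(i*y_i)/(n*sum(y)) - (n+1)/n to unweighted data, assume there are no missing incomes, and keep zero incomes, which, as far as I know, -ineqdeco- drops. The variable names cum_y, cum_iy and gini_direct are made up for the example.

                Code:
                * running totals of income and of rank*income within each group-year cell
                bysort group year (income): gen double cum_y = sum(income)
                by group year: gen double cum_iy = sum(_n * income)
                * the last running total in each cell is the cell total
                by group year: gen double gini_direct = 2 * cum_iy[_N] / (_N * cum_y[_N]) - (_N + 1) / _N
                drop cum_y cum_iy

                For group 1 in 1970 in the example data (incomes 720, 1200 and 2160) the formula gives 2*9600/(3*4080) - 4/3, which is about 0.235.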

