Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Access r(sum) of Mata function

    Dear Stata users,

    I have rarely used Mata. I want to display r(sum) that returned by Mata function, but the display command gives nothing.

    Code:
    sysuse auto
    tab rep78, matcell(blended)
    mata:  b = st_matrix("blended")
    mata:  f = 100 * b :/ sum(b) 
    mata:  sum( abs(f :-10 ) )/2 
    display as txt "Myers Blended Index = " as result r(sum)
    Code:
    . tab rep78, matcell(blended)
    
         Repair |
    Record 1978 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |          2        2.90        2.90
              2 |          8       11.59       14.49
              3 |         30       43.48       57.97
              4 |         18       26.09       84.06
              5 |         11       15.94      100.00
    ------------+-----------------------------------
          Total |         69      100.00
    
    . 
    . mata:  b = st_matrix("blended")
    
    . mata:  f = 100 * b :/ sum(b) 
    
    . mata:  sum( abs(f :-10 ) )/2 
      32.10144928
    
    . 
    . display as txt "Myers Blended Index = " as result r(sum)
    Myers Blended Index = .

  • #2
    r(sum) is an r-class result that could be returned by any r-class Stata command that defines it. But summarize always returns r(sum) (although it never displays it).

    If you want to display in Stata a scalar calculated in Mata you need to pass it back to Stata, preferably as a numeric scalar.
    Last edited by Nick Cox; 19 Jan 2024, 19:54. Reason: Small correction

    Comment


    • #3
      Thank you very much Nick Cox. I certainly misunderstood the guide in [M-1] Ado -- Using Mata with ado-files.
      Code:
      program varsum
      version 16
      syntax varname [if] [in]
      marksample touse
      mata: calcsum("`varlist'", "`touse'")
      display as txt "  sum = " as res r(sum)
      end
      
      version 16
      mata:
      void calcsum(string scalar varname, string scalar touse)
      {
      real colvector  x
      
      st_view(x, ., varname, touse)
      st_numscalar("r(sum)", colsum(x))
      }
      end

      Comment


      • #4
        Indeed. r(sum) is a quantity in Stata that is pushed there by the Mata code you cite as a numeric scalar.

        colsum(x) is calculated in Mata.

        Its value is sent to Stata as r(sum).

        Your Mata code in #1 hasn't yet got that far.

        I've revised my wording slightly in #2.

        My question: Where does 10 come from in your equation? It looks as if you are comparing a set of percents with a reference situation of equal shares. If so, 10 should be 100 / 5 = 20.
        Last edited by Nick Cox; 19 Jan 2024, 20:00.

        Comment


        • #5
          Got it. It should be 10. The trouble is that you need to include zeros for the digits that don't occur, each of which contributes to the sum.

          Comment


          • #6
            Dear Nick, the code in #1 was borrowed from Joe Canner and modified. See https://www.stata.com/statalist/arch.../msg00154.html And Joe's original code is as follows:
            Code:
            gen lastdigit = mod(age,10)
            tab lastdigit [fw=pop]
            gen mw = 10
            replace mw = age+1 if age < 9
            replace mw = 99-age if age > 89
            replace mw = 0 if age > 98 
            gen combow = pop*mw
            tab lastdigit [fw=combow], matcell(blended)
            mata
                    b = st_matrix("blended")
                    f = 100 * b :/ sum(b)           
                    sum( abs(f :-10 ) )/2 
            end

            Comment


            • #7
              They were discussing about caculation of Myers Blended Index that was introduced to check age heaping. Unfortunately, the web they cited was shut down http://data.princeton.edu/eco572/digitpref.html

              Comment


              • #8
                This is only going to work correctly if all possible last digits 0(1)9 are present. But I guess that's likely if you have a large dataset with many people's ages.

                Comment


                • #9
                  Given digits 0(1)9 and a null expectation of equal frequency then sum(abs(percent of each digit - 10))/2 has an interpretation as the percent that would need to be changed to achieve equal frequencies. For example, if every digit is the same, and so the 9 other digits do not occur, then it is half the sum of one term |100 - 10| and 9 terms |0 - 10| or (90 + 90) / 2 = 90. I know this as the dissimilarity index but there are many other incarnations, going back to Gini if I recall correctly, (Occasionally, it seems that all indexes were invented by Gini but they aren't all the same index.)

                  I'm blithely assumed without rummaging through demographic literature that this is the Myers index too.

                  The key is that digits that do not occur must be included in the calculation.

                  This code is some ways more elaborate, and in some ways less, than anyone might need.

                  Code:
                  *! 1.0.0 NJC 21 Jan 2024 
                  program myers, rclass 
                      version 9 
                      syntax varname(numeric) [if] [in]
                      marksample touse 
                      
                      * integer argument? 
                      assert `varlist' == floor(`varlist') if `touse' 
                      
                      quietly count if `touse'
                      if r(N) == 0 error 2000 
                      
                      tempvar digits 
                      gen `digits' = mod(age, 10)
                      
                      mata : _myers("`digits'", "`touse'") 
                      di _n "Myers index: " %3.1f r(myers) 
                      
                      return scalar myers = r(myers)
                  end 
                  
                  mata : 
                  
                  void _myers(string scalar myvar, string scalar whichobs) {
                      real vector data, counts, percents  
                      real scalar i 
                      data = st_data(., myvar, whichobs)
                          
                      counts = sum(data :== 0)
                      for(i = 1; i <= 9; i++) { 
                          counts = counts \ sum(data :== i)
                      }
                      percents = 100 :* counts :/ sum(counts)
                      
                      " "
                      (0::9) , counts , percents 
                      
                      st_numscalar("r(myers)", sum(abs(percents :- 10))/2) 
                  }
                  
                  end 
                  
                  clear
                  set obs 1000 
                  set seed 2803 
                  gen age = runiformint(1, 99)
                  
                  myers age
                  
                  replace age = round(age,5) if runiform() < 0.1
                  replace age = round(age,10) if runiform() < 0.1  
                  
                  myers age
                  
                  replace age = round(age, 5)
                  
                  myers age
                  Code:
                  . clear
                  
                  . set obs 1000 
                  Number of observations (_N) was 0, now 1,000.
                  
                  . set seed 2803 
                  
                  . gen age = runiformint(1, 99)
                  
                  . myers age
                     
                             1      2      3
                       +----------------------+
                     1 |     0     90      9  |
                     2 |     1    107   10.7  |
                     3 |     2    112   11.2  |
                     4 |     3    109   10.9  |
                     5 |     4    103   10.3  |
                     6 |     5     96    9.6  |
                     7 |     6     86    8.6  |
                     8 |     7    107   10.7  |
                     9 |     8     86    8.6  |
                    10 |     9    104   10.4  |
                       +----------------------+
                  
                  Myers index: 4.2
                  
                  . replace age = round(age,5) if runiform() < 0.1
                  (76 real changes made)
                  . replace age = round(age,10) if runiform() < 0.1  
                  (94 real changes made)
                  
                  . myers age
                     
                             1      2      3
                       +----------------------+
                     1 |     0    224   22.4  |
                     2 |     1     90      9  |
                     3 |     2     88    8.8  |
                     4 |     3     87    8.7  |
                     5 |     4     82    8.2  |
                     6 |     5    121   12.1  |
                     7 |     6     67    6.7  |
                     8 |     7     86    8.6  |
                     9 |     8     67    6.7  |
                    10 |     9     88    8.8  |
                       +----------------------+
                  
                  Myers index: 14.5
                  
                  .. replace age = round(age, 5)
                  (655 real changes made)
                  
                  . 
                  . myers age 
                     
                             1      2      3
                       +----------------------+
                     1 |     0    557   55.7  |
                     2 |     1      0      0  |
                     3 |     2      0      0  |
                     4 |     3      0      0  |
                     5 |     4      0      0  |
                     6 |     5    443   44.3  |
                     7 |     6      0      0  |
                     8 |     7      0      0  |
                     9 |     8      0      0  |
                    10 |     9      0      0  |
                       +----------------------+
                  
                  Myers index: 80.0

                  Comment


                  • #10
                    The line

                    Code:
                    gen `digits' = mod(age, 10)
                    in #9 should probably be

                    Code:
                    gen `digits' = mod(`varlist', 10)
                    Also, a byte would suffice here but that is not relevant to functionality.

                    Comment


                    • #11
                      Good catch.

                      Comment


                      • #12
                        Code:
                        *! 1.0.1 NJC 21 Jan 2024 Daniel Klein fixes 
                        *! 1.0.0 NJC 21 Jan 2024 
                        program myers, rclass 
                            version 9 
                            syntax varname(numeric) [if] [in]
                            marksample touse 
                            
                            * integer argument? 
                            assert `varlist' == floor(`varlist') if `touse' 
                            
                            quietly count if `touse'
                            if r(N) == 0 error 2000 
                            
                            tempvar digits 
                            gen byte `digits' = mod(`varlist', 10)
                            
                            mata : _myers("`digits'", "`touse'") 
                            di _n "Myers index: " %3.1f r(myers) 
                            
                            return scalar myers = r(myers)
                        end 
                        
                        mata : 
                        
                        void _myers(string scalar myvar, string scalar whichobs) {
                            real vector data, counts, percents  
                            real scalar i 
                            data = st_data(., myvar, whichobs)
                                
                            counts = sum(data :== 0)
                            for(i = 1; i <= 9; i++) { 
                                counts = counts \ sum(data :== i)
                            }
                            percents = 100 :* counts :/ sum(counts)
                            
                            " "
                            (0::9) , counts , percents 
                            
                            st_numscalar("r(myers)", sum(abs(percents :- 10))/2) 
                        }
                        
                        end

                        Comment


                        • #13
                          Thank you so much Nick Cox and daniel klein. You are so kind and thoughtful.

                          Comment


                          • #14
                            I should add a supplement here. I found a myers.ado written by German Rodriguez at Princeton University. Hope this will be helpful to further the discussion of Myers Blended Index.

                            Code:
                            *! myers version 0.5 GR 2 Feb 2006 (Myers blended infex of digit preference)
                            program define myers, rclass
                            version 9
                                syntax varname [if] [in] [fweight] ///
                                    [, Range(numlist asc int min=2 max=2) months Generate(name)]
                                if ("`generate'" != "") {
                                    confirm new var `generate'
                                }
                                local age `varlist'
                                local unit = 10
                                if ("`months'" != "") {
                                    local unit = 12
                                }
                            
                                // range of values to be used
                                quietly sum `age' `if' `in'
                                local bot = r(min)
                                local top = r(max) - `unit'
                                local max = r(max)
                                if ("`range'" != "") {
                                    gettoken bot top : range
                                    if (`bot' < r(min) | `top'> r(max)) {
                                        di as error "Range `bot' to `top' extends outside valid values"
                                        error 121
                                    }
                                } 
                                if abs(mod(`top'-`bot'+1,`unit')) > 0.01 {
                                    di as error "Range `bot' to `top' should span a multiple of `unit'"        
                                    error 121
                                }
                            // ------------------ Myers' Blended Index -------------------
                            
                                // last digit and Myers' weight
                                tempname lastdigit mwgt matcell myers
                                gen `lastdigit' = mod(`age',`unit')
                                label var `lastdigit' "Last digit"
                                gen `mwgt' = `unit'
                                quietly replace `mwgt' = `age' -`bot'+1  if `age' < `bot'+ `unit'
                                quietly replace `mwgt' = `top' + `unit' - `age' if `age' > `top'
                                quietly replace `mwgt' = 0 if `age' < `bot' | `age' > `top' + `unit' - 1
                                //debug    list `age' `mwgt' 
                                
                                // multiply Myers's and user's weights
                                if ("`exp'" != "") {
                                    local fweight = substr("`exp'",2,.) // eliminate equal sign
                                    quietly replace `mwgt' = `mwgt' * `fweight'
                                }
                            
                                // tabulate and compute index
                                tabulate `lastdigit' `if' `in' [fw=`mwgt'], matcell(`matcell')
                            mata: f = st_matrix("`matcell'")
                            mata: m = 100*sum(abs( f/sum(f) :- 1/`unit'))/2 
                            mata: st_numscalar("`myers'", m )
                            
                                // print results
                                di
                                di "{text}Myers' Blended Index = {result}" `myers'
                                if (`top' + `unit' - 1 > `max') {
                                    di "{text}(Assuming zero frequencies for values " `max'+1 " to " `top'+`unit'-1 ")"
                                } 
                            
                                // saved results
                                if ("`generate'" != "") {
                                    gen `generate' = `mwgt'
                                    label var `generate' "Myers's blended weight times `fweight'"
                                }
                                return scalar myers = `myers'
                            end

                            Comment


                            • #15
                              And here is another resource: https://userforum.dhsprogram.com/ind...art=0&S=Google. And in the last page there is an attachment Myerstest_DHS_data_mlogit_do_24Mar2021.txt

                              Comment

                              Working...
                              X