Index of qualitative variation

emanuele fedeli

Join Date: Jun 2016

Posts: 7
#1


18 Jun 2016, 04:34

Hello,

I want to use an index of qualitative variation (nominal variables) in Stata. Is there a specific command?

Thank you in advance
Nick Cox

Join Date: Mar 2014

Posts: 35433
#2

18 Jun 2016, 05:00

I think this is one of about 20 or so names in various literatures for a sum of squared proportions. (If you want the complement or reciprocal of that, that's one easy step further.)
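In symbols, a sketch of the three variants mentioned, with p_i the proportion of observations in category i of k categories (H, D and N are labels chosen here for illustration, not notation from this thread):

```latex
H = \sum_{i=1}^{k} p_i^2, \qquad D = 1 - H, \qquad N = \frac{1}{H}
```

H is the sum of squared proportions, D its complement and N its reciprocal.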

See e.g. http://www.statalist.org/forums/foru...index-in-stata

If that doesn't solve the problem please give us an exact definition (equation, not words) and some sample data.
Rich Goldstein

Join Date: Mar 2014

Posts: 4439
#3

18 Jun 2016, 05:00

Never heard of it. Could you supply some more info (e.g., a cite)? There are a lot of user-written commands for different measures of variability.
emanuele fedeli

Join Date: Jun 2016

Posts: 7
#4

20 Jun 2016, 04:36

Hi, my main goal is to measure "variability" for ordinal variables. An example is below.

I tried to apply this formula, but it does not work:

gen b1 = 1-((p1^2)+(p2^2)+(p3^2)+(p4^2)+(p5^2)+(p6^2))

tab education

     *parents' highest education level* |      Freq.     Percent        Cum.
----------------------------------------+-----------------------------------
                   university or higher |      4,192       31.99       31.99
                         post-secondary |      3,020       23.05       55.04
                        upper secondary |      3,106       23.70       78.74
                        lower secondary |      2,508       19.14       97.88
           some primary,lower secondary |        237        1.81       99.69
                         not applicable |         41        0.31      100.00
----------------------------------------+-----------------------------------
                                  Total |     13,104      100.00
Nick Cox

Join Date: Mar 2014

Posts: 35433
#5

20 Jun 2016, 04:56

So, this is already answered in the thread linked within #2 with the information that your index is 1 minus what is there called the Herfindahl-Hirschman index.

Here's how to do it in Stata with Mata:

Code:

sysuse auto, clear
tab rep78, matcell(freq)
mata
freq = st_matrix("freq")
1 - sum((freq :/ sum(freq)):^2)
end

In this case Mata shows .7032136106

In your case, it's a matter of naming education as the variable.
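As a rough check against the table posted in #4, the same calculation can be sketched from the frequencies alone. The numeric codes 1 to 6 and the names n and d_educ are assumptions made for this illustration; the real coding of education was not posted.

```stata
* Rebuild the #4 frequency table as a small dataset
* (codes 1-6 are hypothetical stand-ins for the six categories)
clear
input byte education long n
1 4192
2 3020
3 3106
4 2508
5 237
6 41
end

* Same steps as above, using n as frequency weights
quietly tab education [fweight = n], matcell(freq)
mata
freq = st_matrix("freq")
d = 1 - sum((freq :/ sum(freq)):^2)
st_numscalar("d_educ", d)
end
display "diversity index: " %5.3f scalar(d_educ)
```

With the posted frequencies this comes to about .751. On the original data, `tab education, matcell(freq)` without weights should give the same result.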
Nick Allum

Join Date: Sep 2016

Posts: 1
#6

23 Sep 2016, 10:32

I'd like to get bootstrapped standard errors for this statistic. Is there a way of plugging the Mata code above into a bootstrap command? If not, is there another way of doing it? (I have an experiment where the outcome variable is nominal, and I want to compare treatment and control groups on this statistic.)

Thanks in advance.
Nick
Alan Neustadtl

Join Date: Mar 2014

Posts: 107
#7

23 Sep 2016, 14:37

Originally posted by emanuele fedeli

gen b1 = 1-((p1^2)+(p2^2)+(p3^2)+(p4^2)+(p5^2)+(p6^2))

I have seen this referred to as a diversity index. I believe that the index of qualitative variation is a standardized version of the diversity index that ranges from 0 to 1 so you can compare variability across measures with different numbers of outcome categories. That can be calculated as:

Code:

gen b1 = (k/(k-1))*(1-((p1^2)+(p2^2)+(p3^2)+(p4^2)+(p5^2)+(p6^2)))

Where k is the number of outcome categories.
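A minimal sketch of the standardized version computed from a tabulation rather than from hand-typed proportions, using the auto data as a stand-in. It assumes k should be the number of categories actually observed; if a possible category has zero frequency it will not appear in the table, and k would need to be set by hand. The scalar name iqv_std is made up for this example.

```stata
sysuse auto, clear
quietly tab rep78, matcell(freq)
mata
freq = st_matrix("freq")
k = rows(freq)                        // number of observed categories
d = 1 - sum((freq :/ sum(freq)):^2)   // diversity index
iqv = (k / (k - 1)) * d               // rescaled so the maximum is 1
st_numscalar("iqv_std", iqv)
iqv
end
```

For rep78 there are five observed categories, so the unstandardized value is multiplied by 5/4, giving roughly .879.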

Best,
Alan
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#8

25 Sep 2016, 10:07

The IQV, as many sociologists term it, is available in -divcat- (available from SSC) and is described as the "normalized generalized variance." However, since you have an ordinal variable, I suggest you consider the measures of ordinal dispersion implemented in my module -ordvar-, also available at SSC.

To amplify Nick's comment, I might venture that no other statistic has been (re)invented as many times as the IQV.

Nick Cox

Join Date: Mar 2014
Posts: 35433

28 Sep 2016, 00:49

To answer #6 directly, the answer is surely yes. Here's a minimal code example.

Code:

*! 1.0.0 NJC 28 Sept 2016 
program iqv, rclass  
    syntax varname [if] [in] 
    marksample touse, strok 
    tempname freq iqv 
    qui tab `varlist' if `touse', matcell(`freq') 
    mata: freq = st_matrix("`freq'") 
    mata: st_numscalar("`iqv'", 1  - sum((freq :/ sum(freq)):^2)) 
    di _n  "IQV: " %4.3f `iqv'
    return scalar iqv = `iqv' 
end

. sysuse auto
(1978 Automobile Data)

. bootstrap r(iqv) , reps(100) : iqv rep78
(running iqv on estimation sample)

Warning:  Since iqv is not an estimation command or does not set e(sample),
          bootstrap has no way to determine which observations are used in
          calculating the statistics and so assumes that all observations are
          used.  This means no observations will be excluded from the
          resampling because of missing values or other reasons.

          If the assumption is not true, press Break, save the data, and drop
          the observations that are to be excluded.  Be sure that the dataset
          in memory contains only the relevant data.

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100

Bootstrap results                               Number of obs      =        74
                                                Replications       =       100

      command:  iqv rep78
        _bs_1:  r(iqv)

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   .7032136   .0327004    21.50   0.000      .639122    .7673053
------------------------------------------------------------------------------

. estat bootstrap

Bootstrap results                               Number of obs      =        74
                                                Replications       =       100

      command:  iqv rep78
        _bs_1:  r(iqv)

------------------------------------------------------------------------------
             |    Observed               Bootstrap
             |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _bs_1 |   .70321361  -.0074416   .03270042    .6309599   .7529844  (BC)
------------------------------------------------------------------------------
(BC)   bias-corrected confidence interval

In your case, the approach might need to be extended e.g. to get a CI for the difference of two measures.
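One way such an extension might look is sketched below. This is only an illustration, not a tested recipe: the program name iqvdiff and its group() option are hypothetical, the grouping variable is assumed to be coded 0/1 with observations in both groups, and foreign in the auto data stands in for a treatment indicator.

```stata
* Hypothetical helper: difference in the index between two groups (coded 0/1)
capture program drop iqvdiff
program iqvdiff, rclass
    syntax varname [if] [in], GRoup(varname numeric)
    marksample touse, strok
    markout `touse' `group'
    tempname freq d0 d1
    forvalues g = 0/1 {
        * tabulate within group `g' and compute 1 - sum of squared proportions
        quietly tab `varlist' if `touse' & `group' == `g', matcell(`freq')
        mata: st_numscalar("`d`g''", 1 - sum((st_matrix("`freq'") :/ sum(st_matrix("`freq'"))):^2))
    }
    return scalar diff = `d1' - `d0'
end

sysuse auto, clear
* drop missings first, as the bootstrap warning above recommends
drop if missing(rep78)
* resample within groups so that both are present in every replication
bootstrap diff = r(diff), reps(200) seed(2016) strata(foreign): iqvdiff rep78, group(foreign)
estat bootstrap
```

Resampling within strata(foreign) keeps the group sizes fixed across replications, which seems the natural choice for an experiment with fixed arms, but that is a design assumption rather than something stated in this thread.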
