  • Finding the Dissimilarity Index in a multigroup context

    I am trying to calculate the Dissimilarity Index (DI) in a multigroup analysis using Carlos Gradin's code for measuring local segregation, which is based on the method developed by Alonso-Villar & Del Rio (2010). However, the local segregation results from the code do not provide the DI for each group (in my case, male and female).

    I attempted to manually implement the DI formula from Alonso-Villar & Del Rio (2010), but the results I obtained are not within the expected range of 0 to 1. My dataset consists of individual-level observations. Here is the code I used:

    Code:
    gen count = 1

    * Calculate the total number of males and females (JK) in each category (combination of occupation and industry)
    bysort composite_category JK: egen group_in_category = total(count)

    * Calculate overall totals for males and females
    bysort JK: egen group_total = total(count)

    * Proportion of males or females in each category
    gen prop_group_in_category = group_in_category / group_total

    * Calculate the total number of individuals in each category
    bysort composite_category: egen total_in_category = total(count)

    * Calculate the population of the other group
    gen rest_in_category = total_in_category - group_in_category

    * Proportion of the other group in each category
    gen prop_rest_in_category = rest_in_category / (group_total)

    * Calculate the absolute difference between the two proportions
    gen abs_diff = abs(prop_group_in_category - prop_rest_in_category)

    * Sum up the absolute differences across all categories
    bysort composite_category: egen category_abs_diff = total(abs_diff)

    Can anyone help me figure out why my DI results are incorrect?

  • #2
    Apologies, I also included this code at the end:

    Code:
    * Calculate the D-Index for each gender
    gen D_index = total_abs_diff / 2



    • #3
      Once you've calculated the proportions, the index wanted is a sum over categories and not over observations. That is, each category enters the calculation just once.
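
      To spell out the arithmetic behind this point, here is the generic form of a dissimilarity index. The notation is my own and not from the thread: p_j and q_j are the shares of the two distributions being compared that fall in category j, and J is the number of categories. As I read Alonso-Villar & Del Rio (2010), their local version takes q_j to be category j's share of the total population, but please check that against the paper.

      ```latex
      D = \frac{1}{2} \sum_{j=1}^{J} \left| p_j - q_j \right|,
      \qquad \sum_{j=1}^{J} p_j = \sum_{j=1}^{J} q_j = 1
      \quad \Longrightarrow \quad 0 \le D \le 1 .
      ```

      The bound holds only if each set of shares sums to 1 across categories and each category contributes exactly one term. Shares built on the wrong denominator, or terms repeated once per observation, can push the result above 1.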

      Also, watch out for categories that occur in one set but not in the other. For those you want a term that is | zero - positive | or | positive - zero | (same number, naturally) included in the calculation. I haven't studied your code carefully enough to know whether you avoid that pitfall.

      A simplification is that you don't need the variable count at all, as you can do things like this.

      Code:
      bysort composite_category JK: gen group_in_category = _N

      I don't know why you're coding up something already coded in many places.
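
      For what it's worth, the points above could be put together roughly as follows. This is a minimal sketch, not tested on your data. It assumes individual-level observations with the variables composite_category and JK from post #1, and that JK takes exactly two values. The names n_cell, grand_total, prop_group and prop_rest are made up for illustration. It also assumes that the comparison wanted is each group against everyone else, so the other group's counts are divided by the other group's total and both sets of shares sum to 1.

      ```stata
      * Sketch only: assumes variables composite_category and JK as in post #1
      preserve

      * One row per category-by-group cell; the zero option keeps cells
      * that never occur in the data, with a frequency of 0
      contract composite_category JK, zero freq(n_cell)

      * Totals for each group, for each category, and overall
      bysort JK: egen group_total = total(n_cell)
      bysort composite_category: egen total_in_category = total(n_cell)
      egen grand_total = total(n_cell)

      * Share of the group in the category versus share of everyone else.
      * The second denominator is the size of the rest, not of the group.
      gen prop_group = n_cell / group_total
      gen prop_rest  = (total_in_category - n_cell) / (grand_total - group_total)
      gen abs_diff   = abs(prop_group - prop_rest)

      * Each category now enters exactly once per group: sum, then halve
      collapse (sum) D_index = abs_diff, by(JK)
      replace D_index = D_index / 2
      list

      restore
      ```

      With only two groups the two rows should show the same value, because each group's "rest" is simply the other group. If the comparison wanted is instead against the whole population, prop_rest would be replaced by total_in_category / grand_total.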
