
  • difference between alpha, gen() and using egen rowmean()?

    Hi all,

    I've been under the impression, from the Stata manual entry for the alpha command, that the generate() option sums the values over the listed items and divides the sum by the number of items. However, I'm getting different results when I create a scale with alpha's generate() option and when I create it manually by averaging the values over the scale items. I provide a sample of the data below. For example, alpha's generate() option gives me a questscore of 2.5 for id=111, whereas egen's rowmean() function produces a questscore of 3.

    Here's what I'm doing. Am I missing something about the way the alpha, gen() command works?


    Code:
    clear
    input int id long(quest1 quest2 quest3 quest4 quest5 quest6 quest7 quest8 quest9 quest10 quest11 quest12)
    111 3 3 3 3 3 3 3 3 3 3 3 3
    112 3 3 3 4 3 5 5 3 1 3 4 4
    113 2 4 4 3 2 2 3 2 3 2 5 2
    114 3 4 2 1 3 2 2 3 5 2 3 3
    115 3 3 3 3 3 3 3 3 3 3 3 3
    116 3 3 2 3 3 3 2 3 2 3 2 4
    117 4 3 3 2 4 4 3 5 3 3 4 4
    118 3 3 3 3 3 1 1 3 1 4 5 3
    119 3 2 3 3 3 3 3 3 3 3 3 3
    120 3 3 3 3 3 3 3 3 3 3 3 3
    end
    
    local q "quest1 quest2 quest3 quest4 quest5 quest6 quest7 quest8 quest9 quest10 quest11 quest12"
    alpha `q',   gen(questscore) 
    
    gen quest7r = 6-quest7 //this item is reverse coded in the original
    gen quest11r = 6-quest11 //this item is reverse coded in the original
    local q "quest1 quest2 quest3 quest4 quest5 quest6 quest7r quest8 quest9 quest10 quest11r quest12"
    egen questscore_e = rowmean(`q')

  • #2
    not quite - -alpha- with the gen option divides by the number of items that are non-missing; -egen- treats missings as 0 and thus the results from the 2 will differ



    • #3
      Hi Rich,

      Thank you for the prompt answer. I don't have any missing values on these 12 variables for any observation. So, I'm wondering why the outcome would be different still.

      Kerby



      • #4
        Well, I did the following: I copied what you did, but found that 5 variables were reversed:
        Code:
        . local q "quest1 quest2 quest3 quest4 quest5 quest6 quest7 quest8 quest9 quest10 quest11 quest12"
        
        . alpha `q',   gen(questscore)
        
        Test scale = mean(unstandardized items)
        Reversed items:  quest2 quest3 quest4 quest9 quest11
        
        Average interitem covariance:     .1018519
        Number of items in the scale:           12
        Scale reliability coefficient:      0.6988
        So why were 5 items reversed when I did it but, apparently, only 2 in your report (7, which is not reversed in the output above, and 11)? Possibly your full data set is much larger and only 7 and 11 are reversed in the larger data set?

        More importantly, I looked at the "Methods and formulas" section of the manual and saw that the scale sum from -alpha- is not at all what -egen- computes; there is no reason to expect them to be the same.



        • #5
          Aah, I see. The summation in the generate() option of -alpha- treats the values of the reversed items as negative, subtracting rather than adding them. I had thought that it reverse coded the item itself, so that it added a 1 instead of a 5, for instance. With -egen- and rowmean(), all items are added, obviously. This detail was not obvious from the description in the options section of the manual.

          Yes, 7 and 11 are reversed in the larger dataset by construction. I realized that error in the code I posted after the fact.

          Thanks for attending to my question!

          Kerby



          • #6
            Originally posted by Rich Goldstein
            not quite - -alpha- with the gen option divides by the number of items that are non-missing; -egen- treats missings as 0 and thus the results from the 2 will differ
            So if I were to use the 'egen' command instead of the 'alpha' command, would all my missing values be computed as 0? What if I have already coded my missing values as '.'?



            • #7
              No, -egen newvar = rowmean(varlist)- does not treat missings as zero; it divides the sum of the nonmissing values by the number of nonmissing values. A zero and a missing value therefore give different results, as this simple example shows:
              Code:
              clear
              input v1 v2 v3 v4
               1 2 3 4
               1 0 3 4
               1 . 3 4
              end
              
              egen score = rowmean(v1 v2 v3 v4)
              list
              The result will be
              Code:
                   +------------------------------+
                   | v1   v2   v3   v4      score |
                   |------------------------------|
                1. |  1    2    3    4        2.5 |
                2. |  1    0    3    4          2 |
                3. |  1    .    3    4   2.666667 |
                   +------------------------------+
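
The table can also be rebuilt by hand, which makes the denominator explicit. This is only a sketch: the variable names total, nvalid, byhand and naive are made up for illustration. It relies on rowtotal() counting a missing value as 0 and rownonmiss() counting the nonmissing items, so total/nvalid should reproduce rowmean(), while total/4 shows what "missing treated as 0" would give instead.

```stata
clear
input v1 v2 v3 v4
 1 2 3 4
 1 0 3 4
 1 . 3 4
end

* rowmean(): sum of nonmissing values / number of nonmissing values
egen score = rowmean(v1 v2 v3 v4)

* rowtotal() counts a missing value as 0 when summing
egen total = rowtotal(v1 v2 v3 v4)

* rownonmiss() counts the nonmissing items in each row
egen nvalid = rownonmiss(v1 v2 v3 v4)

* same denominator as rowmean()
gen byhand = total / nvalid

* fixed denominator: this is what "missing as 0" would amount to
gen naive = total / 4

list
```

In the third row, byhand matches score (8/3), whereas naive is 2, the same value the second row gets with a real 0.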
              To your comment in #5: -alpha- can't reverse code your items as you intend it to do (you want to recode the values 1 to 5 into 5 to 1) because it can't anticipate the "theoretical" endpoints of your scale (theoretical because empirically they may range between 2 and 4 although you are using a Likert scale ranging from 1 to 5). Hence -alpha- simply subtracts the values of variables that correlate negatively with the total score.
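
To make the difference between subtracting an item and reverse coding it concrete, here is a small sketch on a hypothetical 3-item scale running from 1 to 5 (x1 to x3 and the new variable names are invented for illustration). It does not call -alpha-; it only mimics the sign flip described in #5 and above, assuming x3 is the reversed item, and compares it with reverse coding by hand.

```stata
clear
input x1 x2 x3
 3 3 3
 5 4 1
end

* sign flip: the reversed item is subtracted instead of added
gen flipped = (x1 + x2 - x3) / 3

* reverse coding with the theoretical endpoints 1 and 5, then averaging
gen x3r = 6 - x3
egen recoded = rowmean(x1 x2 x3r)

* with no missing values the two scores differ by a constant (here 6/3 = 2)
gen diff = recoded - flipped

list
```

Without missing values the two scores differ only by a constant shift, so they rank observations identically, but only the reverse-coded version stays on the original 1 to 5 metric.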

              By the way: you should add the -item- option to the -alpha- command in order to see which items you should reverse code, or which items will be subtracted from the total score if you use the gen() option.
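
A possible workflow along these lines, sketched with the example data from #1: inspect the signs with -item-, reverse code by hand, and then use the -asis- option of -alpha- so that the signs are taken as they are. The name questscore_asis is made up here, and the expectation that it agrees with the rowmean() result is an assumption that holds only if -alpha- reverses nothing under -asis-, so check the listing rather than taking it for granted.

```stata
clear
input int id long(quest1 quest2 quest3 quest4 quest5 quest6 quest7 quest8 quest9 quest10 quest11 quest12)
111 3 3 3 3 3 3 3 3 3 3 3 3
112 3 3 3 4 3 5 5 3 1 3 4 4
113 2 4 4 3 2 2 3 2 3 2 5 2
114 3 4 2 1 3 2 2 3 5 2 3 3
115 3 3 3 3 3 3 3 3 3 3 3 3
116 3 3 2 3 3 3 2 3 2 3 2 4
117 4 3 3 2 4 4 3 5 3 3 4 4
118 3 3 3 3 3 1 1 3 1 4 5 3
119 3 2 3 3 3 3 3 3 3 3 3 3
120 3 3 3 3 3 3 3 3 3 3 3 3
end

* -item- shows each item with its sign, so reversed items are visible
alpha quest1-quest12, item

* reverse code by hand using the theoretical endpoints 1 and 5
gen quest7r = 6 - quest7
gen quest11r = 6 - quest11
local q "quest1 quest2 quest3 quest4 quest5 quest6 quest7r quest8 quest9 quest10 quest11r quest12"

* -asis- takes the sign of each item as is (no automatic reversal)
alpha `q', asis item gen(questscore_asis)

* plain row mean of the same items, for comparison
egen questscore_e = rowmean(`q')

list id questscore_asis questscore_e
```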
