  • calculating Cronbach Alpha

    I am establishing the psychometric properties of a measurement tool that I created. One component is estimating Cronbach's alpha for the tool. I use the following code for it (the variables listed are all the variables in the tool):

    alpha dignty_1cc dignty_2cc dignty_4cc autonomy_1cc autonomy_2cc autonomy_3cc autonomy_4cc autonomy_5b autonomy_6b autonomy_7b confid_1b confid_2cc confid_3cc qual_1b qual_3cc qual_4cc qual_5cc qual_6b qual_7cc qual_8cc qual_9b qual_10b qual_11cc qual_12cc support_1cc support_2b support_3cc support_4cc choice_1b choice_2b choice_3cc choice_5b access_2cb access_3cc access_4cc access_6cb communicate_1cb communicate_2b communicate_3cc communicate_4cc communicate_5cc finance_1b finance_2b finance_3b finance_4b continuity_2b continuity_4b continuity_5cc

    I get the error: "cannot determine the sense empirically; must specify option asis or reverse()
    r(459);"

    When I add the "asis" option, it says "no observations, r(2000)".

    However, when I remove the variable communicate_1cb, the same code runs and gives me the Cronbach's alpha value. I am not sure what is going on. I cannot leave out communicate_1cb, as it is part of the tool. Please advise. Thank you.

    regards
    Meesha

  • #2
    You don't show any example data, so the best anyone can do to help is make a guess about what's going wrong. Something apparently is wrong with the variable communicate_1cb. I can see two likely possibilities:
    1. It is a string variable. -alpha- requires all the variables be numeric.
    2. It has missing values in all observations.

    To check these possibilities, let's deal with 2 first. Run -assert missing(communicate_1cb)-. If communicate_1cb has missing values in all observations, this command will give you no output at all. And then you know the source of the problem--you have no data for this variable. The solution will be to get the data from somewhere--perhaps during the data management that created your data set something was done that wiped out the contents of this variable. If, however, the -assert- command gives you an error message "assertion is false," then there are at least some non-missing values. In that case, that isn't the problem, so you should proceed to the next step.
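
    A minimal sketch of this check on made-up data follows. The variable item_x and its values are hypothetical stand-ins for communicate_1cb. -capture- is used only so that a failed -assert- does not stop a do-file, and -count- is added to show how many observations are missing:

    ```stata
    clear
    input float item_x
    1
    .
    3
    .
    5
    end

    * How many observations are missing on this item?
    count if missing(item_x)

    * -assert- is silent when the condition holds in every observation;
    * -capture- stores the return code in _rc instead of halting with an error
    capture assert missing(item_x)
    if _rc == 0 {
        display "item_x is missing in every observation"
    }
    else {
        display "item_x has at least some non-missing values"
    }
    ```

    On real data, replace item_x with communicate_1cb and skip the -input- block.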

    Run -des communicate_1cb-. If it is a string variable, the second column in the output table from that command will say str# (where # is some number) or strL. Either of those tells you it is a string. So you will have to convert it to a numeric variable. The correct way to do that depends on what the contents of that variable are. Run -browse communicate_1cb- and visually look at its values. If they look to your eyes like numbers, then use the -destring- command. (-help destring- if you don't know how to use that one.) If, however, it looks like words, e.g. "Strongly Agree", or "Never", etc., then you need to use -encode-. But be careful with -encode-. Cronbach's alpha expects the variables it processes to be not only numeric but really quantitative. But -encode-, designed with purely categorical (nominal) variables in mind, by default, assigns numbers to the text values in alphabetical (actually ASCII) sort order, which will generally be the wrong thing to do in your context. So you will need to first create a value label that translates the text values to numbers in an appropriate numerical order and use that label in the -encode- command's -label()- option.
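
    The -encode- caveat can be illustrated with a small sketch. Everything here is an assumption made for illustration: the variable item_txt, the five response texts, and the label name agree5 are hypothetical, and the numeric order in the label must be adapted to the actual response scale:

    ```stata
    clear
    input str17 item_txt
    "Agree"
    "Strongly Agree"
    "Disagree"
    "Neutral"
    "Strongly Disagree"
    end

    * The storage type column shows str17, so this is a string variable
    describe item_txt

    * Default -encode-: codes follow alphabetical order, not the response order
    encode item_txt, generate(item_default)

    * Define the intended ordering first, then tell -encode- to use it;
    * noextend makes -encode- stop if a text value is not in the label
    label define agree5 1 "Strongly Disagree" 2 "Disagree" 3 "Neutral" 4 "Agree" 5 "Strongly Agree"
    encode item_txt, generate(item_ordered) label(agree5) noextend

    * Compare the underlying numeric codes side by side
    list item_txt item_default item_ordered, nolabel
    ```

    If the values are digits stored as text rather than words, -destring item_txt, replace- is the simpler route and no label is needed.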

    If these suggestions don't resolve your problem, please post back with an example of your data. You have a large number of variables, too large to properly post an example of all of them. But make sure you include the problematic communicate_1cb variable among them. Also verify that the example data you post reproduces the same problem you are having. And, finally, since people will need to work with the data, it is crucial that you post it in a usable form. The most usable form of data posting here is with the -dataex- command.* If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    *Whatever you do, don't post a screenshot, and don't attach a file.
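
    As a rough illustration of what a -dataex- call looks like, here is a sketch that uses the auto data set shipped with Stata as a stand-in. The choice of variables and the observation range are arbitrary; with real data, list only the variables and observations needed to reproduce the problem:

    ```stata
    sysuse auto, clear

    * Restricting the variable list and the observation range keeps the example short
    dataex make price mpg in 1/5
    ```

    The output can then be copied from the Results window and pasted into the post as is.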

    • #3
      Dear Clyde,
      I appreciate your detailed response. I have checked both things you mentioned. All variables are float variables; there is no string variable. I also checked for missing data: 29% of the observations are missing for communicate_1cb. I am not sure if this is the reason, since other variables have missing data too.
      I tried to use the "dataex" command to post here, as you suggested. However, it gives me an error saying "input statement exceeds linesize limit. Try specifying fewer variables r(1000)". I am not sure how to post here without screenshots, attaching a file, or using the dataex command. Thank you again.
      Regards
      Meesha

      • #4
        I forgot to mention that if I give dataex fewer variables (including the problematic variable), I do not get the error. Alpha also works with fewer variables. It is only when I include them all together that I get the error.

        • #5
          OK, I checked the source code in alpha.ado, and I think I know what the problem is. Alpha uses listwise deletion to form the estimation sample. That is, whenever an observation has a missing value in any of the variables in the scale whose reliability you are calculating, that observation is excluded from the estimation sample. So, what must be happening is that you have just enough missing values scattered among the variables in such a way that when you try to include all the variables, every observation in the data set has a missing value in at least one of them. As it happens, when you remove some of the variables from the command, you are left with a few observations that have complete data and so you get a result.

          You can test my hypothesis as follows:

          Code:
          egen int mcount = rowmiss(dignty_1cc dignty_2cc dignty_4cc autonomy_1cc autonomy_2cc autonomy_3cc autonomy_4cc autonomy_5b autonomy_6b autonomy_7b confid_1b confid_2cc confid_3cc qual_1b qual_3cc qual_4cc qual_5cc qual_6b qual_7cc qual_8cc qual_9b qual_10b qual_11cc qual_12cc support_1cc support_2b support_3cc support_4cc choice_1b choice_2b choice_3cc choice_5b access_2cb access_3cc access_4cc access_6cb communicate_1cb communicate_2b communicate_3cc communicate_4cc communicate_5cc finance_1b finance_2b finance_3b finance_4b continuity_2b continuity_4b continuity_5cc)
          assert mcount > 0
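          The first command gives a count of the number of variables in each observation that have a missing value. The -assert- command verifies that this number is always greater than 0. If -assert- produces no output, every observation is missing at least one of the variables, which is what the hypothesis predicts. If it reports "assertion is false," at least one observation has complete data and the explanation lies elsewhere.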
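
          A small self-contained sketch of the same check on made-up data may make the pattern easier to see. The items q1, q2 and q3 and their values are hypothetical. Each item is only partly missing, yet no observation is complete. The sketch only counts complete cases; how -alpha- itself treats missing values is the claim made above, not something this code verifies:

          ```stata
          clear
          input float(q1 q2 q3)
          1 2 .
          2 . 3
          . 4 4
          4 5 .
          5 . 5
          . 1 2
          3 3 .
          end

          * Number of missing items per observation
          egen int mcount = rowmiss(q1 q2 q3)
          tabulate mcount

          * Observations with complete data on all three items
          count if mcount == 0

          * Observations with complete data on a two-item subset
          count if !missing(q1, q2)

          * Per-item missing counts, to see which items cost the most observations
          misstable summarize q1 q2 q3
          ```

          On the real data, the same -count- and -misstable summarize- commands with the full variable list show how many complete observations exist and which items account for most of the loss.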

          What to do? The ideal is to get more, and better, data. Usually that's not feasible, or you would have done that in the first place. But, in this case, I think you need a different plan. You are trying to calculate alpha for a scale with 48 variables. That's really not a good idea. Imagine you had complete data on all your observations. Let c be the average of all the correlation coefficients between pairs of the 48 variables. Then alpha = k*c/[1+(k-1)*c], where k is the number of variables. Now, if you dust off your calculus, it is not hard to see that, holding c fixed and c != 0, alpha --> 1 as k --> infinity. At k = 48, you are not all that far from infinity. If c is a measly 0.1, alpha will be 0.84, which looks impressive. If c is 0.05, a suggestion that these variables have almost nothing to do with each other, alpha comes out 0.72, which is still impressive. In other words, reliability in the Cronbach sense is a function both of the intercorrelation of the variables and the number of them. In a scale with 48 items, things will look extremely reliable even for a scale that has almost no internal coherence.
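
          The arithmetic can be checked with a short loop that evaluates the formula above, alpha = k*c/[1+(k-1)*c], for a few values. The values k = 5 and k = 10 are added here only for comparison and are not from the discussion above:

          ```stata
          * Alpha as a function of the average inter-item correlation c
          * and the number of items k
          foreach c of numlist 0.05 0.1 {
              foreach k of numlist 5 10 48 {
                  display "c = " %4.2f (`c') "   k = " %2.0f (`k') ///
                      "   alpha = " %5.3f (`k'*`c'/(1 + (`k' - 1)*`c'))
              }
          }
          ```

          The k = 48 rows reproduce the 0.84 and 0.72 quoted above; the shorter scales show how much of that apparent reliability comes from the item count alone. The /// line continuation works in a do-file or the do-file editor, not in the Command window.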

          Moreover, looking at the names of your variables, if they are reasonably descriptive of what the variables represent, this looks like a scale with perhaps 10 subscales. I think I would just abandon Cronbach's alpha and proceed to factor analysis for a better view of what is going on here.
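
          For orientation, a minimal sketch of an exploratory factor analysis on simulated data follows. All of it is an assumption made for illustration: the items a1-a3 and b1-b3, the two underlying factors, the seed, the choice of principal factoring, and the promax rotation. None of these choices is a recommendation for the actual scale:

          ```stata
          clear
          set seed 12345
          set obs 200

          * Two hypothetical latent traits, each measured by three noisy items
          generate f1 = rnormal()
          generate f2 = rnormal()
          forvalues i = 1/3 {
              generate a`i' = f1 + rnormal()
              generate b`i' = f2 + rnormal()
          }

          * Principal-factor extraction, retaining two factors
          factor a1 a2 a3 b1 b2 b3, pf factors(2)

          * Oblique rotation, since subscales of one instrument may be correlated
          rotate, promax
          ```

          Keep in mind that -factor- also drops observations with any missing item, so with heavily scattered missing values it may be necessary to analyze fewer items at a time.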

          • #6
            That makes perfect sense. Thank you so much for your detailed response. I appreciate your help and will proceed to factor analysis.
            Best
            Meesha
