

  • Principal Component Analysis (Creating an Index using Multiple Scores)

    Hello, everyone. I am computing an index using Principal Component Analysis. After running the pca command in Stata, I found that the first two components have eigenvalues greater than one, which I take to mean that they explain most of the variation. I have generated the two scores using predict. This brings me to my question: how do I use the first two components to create the index? Should I take an average of the scores? Note that the first component explains 50% of the variance and the second almost 30%, so a simple average could be the wrong approach, if I am not mistaken. Can you guide me to the right approach? Thanks in advance.
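    For readers following along without the poster's data, here is a minimal sketch of the quantities described above: the eigenvalues of the correlation matrix, the share of variance each component explains, and the "eigenvalue greater than one" count. It uses Python with NumPy rather than Stata, and the data are synthetic (five made-up indicators driven by two hypothetical latent factors), so the names `pca_eigen`, `make_indicators`, `share` and `retained` are illustrative assumptions, not anything from the original analysis.

```python
import numpy as np


def pca_eigen(X):
    """Eigenvalues and eigenvectors of the correlation matrix of X.

    Rows of X are observations, columns are variables.
    Results are sorted from largest to smallest eigenvalue.
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def make_indicators(rng, n):
    """Synthetic data (assumption): 3 indicators of factor 1, 2 of factor 2."""
    f1 = rng.standard_normal(n)
    f2 = rng.standard_normal(n)
    cols = [f1, f1, f1, f2, f2]
    return np.column_stack([c + 0.5 * rng.standard_normal(n) for c in cols])


rng = np.random.default_rng(0)
X = make_indicators(rng, n=500)

eigvals, eigvecs = pca_eigen(X)
share = eigvals / eigvals.sum()       # proportion of variance per component
retained = int((eigvals > 1).sum())   # count of eigenvalues above one

for k, (ev, s) in enumerate(zip(eigvals, share), start=1):
    print(f"PC{k}: eigenvalue = {ev:.3f}, share = {s:.1%}")
print("components with eigenvalue > 1:", retained)
```

    Because the analysis is on the correlation matrix, the eigenvalues sum to the number of variables, so an eigenvalue above one only means that a component carries more variance than a single standardized variable would.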

  • #2
    PC1 is the best single summary of the data on the criteria used in PCA. You won't improve on it by mushing together two or more components. The point is that PC1 is already a weighted mean of variables, so it summarizes the interdependence of all the variables it looks at. Also, you're proposing to average two variables that are by definition uncorrelated...
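    The two points above can be checked numerically. The sketch below is in Python with NumPy rather than Stata and runs on synthetic data (five made-up indicators, an assumption for illustration only). It builds PC1 and PC2 as weighted sums of the standardized variables, confirms that the two scores are uncorrelated, and compares the variance of PC1 with that of an equal-weight blend of the two components, rescaled so that its weight vector has unit length like PC1's.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic indicators (assumption): three load on f1, two on f2.
f1 = rng.standard_normal(n)
f2 = rng.standard_normal(n)
X = np.column_stack(
    [f + 0.5 * rng.standard_normal(n) for f in (f1, f1, f1, f2, f2)]
)

# Standardize, then decompose the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each score is a weighted sum of the standardized variables.
w1, w2 = eigvecs[:, 0], eigvecs[:, 1]
pc1 = Z @ w1
pc2 = Z @ w2

# The scores are uncorrelated by construction.
corr_12 = np.corrcoef(pc1, pc2)[0, 1]

# Equal-weight blend of the two components, with unit-length weights.
w_mix = (w1 + w2) / np.sqrt(2)
mix = Z @ w_mix

var_pc1 = pc1.var(ddof=1)
var_mix = mix.var(ddof=1)

print(f"corr(PC1, PC2)     = {corr_12:.2e}")
print(f"variance of PC1    = {var_pc1:.3f}")
print(f"variance of blend  = {var_mix:.3f}")
```

    On the variance criterion that PCA uses, the blend captures the average of the first two eigenvalues, which cannot exceed the first eigenvalue, so it is never a better single summary than PC1 alone.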

    Not the same issue, but I find PCA massively oversold in some circles, or at least the subject of unrealistic expectations about what it can do. What can make a lot more sense is to use PCA as a way to look at correlation structure and then make an informed decision on which of several variables to use in the next step of an analysis. Also, while throwing lots of predictors into a regression model willy-nilly is often scorned as mindless and unthinking, throwing lots of predictors into a PCA and expecting it to make choices for you is not much more mindful.

    The thinking "I have several measures that are all measures of <whatever>, so I shouldn't use them all" is often good, but there are several ways to make that selection. Also, they can turn out to be less highly correlated than you imagine.


    • #3
      Thanks, Nick. Could you suggest any alternative methods of selection? Would you advise choosing the weights of the variables and then taking the average on the basis of reasoning? That would be subjective, but it would be based on (economic) reasoning.


      • #4
        I think you've answered your own question. Economic reasoning is what should guide selection of variables, together with what works. That's almost empty advice, but I don't think I can do better.

