Hi all,
I was hoping to get some input on the appropriateness of principal components analysis in an analysis of experimental data.
I recently came across an analysis that uses PCA to build a "composite" variable out of several dependent variables that measure related sets of evaluations, such as trust or specific types of beliefs. The data are from an experiment in which the authors test different messages' effects on these attitudes. The goal is to determine which messages are most effective at moving respondents' opinions in the desired direction across a range of related evaluations.
In my understanding, PCA is used as a data reduction tool when you have a number of measures that are likely to be highly correlated. PCA enables you to build a more parsimonious model by identifying patterns in these correlated variables and then building new "composite" measures (the components), each a weighted combination of the original variables, with the first few components capturing most of the variation. This also addresses multicollinearity concerns because the components are uncorrelated/orthogonal by construction.
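To make sure I have the mechanics right, here is a minimal sketch of what I understand PCA to be doing. Everything in it is an assumption for illustration: the data are simulated, the four "items" are hypothetical attitude measures, and I compute the components with a plain SVD in NumPy rather than whatever software the original analysis used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: four attitude items driven by one shared latent attitude.
latent = rng.normal(size=n)
items = np.column_stack(
    [latent + rng.normal(scale=0.7, size=n) for _ in range(4)]
)

# Standardize each item so PCA works on the correlation structure.
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)

# PCA via SVD: rows of vt are the component loadings.
u, s, vt = np.linalg.svd(z, full_matrices=False)
scores = z @ vt.T                      # component scores for each respondent
explained = s**2 / np.sum(s**2)        # share of total variance per component

# Correlation matrix of the component scores (off-diagonals should be ~0).
score_corr = np.corrcoef(scores, rowvar=False)

print("variance explained:", np.round(explained, 3))
print("score correlations:\n", np.round(score_corr, 3))
```

If this is right, the first component soaks up most of the shared variation and the component scores are mutually uncorrelated, which is the property that helps with multicollinearity on the predictor side.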
So, I completely understand the value in using PCA to build independent variables, particularly when over-specification or multicollinearity is an issue. HOWEVER, I am skeptical of using it to build dependent variables, especially in an experiment where the whole point of the analysis is to test how different conditions affect these various evaluations differently (or not at all). Aren't you just throwing useful data away by using PCA to form dependent variables? Isn't uncovering and understanding that underlying variation between your dependent variables across different conditions the whole point? And, if you were to build a composite variable, would an additive or averaged variable be better? When I do see composite dependent variables in analyses of experimental results, I feel as though they are more often additive or averaged; I can't recall ever seeing PCA used in this way.
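To illustrate my concern, here is a rough sketch on simulated data. It is not the analysis I came across: the treatment indicator, the four items, and the assumption that the message moves only the first item are all hypothetical, and the "effect" is just a raw difference in means. It compares a first-principal-component composite with a simple average of the standardized items, then looks at the items one at a time.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical experiment: random assignment to a message (1) or control (0).
treated = rng.integers(0, 2, size=n)

# Four correlated attitude items sharing one latent attitude.
latent = rng.normal(size=n)
items = np.column_stack(
    [latent + rng.normal(scale=0.7, size=n) for _ in range(4)]
)
items[:, 0] += 0.5 * treated  # assumption: the message moves only item 0

# Standardize items, then build the two composites.
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
mean_composite = z.mean(axis=1)

_, _, vt = np.linalg.svd(z, full_matrices=False)
pc1 = z @ vt[0]
# The sign of a principal component is arbitrary; align it with the average.
if np.corrcoef(pc1, mean_composite)[0, 1] < 0:
    pc1 = -pc1

def effect(y):
    """Difference in means between treated and control respondents."""
    return y[treated == 1].mean() - y[treated == 0].mean()

composite_corr = np.corrcoef(pc1, mean_composite)[0, 1]
item_effects = [effect(z[:, j]) for j in range(4)]

print("corr(PC1, average):", round(composite_corr, 3))
print("effect on PC1 (standardized):", round(effect(pc1 / pc1.std(ddof=1)), 3))
print("effect on average (standardized):",
      round(effect(mean_composite / mean_composite.std(ddof=1)), 3))
print("effects on individual items:", np.round(item_effects, 3))
```

In a setup like this, where the items are about equally correlated, I would expect the two composites to be nearly interchangeable, while either one blurs the fact that only one item actually responded to the message. That item-level difference is exactly the information I worry about losing.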
Am I missing something? Is there value in using PCA to build dependent variables in this case? What would you say is the preferable way to build a composite dependent variable measuring related attitudes in an experiment?
Any thoughts or feedback anyone might have would be very appreciated.
Thank you!!!