I am using panel data and plan to estimate a fixed-effects model, but some of my independent variables are strongly correlated. My equation is specified as follows:
lnEXPijt = β0 + β1 lnGDPit + β2 lnGDPjt + β3 lnPOPit + β4 lnPOPjt +
β5 lnDISTij + β6 lnTELit + β7 lnTELjt + β8 lnMOBit + β9 lnMOBjt +
β10 lnBROit + β11 lnBROjt + β12 lnINTit + β13 lnINTjt + β14 BORij + β15 Langij + β16 RTAij + εijt
The ICT variables are strongly collinear, so I plan to use PCA to reduce the multicollinearity. My question is: should I include all of my independent variables when I run the PCA, or only the correlated ones (i.e. the ICT variables), and then predict an index to replace the correlated variables before running the regression?
In other words, can I do the following?
1. Run a PCA on TELit, MOBit, BROit and INTit, then predict PC1 (assuming the results suggest retaining only the first component).
2. Run another PCA on TELjt, MOBjt, BROjt and INTjt, then predict the score.
3. Use the two predicted scores/indexes in place of the correlated ICT variables, then run the fixed-effects regression.
Also, should I transform the scores/indexes into log form to be consistent with the other control variables in my model?
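To make the proposed procedure concrete, here is a minimal sketch of the two separate PCAs in Python, using only the standard library. It is an illustration under stated assumptions, not a recommended implementation: the data are simulated, the helper names `standardize`, `first_pc_scores` and `simulate_ict_block` are hypothetical, the PCA is run on the correlation matrix of variables that are already in logs, and the leading eigenvector is found by power iteration rather than by a statistics package's PCA routine.

```python
import math
import random
import statistics


def standardize(column):
    """Return z-scores (mean 0, sample standard deviation 1) for one variable."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def first_pc_scores(columns, iterations=500):
    """Return (scores, loadings) for the first principal component.

    `columns` is a list of equal-length lists, one per variable. The PCA is
    run on the correlation matrix, and the leading eigenvector is found by
    power iteration (an assumption made to avoid external dependencies).
    """
    z = [standardize(col) for col in columns]
    n, k = len(z[0]), len(z)
    # Correlation matrix of the standardized variables.
    corr = [[sum(a * b for a, b in zip(z[p], z[q])) / (n - 1) for q in range(k)]
            for p in range(k)]
    # Power iteration: multiply by the matrix and renormalize until stable.
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[p][q] * w[q] for q in range(k)) for p in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    # Each observation's score is the weighted sum of its standardized values.
    scores = [sum(w[p] * z[p][obs] for p in range(k)) for obs in range(n)]
    return scores, w


random.seed(0)
n_obs = 200


def simulate_ict_block():
    """Four correlated log-scale indicators driven by one latent factor (made-up data)."""
    latent = [random.gauss(0, 1) for _ in range(n_obs)]
    return [[2.0 + f + random.gauss(0, 0.3) for f in latent] for _ in range(4)]


# Stand-ins for lnTELit, lnMOBit, lnBROit, lnINTit and their jt counterparts.
ict_exporter = simulate_ict_block()
ict_importer = simulate_ict_block()

# One PCA per group, as in steps 1 and 2 above.
ict_index_i, loadings_i = first_pc_scores(ict_exporter)
ict_index_j, loadings_j = first_pc_scores(ict_importer)

print("exporter loadings:", [round(x, 3) for x in loadings_i])
print("mean of exporter index:", round(statistics.fmean(ict_index_i), 6))
print("negative scores:", sum(s < 0 for s in ict_index_i), "of", n_obs)
```

On the log question, the sketch shows why logging the predicted scores is problematic: component scores are centered at zero, so some of them are negative and the log is undefined for those observations. Running the PCA on the already-logged ICT variables, as assumed here, is one way to keep the index on a scale comparable to the other logged regressors.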
Thank you in advance!