Dear All,
I have a panel dataset in which the dependent variable (Y) is continuous and all my explanatory variables are binary, taking the value 1 or 0. Following a specialized literature, I used the machine-learning-based least absolute shrinkage and selection operator (LASSO) to identify the set of dummy explanatory variables that have a non-negligible impact on Y. However, the set of chosen dummy variables is still large, at around 34. Because of high multicollinearity among the dummy variables and the risk of overfitting, it is inadvisable to include all 34 variables additively in the model.
Given that, I want to use Principal Component Analysis (PCA) to combine these multiple factors (dummies) into a single factor. Which PCA-type method works best for a dataset with binary variables? There are various alternatives for combining multiple variables into a single factor, such as pca, polychoricpca, tetrachoric, multiple correspondence analysis (mca), and factor analysis (factor). I am not sure which method is most suitable for my data.
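To make concrete what I mean by "combining the dummies into a single factor", here is a minimal sketch in Python (not Stata), using only the standard library. It applies ordinary PCA to the raw 0/1 variables and keeps the first component score. The data are simulated, and the function name `first_principal_component` and the five-dummy setup are hypothetical, for illustration only. The sketch does not settle whether plain PCA, a tetrachoric-based approach, or MCA is the right choice for binary data, which is exactly my question.

```python
import random

def first_principal_component(rows, iters=500):
    """Return (loadings, scores) of the first principal component.

    rows: list of observations, each a list of 0/1 values.
    Uses power iteration on the sample covariance matrix.
    """
    n, p = len(rows), len(rows[0])
    # Center each dummy at its sample mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Component score = projection of each centered observation on v.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores

# Simulated (made-up) data: 200 observations, 5 dummies that share one
# latent driver, mimicking correlated binary regressors.
random.seed(0)
data = []
for _ in range(200):
    z = random.gauss(0, 1)
    data.append([1 if z + random.gauss(0, 1) > 0 else 0 for _ in range(5)])

loadings, scores = first_principal_component(data)
```

Here `scores` would be the single combined variable entering the regression in place of the individual dummies, and `loadings` shows how much each dummy contributes to it.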
I shall be thankful for any suggestions or recommendations.
Thanks and regards,
(Ridwan)