Dear Stata Users,
I have 19 variables that ask individuals whether or not they experienced a specific event (0 = no, 1 = yes). Examples of events include divorce, death of a loved one, injury...
I am using these 19 variables as predictors of a health outcome. I want to reduce the number of predictors. In the past I have used factor analysis to identify latent variables, which I then used as predictors instead of the 19 separate variables.
I realize that standard factor analysis is not recommended for binary variables. In my searches I came across documentation from UCLA about polychoric correlations. I also read the Stata help file after installing the polychoric package by Stas Kolenikov.
Based on this brief description of my data, do you think that using polychoric correlations is appropriate? Would it make sense to derive latent variables from these correlations, each grouping predictors that measure similar life events? For example, one factor could be family life events (divorce, death, serious illness of a family member). Another could be resources (loss of a job, lack of transportation, loss of health insurance). FYI: my actual life events are less intuitive to categorize into latent variables than the examples I provided.
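For concreteness, here is a rough sketch of the workflow I have in mind, as I understand it from the UCLA page and the polychoric help file. The variable names event1-event19 are placeholders for my 19 items, and the number of factors and the rotation are arbitrary choices for illustration, not something I have settled on.

```stata
* Placeholder names: event1-event19 stand for the 19 binary (0/1) life-event items.
* Requires the user-written polychoric package by Stas Kolenikov.
polychoric event1-event19

* Keep the estimated correlation matrix and the number of observations used.
matrix R = r(R)
local N = r(sum_w)

* Factor-analyze the correlation matrix instead of the raw 0/1 items.
* factors(2) is only a placeholder for the number of factors to retain.
factormat R, n(`N') factors(2)

* Oblique rotation, assuming the latent factors may be correlated.
rotate, promax
```

If this is the wrong way to feed the polychoric matrix into a factor analysis, please let me know.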
Any other suggestions are welcome.
Thank you for your time,
Patrick