Hello Statalisters,
I am using Stata/IC 15.0 for Mac (64-bit Intel). I have time-series data from an emotions survey that is embedded in a hypermedia science tutoring and assessment platform. The emotion items are Likert-type, with values ranging from 1 to 5.
In the long term, I want to explore measurement invariance across time points. As I understand it, the first step is to run an EFA at each time point to see how many factors may exist. Here is the code I am using for the 6 time points:
foreach i of num 1/6 {
    factor P_enjoyment`i' P_hopeful`i' P_proud`i' P_surprised`i' P_curious`i' N_frustrated`i' N_anxious`i' N_ashamed`i' N_hopeless`i' N_bored`i' N_confused`i' N_sad`i', ml blanks(.3)
    screeplot
    graph export xscree`i'.png, replace
    rotate, oblimin blanks(.3)
}
A Heywood case is encountered at every time point except 5. This seems to be associated with the eigenvalues not descending in order. For example, in the output below, the eigenvalue for Factor1 is smaller than the one for Factor2:
Factor analysis/correlation                      Number of obs    =        189
    Method: maximum likelihood                   Retained factors =          7
    Rotation: (unrotated)                        Number of params =         63
                                                 Schwarz's BIC    =    331.613
    Log likelihood = -.6915937                   (Akaike's) AIC   =    127.383

    Beware: solution is a Heywood case
            (i.e., invalid or boundary values of uniqueness)

    --------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+------------------------------------------------------------
        Factor1  |      1.92857     -1.93710            0.2316       0.2316
        Factor2  |      3.86566      2.71819            0.4643       0.6960
        Factor3  |      1.14748      0.61079            0.1378       0.8338
        Factor4  |      0.53669      0.20229            0.0645       0.8982
        Factor5  |      0.33440      0.02817            0.0402       0.9384
        Factor6  |      0.30622      0.09969            0.0368       0.9752
        Factor7  |      0.20654            .            0.0248       1.0000
    --------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(66) = 1046.63 Prob>chi2 = 0.0000
    LR test:   7 factors vs. saturated:  chi2(3)  =    1.31 Prob>chi2 = 0.7259
    (tests formally not valid because a Heywood case was encountered)
My understanding is that Heywood cases are related to a lack of variance in the data. However, I am not sure how to solve this issue. Is it that the responses do not have sufficient variation for the EFA to be estimated accurately? Is EFA the wrong choice for Likert data? Should I be using polychoric correlations in the EFA instead?
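For reference, here is a minimal sketch of how the uniqueness estimates behind the Heywood warning could be inspected for a single time point. It assumes the variable names used in the loop above and picks time point 1 purely for illustration (the output shown above may come from a different time point). It relies on the e(Psi) and e(Ev) matrices that factor leaves behind, and the /// line continuation means it has to be run from a do-file rather than typed interactively.

```stata
* Sketch only: refit one time point and list the stored results.
* The choice of time point 1 is an assumption for illustration.
local i 1
local items P_enjoyment P_hopeful P_proud P_surprised P_curious ///
    N_frustrated N_anxious N_ashamed N_hopeless N_bored N_confused N_sad

* Build the variable list for the chosen time point
local vars
foreach v of local items {
    local vars `vars' `v'`i'
}

factor `vars', ml

* Uniquenesses: values at (or very near) zero are the boundary
* estimates that the Heywood warning refers to
matrix list e(Psi)

* Eigenvalues in the order reported in the factor table
matrix list e(Ev)
```

Items whose uniqueness sits at the boundary would be the ones to look at first.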
Thank you in advance for any advice the forum can offer.
Many thanks,
Jeanne