  • CFA Goodness of fit

    Dear Statalist respected users,

    I am trying to estimate a latent variable from 5 observed variables via CFA. The syntax was as follows:

    sem (AC -> totalassets, ) (AC -> lev, ) (AC -> FCF, ) (AC -> logbm, ) (AC -> industries, ), method(adf) latent(AC ) cov( e.FCF*e.lev e.logbm*e.FCF) nocapslatent

    The factor loadings are not very good:
    0.41
    0.20
    0.11
    0.70
    0.28

    The goodness-of-fit tests are excellent:

    chi2_ms(3) = 2.775
    P>chi2 = 0.428

    RMSEA 0.000
    CFI and TLI are 1.000 and 1.005

    SRMR = 0.014

    Can I consider my model as "good" and continue?

    I tried deleting the variables with small factor loadings, but this led to worse goodness-of-fit results.

    Your recommendation, please.
    Thanks a lot in advance.



  • #2
    What do you mean when you say the factor loadings are not very good? Not very good in what sense? By what criterion? They are what they are. Did you have some reason to expect different results--perhaps an earlier study?

    I wonder if you are thinking of the common practice in exploratory factor analysis where you are often looking to reduce the dimensionality of a data set and you tend to retain variables with loadings that exceed 0.4 (or some other such stipulated threshold). That does not apply here, and the word "loading" has different meanings in these two contexts. In exploratory factor analysis (e.g. the kind of thing you get from the -factor- command) that loading is the correlation between the factor and the variable. In particular, it is not sensitive to the scale of the variables.

    By contrast, in CFA, the "loading" is a regression coefficient, and its magnitude depends on the scale of the variables. So the application of some arbitrary threshold like 0.4 makes no sense, and could, in any case, be gamed by changing the units.
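To make the scale dependence concrete, here is a minimal sketch. It assumes Stata's sem_1fmm example dataset as a stand-in (it is not the original poster's data), and the variable name x2_rescaled is made up for illustration. The same one-factor model is fitted twice, the second time with one indicator divided by 100. The unstandardized loading should shrink by the same factor, whereas the standardized output requested with the standardized option should be unaffected by the change of units.

```stata
* Assumption: Stata's example dataset stands in for the poster's data
use https://www.stata-press.com/data/r15/sem_1fmm, clear

* One-factor CFA; by default the loading of the first indicator (x1) is fixed at 1
sem (X -> x1 x2 x3 x4)
scalar b_x2 = _b[x2:X]

* Same model after expressing x2 in units 100 times larger (values 100 times smaller)
generate x2_rescaled = x2 / 100
sem (X -> x1 x2_rescaled x3 x4)
scalar b_x2_rescaled = _b[x2_rescaled:X]

* The unstandardized loading depends on the units of the indicator
display "loading on x2:          " b_x2
display "loading on x2 / 100:    " b_x2_rescaled

* Replay the last model with standardized coefficients, which are unit-free
sem, standardized
```

Any cutoff applied to the unstandardized loadings would therefore pass or fail depending only on the units chosen for x2.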



    • #3
      Clyde Schechter
      Thanks a lot for your reply.

      Yes, as far as I understood from my reading, one way to assess the reliability of a latent variable is to retain only the variables with coefficients (factor loadings) greater than 0.40. But now I understand your point, which makes sense. Thanks a lot for your contribution.

      I have another question, if you allow me.

      Is it possible to predict and extract latent variables and then reuse them as observed variables in path analysis?
      For example, I have two latent variables: one works as an independent variable and the other is a mediator. I predicted them first using CFA, then I used them in the path analysis. Is this statistically correct? Or do I need to estimate the latent variables and test the mediation effect in one run?
      Thanks a lot in advance.

      Kind regards,
      Mohammed



      • #4
        It is possible to get estimated values of the latent variables and then use those in later analyses as if they were measured variables. See -help sem_predict- for details.

        But it is discouraged. The problem is that it, in effect, treats the values of the latent variables as if they were measured without error, which is far from the truth. It is generally better to simply extend the -sem- model to include the latent variables in additional structural equations.
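The two routes can be sketched side by side. This is only an illustration under stated assumptions: it uses Stata's sem_2fmm example dataset (latent variables Affective and Cognitive) in place of the poster's data, the structural path from Affective to Cognitive is invented for the example, and aff_hat and cog_hat are hypothetical variable names.

```stata
* Assumption: Stata's sem_2fmm example data stand in for the poster's variables
use https://www.stata-press.com/data/r15/sem_2fmm, clear

* Two-step route (possible, but discouraged): fit the measurement model,
* predict the latent variables, then treat the predictions as observed
sem (Affective -> a1 a2 a3 a4 a5) (Cognitive -> c1 c2 c3 c4 c5)
predict aff_hat, latent(Affective)
predict cog_hat, latent(Cognitive)
regress cog_hat aff_hat

* One-step route: measurement and structural parts in a single sem model,
* so the measurement error in the latent variables is accounted for
sem (Affective -> a1 a2 a3 a4 a5) (Cognitive -> c1 c2 c3 c4 c5) ///
    (Cognitive <- Affective)
```

The regression in the two-step route ignores the uncertainty in aff_hat and cog_hat, which is the concern raised above.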



        • #5
          Clyde Schechter
          Thanks a lot for your reply. Much appreciated.
          I tried to estimate the latent variables together with the structural equations in one run, but the model did not converge. I made sure that I have enough variance in each variable, but the model still doesn't converge.
          Is the following situation possible?
          - The latent variables, individually, pass all the goodness-of-fit tests, but when the latent variables are estimated together with the structural equations, the model doesn't converge.

          Thanks a lot in advance.

          Kind regards,
          Mohammed



          • #6
            SEM can be tricky due to identification issues. It is possible to use the estimates from the original measurement-only model as start values, which might aid convergence if the only problem was Stata getting confused about start values, e.g.

            Code:
            sem (AC -> totalassets lev FCF logbm industries), method(adf) latent(AC) cov( e.FCF*e.lev e.logbm*e.FCF) nocapslatent
            matrix b = e(b)
            sem (AC -> totalassets lev FCF logbm  industries) ///
            (AC -> `other_DVs'), ///
            method(adf) latent(AC) cov( e.FCF*e.lev e.logbm*e.FCF) nocapslatent from(b)
            The second line saves the coefficient vector for all estimated parameters. The from(b) option instructs Stata to start fitting the new model from the saved parameters. Where Stata can't find a start value for a parameter (e.g. one for a variable you just added), I forget exactly what its default is, but it will choose start values for those additional parameters on its own.

            As usual, it helps if you post your full code and any relevant output in code delimiters - they're easy to read and they can be cut and pasted directly into Stata. We might be able to roughly gauge if the model isn't identified. If you post the first or last, say, 20 repetitions of the iteration log, we can also tell roughly what's going on and maybe make a recommendation. Also, it will probably help you to read SEM intro 12; if you have an infinite iteration log that's repeatedly not concave and the log-likelihood isn't changing much, that's often a sign that the model is not identified.
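One way to get a short iteration log worth posting is to cap the number of iterations. The sketch below assumes Stata's sem_2fmm example dataset rather than the poster's data, and the cap of 20 is arbitrary. On a model that is not converging, the same option would stop the log after 20 iterations instead of letting it run on.

```stata
* Assumption: example data; substitute your own model and variables
use https://www.stata-press.com/data/r15/sem_2fmm, clear

* iterate(20) stops the optimizer after at most 20 iterations,
* so the iteration log stays short enough to paste into a post
sem (Affective -> a1 a2 a3 a4 a5) (Cognitive -> c1 c2 c3 c4 c5), iterate(20)

* e(converged) is 1 if the model converged within the cap, 0 otherwise
display "converged: " e(converged)
```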
            Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

            When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



            • #7
              Weiwen Ng
              Thanks a lot for your reply.
              I allowed some covariances and the model successfully converged. However, the GOF is still poor.
              I will apply your recommendation by using the estimated parameters of the latent variables as starting values.
              Thanks a lot for your cooperation. Much appreciated

              Mohammed



              • #8
                Weiwen Ng
                I am trying to apply the code you recommended, which asks Stata to use the previous estimates as starting values in another model, but it does not run. I always receive this error message:

                initial vector: extra parameter c1 found
                specify skip option if necessary



                This is the command syntax I ran:

                sem (AC -> ta, ) (AC -> lev, ) (AC -> FCF, ) (AC -> industries, ), method(adf) latent(AC ) cov( e.FCF*e.lev) nocapslatent ///
                matrix c=e(c) ///
                sem (CG -> BrdIn, ) (CG -> NCIn, ) (CG -> EDComp, ) (CG -> NEDCom, ) (CG -> FemaleNED, ) (CG -> Foreign, ) (CG -> wac_indfb, ) (AC -> ta, ) (AC -> lev, ) (AC -> FCF, ) (AC -> industries, ), covstruct(_lexogenous, diagonal) method(adf) latent(CG AC ) cov( e.NCIn*e.BrdIn e.EDComp*e.BrdIn e.NEDCom*e.EDComp e.Foreign*e.EDComp e.Foreign*e.FemaleNED e.FCF*e.lev) nocapslatent from(c)

                Is it possible to estimate the 2 latent variables AC and CG individually, save the estimated parameters, and use them later when I estimate my system of equations to test for the mediation effect of interest?

                Thank you



                • #9
                  What you describe should be possible. Inspecting your syntax, I can't see which parameter would be the extra one. Stata was merely telling you that it can't find a match for the estimated parameter; this could happen if you had, for example, an extra indicator for the variable AC. You don't seem to. I'd merely specify the -skip- option. That option just skips any parameters in the start matrices that aren't found in the current command.

                  You can save multiple matrices of start values like this:

                  Code:
                  sem (AC -> ta, ) (AC -> lev, ) (AC -> FCF, ) (AC -> industries, ), method(adf) latent(AC ) cov( e.FCF*e.lev) nocapslatent
                  matrix c=e(b)
                  sem (CG -> BrdIn, ) (CG -> NCIn, ) (CG -> EDComp, ) (CG -> NEDCom, ) (CG -> FemaleNED, ) (CG -> Foreign, ) (CG -> wac_indfb, ), method(adf) latent(CG) nocapslatent cov(e.NCIn*e.BrdIn e.EDComp*e.BrdIn e.NEDCom*e.EDComp e.Foreign*e.EDComp e.Foreign*e.FemaleNED)
                  matrix d = e(b)
                  sem (CG -> BrdIn, ) (CG -> NCIn, ) (CG -> EDComp, ) (CG -> NEDCom, ) (CG -> FemaleNED, ) (CG -> Foreign, ) (CG -> wac_indfb, ) (AC -> ta, ) (AC -> lev, ) (AC -> FCF, ) (AC -> industries, ), covstruct(_lexogenous, diagonal) method(adf) latent(CG AC ) cov( e.NCIn*e.BrdIn e.EDComp*e.BrdIn e.NEDCom*e.EDComp e.Foreign*e.EDComp e.Foreign*e.FemaleNED e.FCF*e.lev) nocapslatent from(c d) skip
                  Do note the recommendation to present code and results in the code delimiters (see my signature for how). It's much easier to read!


                  • #10
                    Thank you all, I have learned a lot from your posts.
                    Cheers, Hassen



                    • #11
                      Weiwen Ng
                      Thanks a lot for your contribution. Your recommendation makes sense but Stata is still showing the same error message!

                      I saved the parameters from the estimation of each latent variable, then I used the from option, and I received the same error message as before:

                      initial vector: extra parameter c1 found
                      specify skip option if necessary

                      Then I used the skip option, but Stata showed me a different error message stating that the skip option is not allowed.

                      Thanks a lot for your contribution again.
                      Much appreciated.



                      • #12
                        First, there was an error in my syntax. Skip is a sub-option of the from option. So, this syntax is correct:

                        Code:
                         
                         sem (CG -> BrdIn, ) (CG -> NCIn, ) (CG -> EDComp, ) (CG -> NEDCom, ) (CG -> FemaleNED, ) (CG -> Foreign, ) (CG -> wac_indfb, ) (AC -> ta, ) (AC -> lev, ) (AC -> FCF, ) (AC -> industries, ), covstruct(_lexogenous, diagonal) method(adf) latent(CG AC ) cov( e.NCIn*e.BrdIn e.EDComp*e.BrdIn e.NEDCom*e.EDComp e.Foreign*e.EDComp e.Foreign*e.FemaleNED e.FCF*e.lev) nocapslatent from(c d, skip)
                        Assuming you reported the exact syntax you used, I still can't figure out what the parameter c1 would refer to. If you like, you can list the matrices you saved, e.g.:

                        Code:
                        matrix list c
                        matrix list d
                        That will give you a potentially long list of coefficients with cryptic-looking names. For example (note the extra error covariance parameter, cov(e.appear1*e.appear2), in the second sem command):

                        Code:
                        use http://www.stata-press.com/data/r15/sem_hcfa1
                        sem (Phys -> phyab1 phyab2 phyab3 phyab4)
                        matrix phys = e(b)
                        sem (Appear -> appear1 appear2 appear3 appear4), cov(e.appear1*e.appear2)
                        mat appear = e(b)
                        
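                        * This call fails because matrix appear holds a covariance parameter that the joint model lacks;
                        * capture noisily shows the error but lets a do-file continue
                        capture noisily sem (Phys -> phyab1 phyab2 phyab3 phyab4) (Appear -> appear1 appear2 appear3 appear4), from(phys appear)
                        * initial vector: extra parameter /cov(e.appear1,e.appear2) found
                        * specify skip option if necessary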
                        
                        sem (Phys -> phyab1 phyab2 phyab3 phyab4) (Appear -> appear1 appear2 appear3 appear4), from(phys appear, skip)
                        The last command is the correct one. The second-last sem command doesn't run and produces an error message, but the parameter name in that message is informative, as you can see if you inspect the matrix involved:

                        Code:
                        mat list appear
                        
                        appear[1,14]
                                  appear1:       appear1:       appear2:       appear2:       appear3:
                                                                                                      
                                   Appear          _cons         Appear          _cons         Appear
                        y1              1           7.41      1.0491581              7      1.2595212
                        
                                  appear3:       appear4:       appear4:             /:             /:
                                                                                                      
                                    _cons         Appear          _cons  var(e.appe~1)  var(e.appe~2)
                        y1           7.17      1.0977995            7.4      2.7366053      3.7940695
                        
                                        /:             /:             /:             /:
                                                                         cov(e.appe~1,
                            var(e.appe~3)  var(e.appe~4)    var(Appear)     e.appear2)
                        y1      1.8153746      2.1791344      2.7171796      1.5290572
                        So, your list of parameters is going to be longer. You should be able to re-specify the sem command to skip the parameter (apologies for my earlier error; I had never used the skip option before!). If you need to inspect the matrices to find and drop the unnecessary parameter, this is how to do it. I suspect you may have run your program with an extra variable inserted somewhere, but I'm not certain (usually, a stray parameter would be prefixed by the equation name, or, if it's a covariance parameter, it's clearly marked as such). Hope this syntax works.
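For a long matrix, pulling out just the column names can be quicker than reading the full listing. The sketch below assumes Stata's sem_1fmm example dataset, and the matrix z is a made-up illustration. It also shows that a matrix typed in by hand gets default column names c1, c2, and so on. So a complaint about a parameter literally named c1 may be worth checking against the possibility that the matrix passed to from() is not a copy of e(b); this is offered as something to rule out, not as a diagnosis.

```stata
* Assumption: example data; the point is only how the column names look
use https://www.stata-press.com/data/r15/sem_1fmm, clear
sem (X -> x1 x2 x3 x4)
matrix b = e(b)

* Column names of a saved e(b) carry the equation and parameter names
local bnames : colfullnames b
display "`bnames'"

* A matrix created by hand gets default column names instead
matrix z = (1, 2)
local znames : colfullnames z
display "`znames'"
```

If the names shown for a start-value matrix do not look like sem parameter names, rebuilding the matrix with matrix name = e(b) right after the relevant sem command is a reasonable first step.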


                        • #13
                          Weiwen Ng
                          Thanks a lot indeed for the time and effort you put forward.
                          I am at a conference at the moment, I will try your recommendation and hopefully, it will work this time. Thanks a lot again.



                          • #14
                            Hello Statalist members:

                            A colleague and I are using Version 15 to run zinbcv regression models using the following syntax:

                            zinbcv init_count stable_dem1 stable_aut1 redem1 redem1_pautdur1 redem1_pdemdur1 growth prop_demsregion1 riots1, inflate(stable_dem1 stable_aut1 redem1 redem1_pautdur1 redem1_pdemdur1 growth prop_demsregion1 riots1) vuong nolog

                            When we execute the command, we receive the following error:

                            initial vector: extra parameter lnalpha:_cons found
                            specify skip option if necessary



                            Would appreciate any suggestions as to what this error means and how to address it.

                            Thank you!




                            • #15
                              John Sloan
                              Your post in #14 has no obvious connection to the topic of this thread. It is important to keep threads on topic because many people come to search the Forum looking for answers to questions that may have already been answered. If a person comes here searching for advice about goodness of fit for CFA, they will waste their time reading your post. If a person comes looking for advice about using -zinbcv-, they will not find it!

                              Also -zinbcv- is not part of official Stata. So in posting a question about it, it is helpful to explain what it is and where it comes from.

                              So please repost this as a new topic. Thank you.
