  • Problem with gsem: Recursive Bivariate/Multivariate Ordered Probit

    Hello,

    I want to run a recursive bivariate ordered probit, and I am having difficulty with gsem. I couldn't find any other Stata command that can do this. I chose gsem because I later need to run some specifications using a multivariate ordered probit, and I know gsem can be extended to more than two equations.

    I suspect the gsem result is not correct because its estimates are identical to those from two single-equation ordered probits. The estimated thresholds (Mu) from gsem and from the two single-equation ordered probits are also identical. I think gsem is ignoring the correlation between the two equations, so that the latent variable in the first equation is estimated independently of the latent variable in the second; or I may not be using gsem correctly. I looked carefully through Stata's gsem manual, but I couldn't find any similar examples or explanations.

    Please see the code below (z is a matrix of exogenous instruments):

    Code:
    #delimit ;
    * Single equations ;
    oprobit y1 $demo ln_income $location_time y2, vce(cl fips);
    oprobit y2 $demo ln_income $location_time $z, vce(cl fips);
    * Ordered probit in gsem with probit link ;
    gsem (y1 $demo ln_income $location_time y2)
         (y2 $demo ln_income $location_time $z),
         family(ordinal) link(probit) vce(cl fips);
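
A brief sketch of why the gsem estimates can coincide with the two separate oprobit fits. The notation (beta, gamma, delta, rho, kappa) is assumed here for illustration and does not come from the post. A recursive bivariate ordered probit is usually written with correlated errors; when the two equations share no latent variable, which amounts to fixing rho at 0, the log likelihood splits into two pieces that can be maximized separately.

```latex
\begin{aligned}
y_{1}^{*} &= x'\beta_{1} + \gamma\, y_{2} + \varepsilon_{1}, \\
y_{2}^{*} &= x'\beta_{2} + z'\delta + \varepsilon_{2}, \\
(\varepsilon_{1}, \varepsilon_{2}) &\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right),
\qquad y_{j} = k \iff \kappa_{j,k-1} < y_{j}^{*} \le \kappa_{j,k}, \\[4pt]
\rho = 0 \;\Rightarrow\;
\ell &= \sum_{i} \log \Pr(y_{1i} \mid x_{i}, y_{2i})
      + \sum_{i} \log \Pr(y_{2i} \mid x_{i}, z_{i}).
\end{aligned}
```

Each sum involves only its own equation's parameters, so joint maximization returns the same coefficients and thresholds as two separate ordered probits.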

    I also tried bioprobit, but I don't think it will correct for the endogeneity in my model. Please correct me if I am wrong. The estimated coefficient of y2 in bioprobit is five times larger than the estimated coefficient of y2 in gsem. Please see the code below:


    Code:
    #delimit  ;
     bioprobit (y1 $demo  ln_income  $location_time y2) (y2  $demo ln_income $location_time $z), vce(cl fips);
    I appreciate any thoughts on this problem.

    Thank you!

    Mona
    Last edited by Mona Ahmadiani; 13 Apr 2020, 10:06.

  • #2
    You can try something like the following for bivariate ordered-probit regression with repeated measurements, which you appear to have with the FIPS variable. (I guess that's what you mean by "correct for the endogeneity".)

    . version 16.1

    . clear *

    . set seed `=strreverse("1546455")'

    . quietly set obs 250

    . generate int pid = _n

    . generate double pid_u = rnormal()

    . quietly expand 10

    . bysort pid: generate byte tim = _n

    . drawnorm lat0 lat1, double corr(1 0.5 \ 0.5 1)

    . local cut_list

    . forvalues cut = 0.2(0.2)0.8 {
      2.     local cut_list `cut_list' `=invnormal(`cut')'
      3. }

    . forvalues i = 0/1 {
      2.     quietly {
      3.         replace lat`i' = pid_u + lat`i'
      4.         replace lat`i' = lat`i' / sqrt(2)
      5.     }
      6.     grologit lat`i', generate(man`i') probit cuts("`cut_list'")
      7. }

    . *
    . * Begin here
    . *
    . gsem (man0 <- i.tim M[pid]) (man1 <- i.tim M[pid]) ///
    >     (man0@1 <- F1) (man1@1 <- F2), family(ordinal) link(probit) ///
    >     variance(F1@1 F2@1) covariance(F1*F2) nocnsreport nodvheader nolog

    Generalized structural equation model           Number of obs     =      2,500
    Log likelihood = -7317.0567

    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    man0         |
             tim |
              2  |   .1214346   .1401239     0.87   0.386    -.1532032    .3960723
              3  |  -.1338207   .1411876    -0.95   0.343    -.4105433     .142902
              4  |  -.0639536   .1395183    -0.46   0.647    -.3374045    .2094973
              5  |  -.0699368    .141053    -0.50   0.620    -.3463957    .2065221
              6  |   .0481062   .1404658     0.34   0.732    -.2272018    .3234142
              7  |  -.0294386   .1408498    -0.21   0.834    -.3054991    .2466218
              8  |  -.1705645   .1401445    -1.22   0.224    -.4452427    .1041136
              9  |   .1174838   .1404209     0.84   0.403    -.1577362    .3927037
             10  |  -.0833363   .1410826    -0.59   0.555    -.3598531    .1931805
                 |
          M[pid] |          1  (constrained)
                 |
              F1 |          1  (constrained)
    -------------+----------------------------------------------------------------
    man1         |
             tim |
              2  |   .0184034    .140498     0.13   0.896    -.2569676    .2937744
              3  |  -.2100669   .1402554    -1.50   0.134    -.4849625    .0648286
              4  |  -.0773419   .1392505    -0.56   0.579    -.3502678    .1955841
              5  |  -.1171556    .140923    -0.83   0.406    -.3933596    .1590484
              6  |    .090175   .1404513     0.64   0.521    -.1851044    .3654545
              7  |  -.0722897   .1401778    -0.52   0.606    -.3470331    .2024536
              8  |  -.1500516   .1405072    -1.07   0.286    -.4254406    .1253374
              9  |  -.2475204   .1405449    -1.76   0.078    -.5229834    .0279425
             10  |  -.1668872   .1404469    -1.19   0.235    -.4421581    .1083837
                 |
          M[pid] |   1.029481   .0608931    16.91   0.000     .9101329    1.148829
                 |
              F2 |          1  (constrained)
    -------------+----------------------------------------------------------------
    /man0        |
            cut1 |  -.9942727   .1170162                      -1.22362   -.7649252
            cut2 |  -.2772879   .1159863                     -.5046168    -.049959
            cut3 |   .3080814   .1160383                      .0806505    .5355123
            cut4 |   .9660462   .1169811                      .7367674    1.195325
    -------------+----------------------------------------------------------------
    /man1        |
            cut1 |  -1.104132   .1180197                     -1.335447   -.8728178
            cut2 |  -.3698098   .1166205                     -.5983817   -.1412379
            cut3 |   .2405459   .1164729                      .0122632    .4688285
            cut4 |   .9129324   .1175183                      .6826008    1.143264
    -------------+----------------------------------------------------------------
      var(M[pid])|   .7687688    .095974                      .6019088    .9818854
          var(F1)|          1  (constrained)
          var(F2)|          1  (constrained)
    -------------+----------------------------------------------------------------
       cov(F1,F2)|    .303265   .0492673     6.16   0.000      .206703    .3998271
    ------------------------------------------------------------------------------

    . exit

    end of do-file


    Keep in mind that the regression coefficients, including the variance of the random effect and the covariance of the two latent factors, are liable to be off by a factor of sqrt(2) for reasons referred to in this post. For the multivariate case, you'll probably want to go with Laplace integration.
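
A sketch of where a factor of sqrt(2) can arise, assuming the gsem specification above: a probit link with unit error variance, plus F1 and F2 constrained to unit variance. The symbols u_j, e_j and lambda_j are introduced here for illustration only.

```latex
\begin{aligned}
y_{j}^{*} &= x'\beta_{j} + \lambda_{j} M + F_{j} + u_{j},
  \qquad u_{j} \sim N(0,1), \quad \operatorname{Var}(F_{j}) = 1, \\
e_{j} &\equiv F_{j} + u_{j}
  \;\Rightarrow\; \operatorname{Var}(e_{j}) = 2, \\
\beta_{j}^{\text{gsem}} &= \sqrt{2}\,\beta_{j}^{\text{unit}}, \\
\operatorname{Corr}(e_{1}, e_{2}) &= \frac{\operatorname{Cov}(F_{1}, F_{2})}{2}.
\end{aligned}
```

Here the superscript "unit" denotes the usual normalization Var(e_j) = 1. On this reading, coefficients and cutpoints are inflated by sqrt(2), and the cov(F1,F2) estimate of about 0.30 in the output would correspond to a residual correlation of roughly 0.15.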

    Also, I suspect that these models will become arduous to fit as you increase their complexity. You might want to just go with suest, which is about what you did in your gsem example.
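
A minimal sketch of the suest route, run on simulated data so that it is self-contained. All variable names and numeric values below are hypothetical stand-ins, not the poster's data. Note that suest combines the separately fitted equations and gives a joint (here cluster-robust) covariance matrix; it does not by itself model correlation between the two equations' errors.

```stata
* Hypothetical simulated data; none of these values come from the thread.
clear
set seed 2020
set obs 2000
generate int fips = ceil(_n / 40)      // 50 made-up clusters of 40 observations
generate double x = rnormal()          // stand-in for the exogenous covariates
generate double z = rnormal()          // stand-in for the excluded instrument

* y2 is generated first and then enters the y1 equation (recursive structure).
* The two error terms are independent draws in this illustration.
generate double y2star = 0.5*x + 0.8*z + rnormal()
generate byte y2 = 1 + (y2star > -0.5) + (y2star > 0.5)
generate double y1star = 0.5*x + 0.4*y2 + rnormal()
generate byte y1 = 1 + (y1star > 0.5) + (y1star > 1.5)

* Fit each equation separately, without vce(), and store the results
oprobit y1 x y2
estimates store eq1
oprobit y2 x z
estimates store eq2

* Combine the stored results; the clustered VCE is requested in suest
suest eq1 eq2, vce(cluster fips)
```

Requesting the clustered variance in suest rather than in each oprobit call matters because suest expects models fitted with the default VCE.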


    • #3
      Hi Joseph,

      Thank you very much for your response and detailed explanation. To clarify some points: my data are a single cross-section (just one year), and each individual is observed only once. FIPS identifies the state, so within each state we have multiple individuals. I assume I can still use the ordered probit with repeated measurements; please correct me if I am wrong.

      The endogenous variable here is y2, which is being identified by introducing an exogenous matrix of z in the second equation, which is also excluded from the first equation.

      I ran what you suggested, with one slight modification: I added man1 (y2) to the man0 (y1) equation. Please see below:

      Code:
      gsem (y1<- $demo  ln_income  y2  M[fips]) ///
             (y2<- $demo ln_income z  M[fips] )  ///
            (y1@1 <- F1) (y2@1 <- F2), ///
            ,variance(F1@1 F2@1) cov(F1*F2) family(ordinal) link(probit)
      This code gives an error saying that both variance and covariance are invalid. I think it is because F2 cannot be constrained when we include y2 in the first equation. Am I right?

      The following specification did work, but it has not converged yet.

      Code:
      gsem (y1<- $demo  ln_income  y2  M[fips] F1) ///
             (y2<- $demo ln_income z  M[fips] F2@1)  ///
            , var(F1@1 F2@1) cov(F1*F2) family(ordinal) link(probit)
      I have a few questions, and I would appreciate your help with them:

      1) Is there any way I can simplify this model to make it converge? The iteration log is not changing, which makes me think the model is not identified.
      2) Given the restrictions we imposed on the variances of F1 and F2, is cov(F1*F2) the correlation coefficient between the two equations? If I am estimating this correctly, cov(F1*F2) should be equal to the estimated rho in a bivariate ordered probit. Is that right?


      I appreciate your help,

      Mona

      Joseph Coveney


      Last edited by Mona Ahmadiani; 14 Apr 2020, 09:59.



      • #4
        I didn't see that the outcome variable in the second equation, y2, is used as a predictor in the first. I'm not familiar with such endogeneity correction (not used in my field); nevertheless, I wouldn't have considered this to be a bivariate ordered-probit regression model, but rather a more straightforward path analysis model, and so I would not have thought to introduce the two latent factors or their covariance.

        As for your questions: for #1, for the reasons just given, I suggest getting rid of F1, F2 and their covariance term. For #2, as I mentioned earlier, in a bivariate ordered-probit regression model the square root of the F1·F2 covariance term ought to be proportional to rho, but I would not expect the two to be equal.
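
A minimal sketch of what the simplified path model might look like once F1, F2 and their covariance are dropped, run on simulated data so that it is self-contained. Keeping the shared state-level intercept M[fips] from #3 is an assumption, and all names and values below are hypothetical.

```stata
* Hypothetical simulated data; none of these values come from the thread.
clear
set seed 2020
set obs 2000
generate int fips = ceil(_n / 40)      // 50 made-up clusters of 40 observations

* One random intercept per cluster: every member takes the first member's draw
generate double u = rnormal()
bysort fips: replace u = u[1]

generate double x = rnormal()          // stand-in for the exogenous covariates
generate double z = rnormal()          // stand-in for the excluded instrument

* Recursive structure: y2 is generated first and then enters the y1 equation
generate double y2star = 0.5*x + 0.8*z + 0.5*u + rnormal()
generate byte y2 = 1 + (y2star > -0.5) + (y2star > 0.5)
generate double y1star = 0.5*x + 0.4*y2 + 0.5*u + rnormal()
generate byte y1 = 1 + (y1star > 0.5) + (y1star > 1.5)

* Path model: no F1, F2 or cov(F1*F2); only the shared cluster-level intercept
gsem (y1 <- x y2 M[fips]) (y2 <- x z M[fips]), ///
    family(ordinal) link(probit) nolog
```

With M[fips] in both equations, gsem by default fixes its loading to 1 in one equation and estimates it in the other, so the only cross-equation dependence modeled here is at the cluster level.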



        • #5
          Joseph, thank you for your response and clarification.
