Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Fitting NL model with strictly positive dependent variable!

    I have estimated a model of the following form using NL:

    y=exp(xB)*(P^{gamma}) * (I^{theta}) + epsilon

    where y (expenditures) is strictly positive, x is a set of control vars, P is price, and I is income, epsilon~IID(0, sigma). My main problem is that the model generates negative predicted values, while gamma (price elasticity) and theta (income elasticity) have the right sign and magnitudes. I have also estimated this model using log-log transformation with OLS. Finally, I have tried Poisson and GLM [link(log) and family(gamma or Poisson)] but the sign of elasticities change and the estimated coefficients no longer make any sense.

    I really would like to estimate this model using MLE so that I could conduct likelihood ratio tests. In that case, I would specify the model properly as:

    1. y=exp(xB)*(P^{gamma}) * (I^{theta}) * epsilon, epsilon~IID(1, sigma) [Perhaps choosing a distribution for epsilon with strictly positive supports]

    OR

    2. y=exp(xB)*(P^{gamma}) * (I^{theta}) + epsilon, epsilon~IID-log-normal (0, sigma) [or some distribution for epsilon with strictly positive outcome and mean zero]

    I am new to MLE programming and have spent much time trying to implement (1) above. I would really appreciate any pointers or suggestions.

    Thank you!

    Cyrus

  • #2
    You can specify both an additive and a proportional error (residual) structure, and so you can have your cake and eat it, too. Google pharmacokinetics additive proportional error OR residual models and take a look at what pops up there in order to get some ideas how others have handled analogous situations in fitting nonlinear models of nonnegative or strictly positive responses.

    Comment


    • #3
      Dear Cyrus Ramezani,

      Can you please show us the results with Poisson regression? That would be my preferred approach and it is surprising that the results do not make sense.

      Best wishes,

      Joao

      Comment


      • #4
        Hi Joao

        Thank you for your reply.

        First I present the NL command results, where wrc is expenditure (all positive), to=price, capcinc=per capita income, prc=continuous positive and variables ending in jj all fractions (positive and negative). Just by way of background, I have read your work on "Log of Gravity" and related literature such as Manning and Mullahy on GGM.
        Thank you for your guidance.
        Cyrus

        Code:
        nl (wrc = (to^{to})*(capcinc^{capcinc})*(exp({prc}*prc+{spreadjj}*spreadjj+{vwerjj}*vwerjj \\
        +{vwstdjj}*vwstdjj))) if year_01> 1935, vce(robust)
        (obs = 81)
        (iterations removed)
        Iteration 9:  residual SS =  2.50e+11
        
        
        Nonlinear regression                                Number of obs =         81
                                                            R-squared     =     0.9816
                                                            Adj R-squared =     0.9801
                                                            Root MSE      =   57677.85
                                                            Res. dev.     =    1999.58
        
        ------------------------------------------------------------------------------
                     |               Robust
                 wrc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 /to |  -3.580236   .1469121   -24.37   0.000      -3.8729   -3.287572
            /capcinc |     .59312   .0320192    18.52   0.000     .5293346    .6569055
                /prc |   .0000428   3.40e-06    12.59   0.000      .000036    .0000495
           /spreadjj |  -8.173092   5.683937    -1.44   0.155    -19.49607    3.149892
             /vwerjj |   .1053026   .1753391     0.60   0.550    -.2439907    .4545959
            /vwstdjj |   1.042622   1.558082     0.67   0.505    -2.061236    4.146481
        ------------------------------------------------------------------------------
        Next please see the Poisson results. I am presenting results with "noconst" option. Adding an intercept still give similar results but shrinks "to" coefficient. I have also done this with ppml and glm, the coefficient have the wrong sign, but are similar to Poisson.

        Code:
        poisson wrc to capcinc prc spreadjj vwerjj vwstdjj if year_01> 1935, vce(robust) noconst
        note: you are responsible for interpretation of noncount dep. variable
        
        Iteration 5:   log pseudolikelihood = -2879818.2  
        Iteration 6:   log pseudolikelihood = -2879818.2  
        
        Poisson regression                              Number of obs     =         81
                                                        Wald chi2(6)      =   98459.29
        Log pseudolikelihood = -2879818.2               Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
                     |               Robust
                 wrc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                  to |   71.42002   1.178627    60.60   0.000     69.10995    73.73008
             capcinc |  -.0000444   7.27e-06    -6.10   0.000    -.0000587   -.0000301
                 prc |   .0000391   9.72e-06     4.02   0.000       .00002    .0000581
            spreadjj |  -36.67065   10.49415    -3.49   0.000    -57.23881   -16.10249
              vwerjj |   -.410013   .4169201    -0.98   0.325    -1.227161    .4071354
             vwstdjj |   6.674432   3.426492     1.95   0.051    -.0413695    13.39023
        ------------------------------------------------------------------------------
        Last edited by Cyrus Ramezani; 07 Dec 2021, 12:59.

        Comment


        • #5
          Dear Cyrus Ramezani,

          Thanks for this. I believe you have numerical issues here. If your data were not strictly positive, I would say that you have perfect predictors, but with strictly positive data the problem must be something else. Please try estimating the model by Poisson regression using the user-contributed commands ppml and ppmlhdfe; also, please show us descriptive statistics of your dependent variable. Hopefully that will shed some light on the problem.

          Best wishes,

          Joao

          Comment


          • #6
            Dear Joao

            Yes, of course: First the descriptive stats:

            Code:
             
            mean sd skewness kurtosis min max
            wrc 302108.15 269620.80 0.94 2.45 3026.18 866227.12
            to 0.17 0.03 -0.18 1.60 0.12 0.21
            capcinc 18799.08 19367.32 0.98 2.85 605.00 71261.00
            prc 13463.70 12354.20 0.76 2.07 95.02 39068.38
            spreadjj 0.01 0.01 1.34 4.18 0.00 0.03
            vwerjj 0.08 0.16 -0.38 3.19 -0.35 0.45
            vwstdjj 0.04 0.02 1.74 7.72 0.02 0.13
            Next I present the PPML resutls (sorry for the image file. could not get things line up right): This is nearly identical to Poisson regression posted above.

            Finally the ppmlhdfe results:

            ======================= I have done some more looking around and still cannot figure it out. Looked into pharmacokinetics additive proportional error and there seem to be a possible solution using menl. Thank you!
            Attached Files
            Last edited by Cyrus Ramezani; 07 Dec 2021, 21:39.

            Comment


            • #7
              Originally posted by Cyrus Ramezani View Post
              Looked into pharmacokinetics additive proportional error and there seem to be a possible solution using menl. Thank you!
              Yeah, if you want to be able to do likelihood-ratio tests, then -menl- sans random effects terms is one way to go, but you can also code the model up fairly easily in -ml-.

              One question that I have from your original post, "My main problem is that the model generates negative predicted values . . ." How is that possible? With the shown ranges of to and capcinc, the model that you show in #4 using -nl- cannot give negative predictions for the fitted values. You did use -predict . . ., yhat- afterward, right?
              Last edited by Joseph Coveney; 07 Dec 2021, 23:46.

              Comment


              • #8
                Dear Joseph

                Thank you for follow up. I was mistaken when I said prediction is negative. I was looking at the residuals. My bad.

                Trying with menl now but cannot figure out how to include multiplicative error structure. Will likely go back and build MLE for the nonlinear form in my original post. I thought I could do likelihood ratio test with the "log-likelihood" output from "nl" but the residuals from the model were clearly not normally distributed (posted here). I have to figure out the proper distribution for y_i's in my model. I am likely back for guidance on coding ml.

                I really appreciate your pointers.

                Regards,

                Cyrus

                Comment


                • #9
                  Dear Cyrus Ramezani,

                  Thanks for posting those results. I believe that the issue you have is that all your variables are in very different scales and Stata really does not like that (hence the warnings in ppml). I suggest that you rescale your variables so that capcinc and prc are between 0 and 1, and your dependent variable is between 0 and 1000. Of course, you need an intercept if you rescale the dependent variable.

                  Best wishes,

                  Joao
                  Last edited by Joao Santos Silva; 08 Dec 2021, 03:00.

                  Comment


                  • #10
                    Originally posted by Cyrus Ramezani View Post
                    I thought I could do likelihood ratio test with the "log-likelihood" output from "nl" but the residuals from the model were clearly not normally distributed (posted here).
                    I don't want to put words into Clyde's mouth, but I think you might have been over-interpreting what he intended.

                    You can do likelihood-ratio testing of generalized linear models. Here's an example that doesn't have normally distributed residuals (and the least-squares analogue with -nl-, which has identical likelihood-ratio test results).

                    .ÿ
                    .ÿversionÿ17.0

                    .ÿ
                    .ÿclearÿ*

                    .ÿ
                    .ÿ//ÿseedem
                    .ÿsetÿseedÿ941453222

                    .ÿ
                    .ÿquietlyÿsetÿobsÿ250

                    .ÿ
                    .ÿgenerateÿdoubleÿpreÿ=ÿruniform()

                    .ÿ
                    .ÿgenerateÿdoubleÿoutÿ=ÿexp(rnormal())

                    .ÿ
                    .ÿglmÿoutÿc.pre,ÿfamily(gaussian)ÿlink(log)ÿnolog

                    GeneralizedÿlinearÿmodelsÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿÿÿ=ÿÿÿÿÿÿÿÿ250
                    Optimizationÿÿÿÿÿ:ÿMLÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿResidualÿdfÿÿÿÿÿ=ÿÿÿÿÿÿÿÿ248
                    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿScaleÿparameterÿ=ÿÿÿ4.913129
                    Devianceÿÿÿÿÿÿÿÿÿ=ÿÿ1218.456026ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ(1/df)ÿDevianceÿ=ÿÿÿ4.913129
                    Pearsonÿÿÿÿÿÿÿÿÿÿ=ÿÿ1218.456026ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ(1/df)ÿPearsonÿÿ=ÿÿÿ4.913129

                    Varianceÿfunction:ÿV(u)ÿ=ÿ1ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ[Gaussian]
                    Linkÿfunctionÿÿÿÿ:ÿg(u)ÿ=ÿln(u)ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ[Log]

                    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿAICÿÿÿÿÿÿÿÿÿÿÿÿÿ=ÿÿÿ4.437756
                    Logÿlikelihoodÿÿÿ=ÿ-552.7194915ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿBICÿÿÿÿÿÿÿÿÿÿÿÿÿ=ÿÿ-150.8663

                    ------------------------------------------------------------------------------
                    ÿÿÿÿÿÿÿÿÿÿÿÿÿ|ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿOIM
                    ÿÿÿÿÿÿÿÿÿoutÿ|ÿCoefficientÿÿstd.ÿerr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿconf.ÿinterval]
                    -------------+----------------------------------------------------------------
                    ÿÿÿÿÿÿÿÿÿpreÿ|ÿÿÿ-.184927ÿÿÿ.2792131ÿÿÿÿ-0.66ÿÿÿ0.508ÿÿÿÿ-.7321746ÿÿÿÿ.3623206
                    ÿÿÿÿÿÿÿ_consÿ|ÿÿÿ.6180901ÿÿÿ.1601571ÿÿÿÿÿ3.86ÿÿÿ0.000ÿÿÿÿÿ.3041879ÿÿÿÿ.9319924
                    ------------------------------------------------------------------------------

                    .ÿestimatesÿstoreÿGLM

                    .ÿ
                    .ÿpredictÿdoubleÿmu,ÿmu

                    .ÿgenerateÿdoubleÿresÿ=ÿoutÿ-ÿmu

                    .ÿ//ÿhistogramÿres
                    .ÿsummarizeÿres,ÿdetail

                    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿres
                    -------------------------------------------------------------
                    ÿÿÿÿÿÿPercentilesÿÿÿÿÿÿSmallest
                    ÿ1%ÿÿÿÿ-1.695699ÿÿÿÿÿÿ-1.749523
                    ÿ5%ÿÿÿÿ-1.529229ÿÿÿÿÿÿ-1.742811
                    10%ÿÿÿÿ-1.423718ÿÿÿÿÿÿ-1.695699ÿÿÿÿÿÿÿObsÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ250
                    25%ÿÿÿÿ-1.184809ÿÿÿÿÿÿ-1.638306ÿÿÿÿÿÿÿSumÿofÿwgt.ÿÿÿÿÿÿÿÿÿ250

                    50%ÿÿÿÿ-.7662971ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿMeanÿÿÿÿÿÿÿÿÿÿ-.0000526
                    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿLargestÿÿÿÿÿÿÿStd.ÿdev.ÿÿÿÿÿÿ2.212103
                    75%ÿÿÿÿÿ.2943225ÿÿÿÿÿÿÿ7.602373
                    90%ÿÿÿÿÿ1.980493ÿÿÿÿÿÿÿ8.217633ÿÿÿÿÿÿÿVarianceÿÿÿÿÿÿÿ4.893398
                    95%ÿÿÿÿÿ4.703822ÿÿÿÿÿÿÿ8.427274ÿÿÿÿÿÿÿSkewnessÿÿÿÿÿÿÿ3.915262
                    99%ÿÿÿÿÿ8.217633ÿÿÿÿÿÿÿ18.84187ÿÿÿÿÿÿÿKurtosisÿÿÿÿÿÿÿ25.99057

                    .ÿ
                    .ÿquietlyÿglmÿout,ÿfamily(gaussian)ÿlink(log)ÿnolog

                    .ÿlrtestÿGLM

                    Likelihood-ratioÿtest
                    Assumption:ÿ.ÿnestedÿwithinÿGLM

                    ÿLRÿchi2(1)ÿ=ÿÿÿ0.44
                    Probÿ>ÿchi2ÿ=ÿ0.5078

                    .ÿ
                    .ÿnlÿ(outÿ=ÿexp({x}ÿ*ÿpreÿ+ÿ{cons})),ÿnolog


                    ÿÿÿÿÿÿSourceÿ|ÿÿÿÿÿÿSSÿÿÿÿÿÿÿÿÿÿÿÿdfÿÿÿÿÿÿÿMS
                    -------------+----------------------------------ÿÿÿÿNumberÿofÿobsÿ=ÿÿÿÿÿÿÿÿ250
                    ÿÿÿÿÿÿÿModelÿ|ÿÿ713.57916ÿÿÿÿÿÿÿÿÿÿ2ÿÿ356.789582ÿÿÿÿR-squaredÿÿÿÿÿ=ÿÿÿÿÿ0.3693
                    ÿÿÿÿResidualÿ|ÿÿÿ1218.456ÿÿÿÿÿÿÿÿ248ÿÿ4.91312914ÿÿÿÿAdjÿR-squaredÿ=ÿÿÿÿÿ0.3643
                    -------------+----------------------------------ÿÿÿÿRootÿMSEÿÿÿÿÿÿ=ÿÿÿ2.216558
                    ÿÿÿÿÿÿÿTotalÿ|ÿÿ1932.0352ÿÿÿÿÿÿÿÿ250ÿÿ7.72814076ÿÿÿÿRes.ÿdev.ÿÿÿÿÿ=ÿÿÿ1105.439

                    ------------------------------------------------------------------------------
                    ÿÿÿÿÿÿÿÿÿoutÿ|ÿCoefficientÿÿStd.ÿerr.ÿÿÿÿÿÿtÿÿÿÿP>|t|ÿÿÿÿÿ[95%ÿconf.ÿinterval]
                    -------------+----------------------------------------------------------------
                    ÿÿÿÿÿÿÿÿÿÿ/xÿ|ÿÿ-.1849271ÿÿÿ.2822376ÿÿÿÿ-0.66ÿÿÿ0.513ÿÿÿÿ-.7408153ÿÿÿÿ.3709611
                    ÿÿÿÿÿÿÿ/consÿ|ÿÿÿ.6180901ÿÿÿ.1614281ÿÿÿÿÿ3.83ÿÿÿ0.000ÿÿÿÿÿ.3001453ÿÿÿÿÿ.936035
                    ------------------------------------------------------------------------------

                    .ÿestimatesÿstoreÿNL

                    .ÿ
                    .ÿpredictÿdoubleÿyhat,ÿyhat

                    .ÿ
                    .ÿquietlyÿnlÿ(outÿ=ÿexp({cons})),ÿnolog

                    .ÿlrtestÿNL

                    Likelihood-ratioÿtest
                    Assumption:ÿ.ÿnestedÿwithinÿNL

                    ÿLRÿchi2(1)ÿ=ÿÿÿ0.44
                    Probÿ>ÿchi2ÿ=ÿ0.5078

                    .ÿ
                    .ÿexit

                    endÿofÿdo-file


                    .


                    But it seems that your particular case is a well known econometric model, and so I'm guessing that you'll be progressing faster by following what Joao is suggesting.

                    Comment


                    • #11
                      Thank you both for your kind assistance. This has been a wonderful learning experience for me, coming nearly 30 years after my last Econometrics course. A great way to get back into empirical work after 12 years in admin. I will follow your recommendations and see this through.

                      Best wishes for a great holiday break.

                      Regards.

                      Comment

                      Working...
                      X