  • Meta-Frontier Analysis in Stata: Estimation

    Does anyone know how to estimate a Meta-Frontier Production Function in Stata? Any syntax ideas would help. The only work I have seen so far about Meta-frontier analysis has been in Shazam Software. Thanks.

  • #2
    You should have a look at Huang et al. (J Prod Anal 42:241–254, 2014), "A new approach to estimating the metafrontier production function based on a stochastic frontier framework", which criticizes the procedure of Battese et al. (J Prod Anal 21:91–103, 2004) and O’Donnell et al. (Empir Econ 34:231–255, 2008) of using linear (quadratic) programming techniques in the second step. Their suggested procedure is easily implemented in Stata. Given the output "y", three inputs "x1", "x2", and "x3" (all entering the code in logs as lny, lnx1, lnx2, lnx3), and environmental variables z1-z8, and assuming a translog functional form, first generate dummies for the groups reflecting different technology possibility sets, e.g., group1, group2, group3. The syntax for the first group is

    Code:
    frontier lny c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 c.lnx1#c.lnx2 c.lnx1#c.lnx3 ///
        c.lnx2#c.lnx3 if group1, distribution(tnormal) cm(z1-z4)
    predict te1 if group1, te
    predict xb1 if group1, xb

    This is Battese and Coelli's (1992) frontier model for panel data as implemented by Stata's -frontier- command. Repeat the same for the rest of the groups. In the second step, use the predicted values as the outcome for the metafrontier equation

    Code:
    gen pred= .
    replace pred= xb1 if group1
    replace pred= xb2 if group2
    replace pred= xb3 if group3
    
    
    frontier pred c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 c.lnx1#c.lnx2 c.lnx1#c.lnx3 c.lnx2#c.lnx3, distribution(tnormal) cm(z5-z8)
    predict te_meta, te

    Note that the environmental variables in the second stage differ from those in the first. Read Huang et al. for reasons why you need to include environmental variables.
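
    The two steps can also be written as a loop, in place of the two blocks above. This is only a sketch under the same assumptions (logged variables lny, lnx1-lnx3 and 0/1 dummies group1-group3); the names te_group and mte are hypothetical. In Huang et al.'s framework, the second-stage efficiency score estimates the technology gap ratio (TGR), and the metafrontier efficiency is the product of the TGR and the group efficiency.

    Code:
    * Step 1: one frontier per group; collect fitted values and group efficiencies
    gen double pred = .
    gen double te_group = .
    forvalues g = 1/3 {
        frontier lny c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 ///
            c.lnx1#c.lnx2 c.lnx1#c.lnx3 c.lnx2#c.lnx3 if group`g', ///
            distribution(tnormal) cm(z1-z4)
        predict double te`g' if group`g', te
        predict double xb`g' if group`g', xb
        replace te_group = te`g' if group`g'
        replace pred = xb`g' if group`g'
    }

    * Step 2: metafrontier fitted to the pooled first-stage predictions
    frontier pred c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 ///
        c.lnx1#c.lnx2 c.lnx1#c.lnx3 c.lnx2#c.lnx3, ///
        distribution(tnormal) cm(z5-z8)
    predict double te_meta, te

    * te_meta plays the role of the TGR; mte is the metafrontier efficiency
    gen double mte = te_meta * te_group

    The product holds observation by observation, so the sample mean of mte will generally differ from the product of the sample means of te_meta and te_group.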

    https://link.springer.com/article/10...123-014-0402-2
    Last edited by Andrew Musau; 19 May 2017, 08:24.



    • #3
      Thanks a lot, Andrew Musau. I will check the work by Huang et al., follow your guide, and see how it works out. Truly appreciated!
      Last edited by John Ngombe; 20 May 2017, 17:32.



      • #4
        Andrew Musau,

        I tried to replicate Huang's methodology using a primary database (a cross-section with 886 observations), but the estimated results were different than expected, with the technical efficiency in the first step smaller than the one estimated in the second stage (meta-frontier). Is it possible that the low R² of the first stage estimation (close to 0.3) might be generating this problem in the second stage efficiency (by way of the variable pred)? Thank you!



        • #5
          Pedro: If you followed the translog syntax in #2 exactly, note that it is not entirely correct because it neglects to multiply the squared terms by 0.5. If you did not, it is virtually impossible to advise, but here are a few things to check before estimation:

          1) Run OLS on your model and check that the residuals have the right skewness (left for a production function, right for cost)
          2) Check which functional form is preferred (translog or a reduction to Cobb Douglas)
          3) If cost function, test for cost function properties (monotonicity, concavity, etc.)
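
          Check 1) might be carried out along these lines. This is a sketch only, assuming the logged variables lny, lnx1-lnx3 and the dummy group1 from #2; ehat is a hypothetical name.

          Code:
          * OLS on the translog specification for one group
          regress lny c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 ///
              c.lnx1#c.lnx2 c.lnx1#c.lnx3 c.lnx2#c.lnx3 if group1
          predict double ehat if e(sample), residuals

          * skewness of the OLS residuals (negative is expected for a production function)
          summarize ehat, detail
          display "OLS residual skewness: " r(skewness)

          * skewness and kurtosis tests for normality of the residuals
          sktest ehat
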

          Once you are OK with these, then you know that your results will make sense. Otherwise, you may just have data problems. The correct syntax for estimating a translog function involves a few more steps which I present below. Given the model



          $$
          \begin{aligned}
          \ln y_{i} ={}& a_{0} + a_{1}\ln x_{1i} + a_{2}\ln x_{2i} + a_{3}\ln x_{3i} + \frac{1}{2}a_{11}(\ln x_{1i})^{2} \\
          &+ \frac{1}{2}a_{22}(\ln x_{2i})^{2} + \frac{1}{2}a_{33}(\ln x_{3i})^{2} + a_{12}\ln x_{1i}\ln x_{2i} \\
          &+ a_{13}\ln x_{1i}\ln x_{3i} + a_{23}\ln x_{2i}\ln x_{3i} + u_{i} - v_{i}
          \end{aligned}
          $$


          Code:
          * TAKE THE LOGS OF YOUR VARIABLES (NOT OF THE ENVIRONMENTAL VARIABLES!)
          foreach var in y x1 x2 x3 {
              gen double l`var' = ln(`var')
          }

          * MULTIPLY SQUARED TERMS BY 0.5
          foreach var in x1 x2 x3 {
              gen double l`var's = 0.5*(l`var')^2
          }

          * RUN THE MODEL
          frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
              if group1, distribution(tnormal) cm(z1-z4)
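
          Check 2) can then be done with a likelihood-ratio test of the Cobb-Douglas restriction against the translog above. This is a sketch that assumes both models converge on the same estimation sample; the stored-estimate names cd and tl are hypothetical.

          Code:
          * restricted model: Cobb-Douglas (no second-order terms)
          frontier ly lx1 lx2 lx3 if group1, distribution(tnormal) cm(z1-z4)
          estimates store cd

          * unrestricted model: translog
          frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
              if group1, distribution(tnormal) cm(z1-z4)
          estimates store tl

          * H0: all six second-order coefficients are zero
          lrtest cd tl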



          • #6
            Andrew,

            First, thank you for the reply! I tested the production function in the translog, modified translog (following Coelli et al. (2003)), and Cobb-Douglas specifications, but I observed the same problem in all estimates. The statistical tests always indicated left skewness, as expected. So everything leads me to believe that there is some consistency problem in the database.



            • #7
              Hello Andrew,

              Thanks for your post. I would like to implement the Huang et al. (2014) approach using a cost function for panel data, following the model (with U+V) you specified in #5.

              I have specified the code following your approach:

              sfpanel ly lx1 lx2 lx3 lx4 c.lx1#c.lx2 c.lx1#c.lx3 c.lx1#c.lx4 c.lx2#c.lx3 c.lx2#c.lx4 c.lx3#c.lx4 lx1s lx2s lx3s lx4s if group1, model(bc92) dist(tn) cost
              predict ce1 if group1, bc
              predict xb1 if group1, xb

              and so on for all groups.

              1. My understanding from the paper is that te_meta, as you predicted it above, is the TGR, which you multiply by the group-specific TE (in this case cost efficiency, ce_group) to obtain the meta cost efficiency (ce_meta). Am I wrong?

              2. My problem is also that sfpanel with model(bc92) does not allow including the group-specific and meta environmental variables using emean or cmean. Using 'frontier' as you indicated above, or 'xtfrontier', does not achieve convergence. The sfpanel model(bc95) approach is not able to fit my model, as it gives missing results.

              3. Can I include the environmental variables in the level equation for sfpanel model(bc92)? When I do so, I am able to estimate the scores. However, when I summarize the descriptive statistics, ce_meta is not equal to TGR*ce_group.

              I will be grateful for your advice.

              Thank you.






              • #8
                sfpanel is from SSC. The default when estimating the conditional mean model in -frontier- is bc95. Sorry for my error in #2: the conditional mean model is not allowed for model(bc92). Therefore, change the model to bc95 in sfpanel and include the environmental variables using the -emean()- option. Convergence problems are common when fitting these models and, as with other maximum likelihood estimations, there are no guarantees, but you can estimate a simpler model first and use its estimates as starting values for the model with convergence problems. Sometimes this helps. Yes, the predicted values for the groups are the outcome when estimating the metafrontier.
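
                With -frontier-, the starting-values idea could look like the following. This is a sketch under assumptions: it uses the variables generated in #5, takes a Cobb-Douglas model with the same inefficiency specification as the simpler model, and passes its estimates through the from() maximize option; b0 is a hypothetical matrix name. Parameters of the larger model that are not in b0 keep their default starting values. Check the help file for how starting values are handled with distribution(tnormal) in your Stata version.

                Code:
                * simpler model first: Cobb-Douglas
                frontier ly lx1 lx2 lx3 if group1, distribution(tnormal) cm(z1-z4)
                matrix b0 = e(b)

                * full translog, started from the simpler model's estimates
                frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
                    if group1, distribution(tnormal) cm(z1-z4) from(b0, skip) difficult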
                Last edited by Andrew Musau; 09 Dec 2020, 17:26.



                • #9
                  Hello Andrew, thanks a lot for your quick response and clarification. I get it now. I appreciate it.



                  • #10
                    Dear Andrew
                    I have run the following command several times and got the outcome below. At the same time, I used FRONTIER 4.1 and got a reasonable output. Can you tell me why I am not getting the same output in sfpanel?

                    This is the output I got in FRONTIER 4.1:
                    Variable Parameter     coefficient  standard-error    t-ratio
                             Beta_0             -1.023           0.525     -1.947
                    LOAN     Beta_1              3.680           0.574      6.409
                    OTAST    Beta_2             -0.748           0.309     -2.422
                    DEPO     Beta_3             -1.647           0.563     -2.927
                    P1       Beta_4              1.748           0.182      9.624
                    P2       Beta_5              0.025           0.120      0.211
                    TREND    Beta_6             -0.090           0.002    -55.374
                    Q1_2     Beta_7              3.018           0.379      7.952
                    Q1Q2     Beta_8              0.056           0.166      0.338
                    Q1Q3     Beta_9             -3.263           0.223    -14.603
                    Q2_2     Beta_10             0.287           0.108      2.665
                    Q2Q3     Beta_11            -0.391           0.048     -8.129
                    Q3_2     Beta_12             3.878           0.154     25.195
                    P1_2     Beta_13             0.172           0.023      7.569
                    P1P2     Beta_14            -0.161           0.021     -7.557
                    P2_1     Beta_15             0.088           0.016      5.369
                    Q1P1     Beta_16            -0.266           0.079     -3.341
                    Q1P2     Beta_17            -0.097           0.064     -1.515
                    Q2P1     Beta_18            -0.166           0.057     -2.894
                    Q2P2     Beta_19             0.155           0.045      3.421
                    Q3P1     Beta_20             0.433           0.085      5.081
                    Q3P2     Beta_21            -0.086           0.054     -1.602
                             Delta_0           -10.289           2.233     -4.609
                    LN_TA    Delta_1             0.465           0.069      6.709
                    ZSCORE   Delta_2             0.014           0.001     19.683
                    TCAPR    Delta_3             3.017           0.417      7.236
                    ROA      Delta_4            -0.317           0.989     -0.321
                    MKTSH    Delta_5            -4.204           0.703     -5.978
                             sigma-squared       0.040           0.002     16.973
                             gamma               1.000           0.001  1,241.127

                    This is the output I got in sfpanel:

                    sfpanel TCOST LOAN OTAST DEPO P1 P2 K Q1_2 Q1Q2 Q1Q3 Q2_2 Q2Q3 Q3_2 P1_2 P1P2 P2_1 Q1P1 Q1P2 Q2P1 Q2P2 Q3P1 Q3P2, m(bc92) d(tn) emean(LN_TA ZSCORE TCAPR ROA MKTSH) cost

                    Inefficiency effects model (truncated-normal)        Number of obs    =        85
                    Group variable: DMU                                  Number of groups =        11
                    Time variable: YEAR                                  Obs per group: min =       6
                                                                                        avg =     7.7
                                                                                        max =       8

                                                                         Prob > chi2      =         .
                    Log likelihood = -8586.3159                          Wald chi2(0)     =         .

                    ------------------------------------------------------------------------------
                           TCOST |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                    Frontier     |
                            LOAN |  -4.226583          .        .       .            .           .
                           OTAST |  -9.411944          .        .       .            .           .
                            DEPO |  -13.45925          .        .       .            .           .
                              P1 |   35.23126          .        .       .            .           .
                              P2 |    79.8152          .        .       .            .           .
                               K |   29.99498          .        .       .            .           .
                            Q1_2 |   16.07713          .        .       .            .           .
                            Q1Q2 |   36.79187          .        .       .            .           .
                            Q1Q3 |   28.31715          .        .       .            .           .
                            Q2_2 |   21.82485          .        .       .            .           .
                            Q2Q3 |   35.16767          .        .       .            .           .
                            Q3_2 |   17.75145          .        .       .            .           .
                            P1_2 |   100.8052          .        .       .            .           .
                            P1P2 |   445.8224          .        .       .            .           .
                            P2_1 |   506.8234          .        .       .            .           .
                            Q1P1 |  -57.43169          .        .       .            .           .
                            Q1P2 |  -113.1178          .        .       .            .           .
                            Q2P1 |  -72.09528          .        .       .            .           .
                            Q2P2 |  -149.8962          .        .       .            .           .
                            Q3P1 |  -55.12123          .        .       .            .           .
                            Q3P2 |  -111.6336          .        .       .            .           .
                           _cons |    7.18826          .        .       .            .           .
                    -------------+----------------------------------------------------------------
                    Mu           |
                           LN_TA |  -112.6402          .        .       .            .           .
                          ZSCORE |  -673.9753          .        .       .            .           .
                           TCAPR |   .1532187          .        .       .            .           .
                             ROA |   .9564855          .        .       .            .           .
                           MKTSH |    .510407          .        .       .            .           .
                           _cons |  -5.320933          .        .       .            .           .
                    -------------+----------------------------------------------------------------
                    Usigma       |
                           _cons |   199.4999          .        .       .            .           .
                    -------------+----------------------------------------------------------------
                    Vsigma       |
                           _cons |   199.4999          .        .       .            .           .
                    -------------+----------------------------------------------------------------
                         sigma_u |   2.09e+43          .        .       .            .           .
                         sigma_v |   2.09e+43          .        .       .            .           .
                          lambda |          1          .        .       .            .           .
                    ------------------------------------------------------------------------------

                    Do you have any idea?



                    • #11
                      I assume that you are referring to Tim Coelli's program -frontier- that is written in R. sfpanel is from SSC. As I do not use the former, I cannot advise on why you are getting different results. Also, you do not specify which model you are estimating in -frontier-, as it can estimate a wide range of stochastic frontier models. From your sfpanel command, your syntax corresponds to the conditional mean model. However, it appears that you are not taking the logs of your variables before estimating. Note that sfpanel does no transformations of the data, so you must do this yourself.



                      • #12
                        Dear Andrew

                        Thanks for the prompt reply. I use st0315. Since I am using a translog cost function with three outputs and three input prices, all the data have been transformed into logs. I want to estimate the BC95 model, which permits estimating a model for the determinants of inefficiency. I thought emean() provides the ability to estimate the impact of the determinants of inefficiency. Am I using the correct syntax?



                        • #13
                          Take a look at the Stata Journal article that introduces sfpanel and the textbook by Kumbhakar, Wang and Horncastle. If you have input prices, depending on what assumptions you impose, e.g., linear homogeneity, then there are some steps required before estimation. You will find examples on how to do this in Stata.
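
                          As an illustration of the kind of step meant here: linear homogeneity in input prices is commonly imposed by dividing cost and the remaining prices by one numeraire price before taking logs. The sketch below assumes hypothetical raw (unlogged) variables tcost, p1, p2, p3 and outputs q1, q2, q3, with p3 as the numeraire, and uses only first-order terms to stay short; a translog would need its squared and cross terms rebuilt from these normalized logs. The panel and environmental variable names are taken from #10, and the model follows the bc95 advice in #8.

                          Code:
                          * impose homogeneity of degree one in prices: normalize by p3
                          gen double lc  = ln(tcost/p3)
                          gen double lp1 = ln(p1/p3)
                          gen double lp2 = ln(p2/p3)

                          * logs of the outputs
                          foreach v in q1 q2 q3 {
                              gen double l`v' = ln(`v')
                          }

                          * cost frontier with determinants of inefficiency in emean()
                          xtset DMU YEAR
                          sfpanel lc lq1 lq2 lq3 lp1 lp2, model(bc95) dist(tn) ///
                              emean(LN_TA ZSCORE TCAPR ROA MKTSH) cost
                          predict double ce, bc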



                          • #14
                            I tried to estimate the metafrontier following Huang et al. (2014) for the BC92 and BC95 models. However, it came back with these errors: (1) not concave; (2) backed up; (3) could not calculate numerical derivatives - flat or discontinuous region. Could anyone offer suggestions? Thank you.



                            • #15
                              Originally posted by Andrew Musau (#2)

                              Dear Andrew Musau,
                              How can I estimate the elasticity of the dependent variable with respect to each input? Could you please give me code that I can use to estimate the cost elasticity with respect to output?
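
                              For the translog production function in #5, the elasticity of output with respect to an input is the derivative of ln y with respect to that log input. For x1, for example:

                              $$
                              \frac{\partial \ln y_{i}}{\partial \ln x_{1i}} = a_{1} + a_{11}\ln x_{1i} + a_{12}\ln x_{2i} + a_{13}\ln x_{3i}
                              $$

                              A sketch of how this could be computed right after the -frontier- command in #5, assuming the variables generated there (lx1s = 0.5*(lx1)^2, so its coefficient is a11); the names el_x1, el_x2, el_x3, and rts are hypothetical. A cost elasticity with respect to an output follows in the same way from a translog cost function, differentiating with respect to the log output.

                              Code:
                              * observation-level output elasticities from the translog coefficients
                              gen double el_x1 = _b[lx1] + _b[lx1s]*lx1 ///
                                  + _b[c.lx1#c.lx2]*lx2 + _b[c.lx1#c.lx3]*lx3 if e(sample)
                              gen double el_x2 = _b[lx2] + _b[lx2s]*lx2 ///
                                  + _b[c.lx1#c.lx2]*lx1 + _b[c.lx2#c.lx3]*lx3 if e(sample)
                              gen double el_x3 = _b[lx3] + _b[lx3s]*lx3 ///
                                  + _b[c.lx1#c.lx3]*lx1 + _b[c.lx2#c.lx3]*lx2 if e(sample)

                              * the sum of the input elasticities is the returns to scale
                              gen double rts = el_x1 + el_x2 + el_x3
                              summarize el_x1 el_x2 el_x3 rts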

