Meta-Frontier Analysis in Stata: Estimation

John Ngombe

Join Date: May 2016

Posts: 13
#1


17 May 2017, 11:27

Does anyone know how to estimate a Meta-Frontier Production Function in Stata? Any syntax ideas would help. The only work I have seen so far about Meta-frontier analysis has been in Shazam Software. Thanks.
Andrew Musau

Join Date: Oct 2014

Posts: 10089
#2

19 May 2017, 07:18

You should have a look at Huang et al. (J Prod Anal 42:241–254, 2014), "A new approach to estimating the metafrontier production function based on a stochastic frontier framework", which criticizes the procedure of Battese et al. (J Prod Anal 21:91–103, 2004) and O’Donnell et al. (Empir Econ 34:231–255, 2008) of using linear (quadratic) programming techniques in the second step. Their suggested procedure is easily implemented in Stata. Given the output "y", three inputs "x1", "x2", and "x3", and environmental variables z1-z8, and assuming a translog functional form, first generate dummies for the groups reflecting different technology possibility sets, e.g., group1, group2, group3. The ensuing syntax is

Code:

frontier lny c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 c.lnx1#c.lnx2 c.lnx1#c.lnx3 ///
    c.lnx2#c.lnx3 if group1, distribution(tnormal) cm(z1-z4)
predict te1 if group1, te
predict xb1 if group1, xb

This is Battese and Coelli's (1992) frontier model for panel data as implemented by Stata's -frontier- command. Repeat the same for the rest of the groups. In the second step, use the predicted values as the outcome for the metafrontier equation:

Code:

gen pred = .
replace pred = xb1 if group1
replace pred = xb2 if group2
replace pred = xb3 if group3
frontier pred c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 c.lnx1#c.lnx2 c.lnx1#c.lnx3 ///
    c.lnx2#c.lnx3, distribution(tnormal) cm(z5-z8)
predict te_meta, te

Note that the environmental variables in the second stage differ from those in the first. Read Huang et al. for reasons why you need to include environmental variables.

https://link.springer.com/article/10...123-014-0402-2
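
The two steps can also be written as a loop over the groups. This is only a minimal sketch under stated assumptions: group1-group3 are 0/1 dummies, the same z1-z4 enter every group frontier, and the names te_group, tgr and mte are made up for illustration. The last line assumes the decomposition discussed in #7 of this thread, where the second-stage efficiency is read as the technology gap ratio (TGR) and the metafrontier efficiency is the product of the TGR and the group efficiency.

```stata
* Step 1: one frontier per group; collect fitted values and group efficiencies
gen double pred     = .
gen double te_group = .
forvalues g = 1/3 {
    frontier lny c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 ///
        c.lnx1#c.lnx2 c.lnx1#c.lnx3 c.lnx2#c.lnx3 if group`g', ///
        distribution(tnormal) cm(z1-z4)
    predict double xb_g if group`g', xb
    predict double te_g if group`g', te
    replace pred     = xb_g if group`g'
    replace te_group = te_g if group`g'
    drop xb_g te_g
}

* Step 2: metafrontier fitted to the pooled group predictions
frontier pred c.lnx1##c.lnx1 c.lnx2##c.lnx2 c.lnx3##c.lnx3 ///
    c.lnx1#c.lnx2 c.lnx1#c.lnx3 c.lnx2#c.lnx3, ///
    distribution(tnormal) cm(z5-z8)
predict double tgr, te

* Assumed decomposition: metafrontier efficiency = TGR x group efficiency
gen double mte = tgr * te_group
summarize te_group tgr mte
```

Because both factors lie between 0 and 1, mte computed this way cannot exceed te_group; if the summary shows otherwise, something went wrong in one of the steps.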

Last edited by Andrew Musau; 19 May 2017, 07:24.
John Ngombe

Join Date: May 2016

Posts: 13
#3

20 May 2017, 16:13

Thanks a lot, Andrew Musau. I will read the work by Huang et al. and follow your guide to see how it works out. Truly appreciated!

Last edited by John Ngombe; 20 May 2017, 16:32.
Pedro Soares

Join Date: Sep 2017

Posts: 4
#4

20 Sep 2017, 20:24

Andrew Musau,

I tried to replicate Huang's methodology using a primary database (a cross-section with 886 observations), but the estimated results were different than expected, with the technical efficiency in the first step smaller than the one estimated in the second stage (meta-frontier). Is it possible that the low R² of the first stage estimation (close to 0.3) might be generating this problem in the second stage efficiency (by way of the variable pred)? Thank you!
Andrew Musau

Join Date: Oct 2014

Posts: 10089
#5

21 Sep 2017, 10:47

Pedro: If you followed exactly the translog syntax in #2, it is not entirely correct because it neglects to multiply the squared terms by 0.5. If not, it is virtually impossible to advise, but here are a few things to check before estimation:

1) Run OLS on your model and check that the residuals have the right skewness (left for a production function, right for cost)
2) Check which functional form is preferred (translog or a reduction to Cobb-Douglas)
3) If cost function, test for cost function properties (monotonicity, concavity, etc.)

Once you are OK with these, then you know that your results will make sense. Otherwise, you may just have data problems. The correct syntax for estimating a translog function involves a few more steps which I present below. Given the model

$$
\begin{aligned}
\ln y_{i} ={}& a_{0} + a_{1}\ln x_{1i} + a_{2}\ln x_{2i} + a_{3}\ln x_{3i} + \frac{1}{2}a_{11}(\ln x_{1i})^{2} \\
&+ \frac{1}{2}a_{22}(\ln x_{2i})^{2} + \frac{1}{2}a_{33}(\ln x_{3i})^{2} + a_{12}\ln x_{1i}\ln x_{2i} \\
&+ a_{13}\ln x_{1i}\ln x_{3i} + a_{23}\ln x_{2i}\ln x_{3i} + u_{i} - v_{i}
\end{aligned}
$$

Code:

* TAKE THE LOGS OF YOUR VARIABLES (NOT OF THE ENVIRONMENTAL VARIABLES!)
foreach var in y x1 x2 x3 {
    gen double l`var' = ln(`var')
}
* MULTIPLY SQUARED TERMS BY 0.5
foreach var in x1 x2 x3 {
    gen double l`var's = 0.5*(l`var')^2
}
* RUN THE MODEL
frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
    if group1, distribution(tnormal) cm(z1-z4)
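
Checks 1) and 2) above could be run roughly as follows. This is a sketch rather than a definitive recipe: it assumes the variables ly, lx1-lx3 and lx1s-lx3s have been generated as in the code above, and the names ehat, translog and cobbdouglas are made up for illustration.

```stata
* Check 1: OLS residuals should be left-skewed for a production frontier
regress ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 if group1
predict double ehat if e(sample), residuals
summarize ehat, detail
display "OLS residual skewness = " r(skewness)
sktest ehat

* Check 2: translog against Cobb-Douglas by a likelihood-ratio test
frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
    if group1, distribution(tnormal) cm(z1-z4)
estimates store translog
frontier ly lx1 lx2 lx3 if group1, distribution(tnormal) cm(z1-z4)
estimates store cobbdouglas
lrtest translog cobbdouglas
```

A negative skewness is what a production frontier calls for; a small p-value from lrtest favours the translog over the Cobb-Douglas restriction.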
Pedro Soares

Join Date: Sep 2017

Posts: 4
#6

25 Sep 2017, 11:14

Andrew,

First, thank you for the reply! I tested the production function in the translog, modified translog (following Coelli et al. (2003)) and Cobb-Douglas specifications, but I observed the same problem in all estimates. The statistical tests always indicated left skewness, as expected. So everything leads me to believe that there is some consistency problem in the database.
Richard Adjei

Join Date: Dec 2020

Posts: 2
#7

09 Dec 2020, 03:56

Hello Andrew,

Thanks for your post. I would like to implement the Huang et al. (2014) approach using a cost function for panel data, following the model (with U+V) you specified in #5.

I have specified the code following your approach:

sfpanel ly lx1 lx2 lx3 lx4 c.lx1#c.lx2 c.lx1#c.lx3 c.lx1#c.lx4 c.lx2#c.lx3 c.lx2#c.lx4 c.lx3#c.lx4 lx1s lx2s lx3s lx4s if group1, model(bc92) dist(tn) cost
predict ce1 if group1, bc
predict xb1 if group1, xb

and so on for all groups.

1. My understanding from the paper is that the estimated te_meta, as you predicted above, is the TGR, which you multiply by the group-specific TE (in this case cost efficiency, ce_group) to obtain the meta cost efficiency (ce_meta). Am I wrong?

2. My problem is also that sfpanel with model(bc92) does not allow me to include the group-specific and meta environmental variables using emean or cmean. Using -frontier- as you indicated above, or -xtfrontier-, does not achieve convergence. The sfpanel model(bc95) approach is not able to fit my model, as it gives missing results.

3. Can I include the environmental variables in the level equation for sfpanel with model(bc92)? When I do so, I am able to estimate the scores. However, when I summarize the descriptive statistics, ce_meta is not equal to TGR*ce_group.

I will be grateful for your advice.

Thank you.
Andrew Musau

Join Date: Oct 2014

Posts: 10089
#8

09 Dec 2020, 16:12

sfpanel is from SSC. The default when estimating the conditional mean model in frontier is bc95. Sorry for my error in #2; the conditional mean model is not allowed for model(bc92). Therefore, change the model to bc95 in sfpanel and include the environmental variables using the -emean()- option. It is common to have convergence problems when fitting these models and, as with other maximum likelihood estimators, there are no guarantees, but you can estimate a simpler model first and use its estimates as starting values for the model with convergence problems. Sometimes this helps. Yes, the predicted values for the groups are the outcome when estimating the metafrontier.
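
The starting-value idea might look like the sketch below. It uses Stata's official -frontier- rather than sfpanel and assumes the variable names from #5; the matrix name b0 is made up. from() and difficult are standard maximize options, and the assumption here is that the coefficient names of the simpler model match those of the larger one so the values can be copied by name. Whether this actually cures a given convergence problem is not guaranteed.

```stata
* Simpler model first: Cobb-Douglas with the same inefficiency specification
frontier ly lx1 lx2 lx3 if group1, distribution(tnormal) cm(z1-z4)
matrix b0 = e(b)

* Full translog, started from the simpler model's estimates (matched by name);
* parameters not present in b0 keep their default starting values
frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
    if group1, distribution(tnormal) cm(z1-z4) from(b0, skip) difficult
```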

Last edited by Andrew Musau; 09 Dec 2020, 16:26.
Richard Adjei

Join Date: Dec 2020

Posts: 2
#9

15 Dec 2020, 00:36

Hello Andrew, thanks a lot for your quick response and clarification. I get it now. Much appreciated.

Lalith Seelanatha

Join Date: Jan 2021
Posts: 3

#10

16 Jan 2021, 20:54

Dear Andrew
I have run the following command several times and got the following outcome. At the same time, I used FRONTIER 4.1 and got reasonable output. Can you tell me why I am not getting the same output from sfpanel?

This is the output I got from FRONTIER 4.1:

Variable  Parameter       Coefficient  Standard error     t-ratio
          Beta_0               -1.023           0.525      -1.947
LOAN      Beta_1                3.680           0.574       6.409
OTAST     Beta_2               -0.748           0.309      -2.422
DEPO      Beta_3               -1.647           0.563      -2.927
P1        Beta_4                1.748           0.182       9.624
P2        Beta_5                0.025           0.120       0.211
TREND     Beta_6               -0.090           0.002     -55.374
Q1_2      Beta_7                3.018           0.379       7.952
Q1Q2      Beta_8                0.056           0.166       0.338
Q1Q3      Beta_9               -3.263           0.223     -14.603
Q2_2      Beta_10               0.287           0.108       2.665
Q2Q3      Beta_11              -0.391           0.048      -8.129
Q3_2      Beta_12               3.878           0.154      25.195
P1_2      Beta_13               0.172           0.023       7.569
P1P2      Beta_14              -0.161           0.021      -7.557
P2_1      Beta_15               0.088           0.016       5.369
Q1P1      Beta_16              -0.266           0.079      -3.341
Q1P2      Beta_17              -0.097           0.064      -1.515
Q2P1      Beta_18              -0.166           0.057      -2.894
Q2P2      Beta_19               0.155           0.045       3.421
Q3P1      Beta_20               0.433           0.085       5.081
Q3P2      Beta_21              -0.086           0.054      -1.602
          Delta_0             -10.289           2.233      -4.609
LN_TA     Delta_1               0.465           0.069       6.709
ZSCORE    Delta_2               0.014           0.001      19.683
TCAPR     Delta_3               3.017           0.417       7.236
ROA       Delta_4              -0.317           0.989      -0.321
MKTSH     Delta_5              -4.204           0.703      -5.978
          sigma-squared         0.040           0.002      16.973
          gamma                 1.000           0.001   1,241.127

This is the output I got from sfpanel:

sfpanel TCOST LOAN OTAST DEPO P1 P2 K Q1_2 Q1Q2 Q1Q3 Q2_2 Q2Q3 Q3_2 P1_2 P1P2 P2_1 Q1P1 Q1P2 Q2P1 Q2P2 Q3P1 Q3P2, m(bc92) d(tn) emean(LN_TA ZSCORE TCAPR ROA MKTSH) cost

Inefficiency effects model (truncated-normal)   Number of obs      =        85
Group variable: DMU                             Number of groups   =        11
Time variable: YEAR                             Obs per group: min =         6
                                                               avg =       7.7
                                                               max =         8

                                                Prob > chi2        =         .
Log likelihood = -8586.3159                     Wald chi2(0)       =         .

------------------------------------------------------------------------------
TCOST | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Frontier |
LOAN | -4.226583 . . . . .
OTAST | -9.411944 . . . . .
DEPO | -13.45925 . . . . .
P1 | 35.23126 . . . . .
P2 | 79.8152 . . . . .
K | 29.99498 . . . . .
Q1_2 | 16.07713 . . . . .
Q1Q2 | 36.79187 . . . . .
Q1Q3 | 28.31715 . . . . .
Q2_2 | 21.82485 . . . . .
Q2Q3 | 35.16767 . . . . .
Q3_2 | 17.75145 . . . . .
P1_2 | 100.8052 . . . . .
P1P2 | 445.8224 . . . . .
P2_1 | 506.8234 . . . . .
Q1P1 | -57.43169 . . . . .
Q1P2 | -113.1178 . . . . .
Q2P1 | -72.09528 . . . . .
Q2P2 | -149.8962 . . . . .
Q3P1 | -55.12123 . . . . .
Q3P2 | -111.6336 . . . . .
_cons | 7.18826 . . . . .
-------------+----------------------------------------------------------------
Mu |
LN_TA | -112.6402 . . . . .
ZSCORE | -673.9753 . . . . .
TCAPR | .1532187 . . . . .
ROA | .9564855 . . . . .
MKTSH | .510407 . . . . .
_cons | -5.320933 . . . . .
-------------+----------------------------------------------------------------
Usigma |
_cons | 199.4999 . . . . .
-------------+----------------------------------------------------------------
Vsigma |
_cons | 199.4999 . . . . .
-------------+----------------------------------------------------------------
sigma_u | 2.09e+43 . . . . .
sigma_v | 2.09e+43 . . . . .
lambda | 1 . . . . .
------------------------------------------------------------------------------
Do you have any idea?


Andrew Musau

Join Date: Oct 2014

Posts: 10089
#11

17 Jan 2021, 01:25

I assume that you are referring to Tim Coelli's program -frontier- that is written in R. sfpanel is from SSC. As I do not use the former, I cannot advise on why you are getting different results. Also, you do not specify what model you are estimating in -frontier-, as it can estimate a wide range of stochastic frontier models. From your sfpanel command, your syntax corresponds to the conditional mean model. However, it appears that you are not taking the logs of your variables before estimating. Note that sfpanel does no transformations of the data, so you must do this yourself.
Lalith Seelanatha

Join Date: Jan 2021

Posts: 3
#12

18 Jan 2021, 06:23

Dear Andrew

Thanks for the prompt reply. I use st0315. Since I am using a translog cost function with three outputs and three input prices, all the data have been transformed into logs. I want to estimate the BC95 model, which permits the estimation of a model for the determinants of inefficiency. I thought emean() provides the ability to estimate the impact of the determinants of inefficiency. Am I using the correct syntax?
Andrew Musau

Join Date: Oct 2014

Posts: 10089
#13

18 Jan 2021, 07:33

Take a look at the Stata Journal article that introduces sfpanel and the textbook by Kumbhakar, Wang and Horncastle. If you have input prices, depending on what assumptions you impose, e.g., linear homogeneity, then there are some steps required before estimation. You will find examples on how to do this in Stata.
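
As one example of such a step, linear homogeneity in input prices is commonly imposed by dividing cost and the remaining prices by one of the prices before taking logs. A minimal sketch with hypothetical variable names (cost, input prices p1-p3 with p3 as the numeraire, outputs q1-q3), none of which come from the posts above:

```stata
* Normalise cost and input prices by the numeraire price p3, then take logs
gen double lc  = ln(cost/p3)
gen double lp1 = ln(p1/p3)
gen double lp2 = ln(p2/p3)

* Outputs enter in plain logs
foreach v in q1 q2 q3 {
    gen double l`v' = ln(`v')
}
```

The translog terms (squares times 0.5 and cross products) would then be built from lp1, lp2 and lq1-lq3, so the numeraire price never appears as a separate regressor.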
Fung Kwan

Join Date: May 2021

Posts: 1
#14

05 May 2021, 05:50

I tried to estimate the meta-frontier following Huang et al. (2014) for the BC92 and BC95 models. However, I got these messages: 1. not concave; 2. backed up; 3. could not calculate numerical derivatives - flat or discontinuous region. Could anyone give a suggestion? Thank you.
Fadi Ansar

Join Date: May 2021

Posts: 120
#15

08 Feb 2022, 09:46

Originally posted by Andrew Musau (#2)

Dear Andrew Musau
How can I estimate the elasticity of the dependent variable with respect to each input? Can you please give me code that I can use to estimate the cost elasticity with respect to output?
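
For the translog production model written out in #5, the elasticity with respect to an input is the derivative of the log output with respect to that log input, for example

$$
\frac{\partial \ln y_{i}}{\partial \ln x_{1i}} = a_{1} + a_{11}\ln x_{1i} + a_{12}\ln x_{2i} + a_{13}\ln x_{3i}
$$

A minimal sketch of computing this per observation, assuming the variables were generated as in #5 (so lx1s = 0.5*lx1^2 and the coefficient on lx1s is a_11); the names el_x1, el_x2, el_x3 and rts are made up. For a cost function, the same derivative taken with respect to a log output would give the cost elasticity of that output.

```stata
frontier ly lx1 lx2 lx3 lx1s lx2s lx3s c.lx1#c.lx2 c.lx1#c.lx3 c.lx2#c.lx3 ///
    if group1, distribution(tnormal) cm(z1-z4)

* Observation-specific output elasticities from the frontier coefficients
gen double el_x1 = _b[lx1] + _b[lx1s]*lx1 + _b[c.lx1#c.lx2]*lx2 ///
    + _b[c.lx1#c.lx3]*lx3 if e(sample)
gen double el_x2 = _b[lx2] + _b[lx2s]*lx2 + _b[c.lx1#c.lx2]*lx1 ///
    + _b[c.lx2#c.lx3]*lx3 if e(sample)
gen double el_x3 = _b[lx3] + _b[lx3s]*lx3 + _b[c.lx1#c.lx3]*lx1 ///
    + _b[c.lx2#c.lx3]*lx2 if e(sample)

* Sum of the input elasticities (scale elasticity)
gen double rts = el_x1 + el_x2 + el_x3
summarize el_x1 el_x2 el_x3 rts
```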
