  • Double Bootstrap DEA (Simar & Wilson, 2007)

    Dear members,

    I would like some help with a model I have decided to use. I still have some doubts and have not found much on the forum, so I hope someone can guide me.

    I attach the main data and the do-file to this post. I want to apply the Simar & Wilson (2007) double bootstrap to analyse the efficiency of schools, using simarwilson. I have installed it, along with "ftruncreg".

    I have 4 outputs, 5 inputs, and 9 environmental variables (some of them dummy variables). Only those 9 enter as independent variables. The model is input-oriented with variable returns to scale and uses algorithm 2, meaning the efficiency scores are computed internally.

    I use only one year, selected with an if condition. I added the following options: twosided reps(1000) bcreps(100) invert tebc(eff_vrs_o) level(95) dots

    My questions concern partly the code and partly the interpretation of the output. I have read the help file but still have some doubts. I added twosided because notwosided is not recommended with algorithm 2. Since the command sets nounit internally, I did not add that option myself and simply let the model run.

    I added invert, so the estimated efficiency scores are inverted: larger scores indicate greater inefficiency in the input-oriented model (which is what I have).

    - Does the truncated regression use only the environmental variables as independent variables, or does it treat all variables (including outputs and inputs) as independent?

    - Do the estimates obtained after the second bootstrap loop (the bias-corrected estimates from the truncated regression) still need to be inverted, or are those values already the final ones?

    - Once I add my last input (school_size), the bootstrap runs for a long time (more than 2 hours) and Stata eventually stops responding. What is happening? Does anyone have an explanation?

    - The Stata output shows "inefficient if eff_vrs_o > 1". When I summarize the bias-corrected scores, all values, including the minimum, are above 1. This confuses me a little because, unless I have misread it, the help file says: "...for (regular) scores within (0,1], the default (twosided) is to use a two-sided truncated regression model and to sample from the two-sided truncated normal distribution. With twosided, the procedure hence considers that input-oriented (Farrell) efficiency scores are not only less than or equal to 1 but also strictly positive..."

    I would really appreciate help from anyone who has used or worked with this model.

    Thank you


  • #2
    Dear Simona,
    just a quick answer to your questions regarding simarwilson. (See the log file below that illustrates some of my arguments).

    1. Unlike regression analysis, DEA does not involve the concepts of dependent and independent variables. Inputs and outputs do not enter the truncated regression. Therefore, there is no direct reason to consider them as independent. Since inefficiency affects the quantities of inputs and outputs, they cannot (all) be independent.

    2. If you specify the invert option, simarwilson does the inversion (switching from Farrell to Shephard efficiency) internally, and there is no need to manually transform the efficiency scores - provided the Shephard measure is really what you want.

    3. I added school_size as a third input to the model. With all other code left unchanged, the run took 108 seconds on my laptop (see below). Hence, I have no idea what causes the excessive run time on your machine.

    4. Getting scores that are consistently greater than one is just the consequence of specifying the invert option (and calculating bias-corrected scores that do not take the value of exactly one). If you simply remove "invert" from your code, you will get estimated scores that are all within the unit interval. The "notwosided" option is irrelevant to the DEA, and thus to the scores that the DEA yields, but only affects the parametric bootstrap of the truncated regression.
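
    To illustrate points 1 and 4, here is a minimal sketch, on simulated data, of a two-sided truncated regression in which only environmental variables appear on the right-hand side. It uses Stata's built-in truncreg rather than simarwilson, so it shows only the regression stage, not the DEA or the bootstrap. The variable names z1, z2, and score and all parameter values are made up for illustration.

```stata
* Simulated data: hypothetical scores driven by two environmental variables
* (z1, z2, score and all coefficients are assumptions, not from the thread)
clear
set seed 12345
set obs 500
generate double z1 = rnormal()
generate byte z2 = runiform() < 0.5
generate double score = 0.6 + 0.05*z1 - 0.08*z2 + rnormal(0, 0.1)

* Two-sided truncation: only scores strictly inside (0,1) enter the estimation;
* no input or output quantities appear among the regressors
truncreg score z1 z2, ll(0) ul(1)
```

    Dropping ul(1) would give the one-sided counterpart, which is roughly the distinction the twosided and notwosided options draw for the regression stage.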

    I hope my answer is of at least some value to you.
    Best wishes,
    Harald

    Code:
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    name: <unnamed>
    log: C:\...\Response_SimonaFerraro_statalist.log
    log type: text
    opened on: 21 Feb 2024, 12:37:10
    
    . * Double Bootstrap DEA (Simar & Wilson, 2007)
    .
    . * simarwilson conducts the internal efficiency analysis using Alg 2.
    . * If the DEA is carried out internally, simarwilson internally sets nounit (inefficiency if eff score < 1) and I do not need to add it.
    .
    . * By invert (Shephard and not Farrell eff measure). All estimated efficiency scores are inverted, scores larger than one indicate inefficiency for the input-oriented.
    . * notwosided is not recommended with algorithm 2
    .
    . ** Set seed to ensure replicability
    . set seed 19023892
    
    .
    . ** Load Data
    . use Main_data.dta, clear
    
    .
    . ** Original code of Simona Ferraro
    . simarwilson (matemaatika eesti_keel continuing_studies reverse_dropout = teacher_training teacher_qualification ) median_income keel_oige typee municipality state linnakool maa
    > kool tallinn tartu if year_==2020, algorithm(2) twosided unit rts(vrs) base(in) reps(1000) bcreps(100) invert tebc(eff_vrs_o) level(95) dots
    
    Bootstrap (bias correction) replications (100)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    
    Bootstrap (conf. intervals) replications (1000)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    .................................................. 150
    .................................................. 200
    .................................................. 250
    .................................................. 300
    .................................................. 350
    .................................................. 400
    .................................................. 450
    .................................................. 500
    .................................................. 550
    .................................................. 600
    .................................................. 650
    .................................................. 700
    .................................................. 750
    .................................................. 800
    .................................................. 850
    .................................................. 900
    .................................................. 950
    .................................................. 1000
    
    Simar & Wilson (2007) eff. analysis Number of obs = 348
    (algorithm #2) Number of efficient DMUs = 0
    Number of bootstr. reps = 1000
    Wald chi2(9) = 50.96
    inefficient if eff_vrs_o > 1 Prob > chi2(9) = 0.0000
    
    ------------------------------------------------------------------------------
    Data Envelopment Analysis: Number of DMUs = 348
    Number of ref. DMUs = 348
    input oriented (Shephard) Number of outputs = 4
    variable returns to scale Number of inputs = 2
    bias corrected efficiency measure Number of reps (bc) = 100
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Percentile
    inefficiency |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    eff_vrs_o    |
    median_inc~e |  -.0076518   .0023666    -3.23   0.001    -.0123481   -.0031095
       keel_oige |   .0015061   .0009857     1.53   0.127    -.0002973     .003461
           typee |  -.0049645   .0336034    -0.15   0.883    -.0656221    .0655787
    municipality |   .1804942   .0592178     3.05   0.002     .0724297    .3010025
           state |   .1370724   .2032406     0.67   0.500    -.2836536     .531626
       linnakool |   .2078946   .0966011     2.15   0.031     .0324536    .4192952
         maakool |   .0385492   .0962633     0.40   0.689     -.138435     .248432
         tallinn |   .2810482   .0944552     2.98   0.003     .1110525     .484972
           tartu |   .1959864   .1033036     1.90   0.058      .004681    .4265907
           _cons |   1.459573   .1722798     8.47   0.000     1.083181    1.767136
    -------------+----------------------------------------------------------------
          /sigma |   .2476732   .0098829    25.06   0.000     .2254373    .2642594
    ------------------------------------------------------------------------------
    
    .
    . ** Descriptive statistics for the bias-corrected efficiency scores
    . sum eff_vrs_o
    
        Variable |        Obs        Mean   Std. dev.        Min        Max
    -------------+---------------------------------------------------------
       eff_vrs_o |        348    1.705392     .263333   1.063821   2.270815
    
    .
    . ** Add school_size as additional input variable
    . cap drop eff_vrs_o
    
    . timer clear 1
    
    . timer on 1
    
    . simarwilson (matemaatika eesti_keel continuing_studies reverse_dropout = teacher_training teacher_qualification school_size) /*
    > */ median_income keel_oige typee municipality state linnakool maakool tallinn tartu if year_==2020, algorithm(2) twosided unit rts(vrs) base(in) reps(1000) bcreps(100) invert t
    > ebc(eff_vrs_o) level(95) dots
    
    Bootstrap (bias correction) replications (100)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    
    Bootstrap (conf. intervals) replications (1000)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    .................................................. 150
    .................................................. 200
    .................................................. 250
    .................................................. 300
    .................................................. 350
    .................................................. 400
    .................................................. 450
    .................................................. 500
    .................................................. 550
    .................................................. 600
    .................................................. 650
    .................................................. 700
    .................................................. 750
    .................................................. 800
    .................................................. 850
    .................................................. 900
    .................................................. 950
    .................................................. 1000
    
    Simar & Wilson (2007) eff. analysis Number of obs = 348
    (algorithm #2) Number of efficient DMUs = 0
    Number of bootstr. reps = 1000
    Wald chi2(9) = 81.44
    inefficient if eff_vrs_o > 1 Prob > chi2(9) = 0.0000
    
    ------------------------------------------------------------------------------
    Data Envelopment Analysis: Number of DMUs = 348
    Number of ref. DMUs = 348
    input oriented (Shephard) Number of outputs = 4
    variable returns to scale Number of inputs = 3
    bias corrected efficiency measure Number of reps (bc) = 100
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Percentile
    inefficiency |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    eff_vrs_o    |
    median_inc~e |  -.0059043   .0023091    -2.56   0.011    -.0104083   -.0015038
       keel_oige |   .0011902   .0009883     1.20   0.228    -.0007657    .0032026
           typee |  -.0457486   .0332903    -1.37   0.169    -.1085907    .0182037
    municipality |   .1840001   .0585082     3.14   0.002     .0686036    .3069073
           state |   .1015493   .1937717     0.52   0.600    -.3137163    .4680878
       linnakool |   .2137572   .0911967     2.34   0.019      .043333    .3927926
         maakool |  -.0189099   .0912014    -0.21   0.836    -.2001557    .1545485
         tallinn |   .2527479   .0904463     2.79   0.005       .07103     .424427
           tartu |   .1791965   .0978454     1.83   0.067    -.0097767    .3851756
           _cons |   1.501424    .168919     8.89   0.000     1.170017    1.834643
    -------------+----------------------------------------------------------------
          /sigma |   .2453481   .0094185    26.05   0.000     .2230933    .2604626
    ------------------------------------------------------------------------------
    
    . timer off 1
    
    . timer list 1
    1: 107.81 / 1 = 107.8120
    
    .
    . ** Same as the original code just WITHOUT option INVERT
    . cap drop eff_vrs_o
    
    . simarwilson (matemaatika eesti_keel continuing_studies reverse_dropout = teacher_training teacher_qualification ) median_income keel_oige typee municipality state linnakool maa
    > kool tallinn tartu if year_==2020, algorithm(2) twosided unit rts(vrs) base(in) reps(1000) bcreps(100) tebc(eff_vrs_o) level(95) dots
    
    Bootstrap (bias correction) replications (100)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    
    Bootstrap (conf. intervals) replications (1000)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    .................................................. 50
    .................................................. 100
    .................................................. 150
    .................................................. 200
    .................................................. 250
    .................................................. 300
    .................................................. 350
    .................................................. 400
    .................................................. 450
    .................................................. 500
    .................................................. 550
    .................................................. 600
    .................................................. 650
    .................................................. 700
    .................................................. 750
    .................................................. 800
    .................................................. 850
    .................................................. 900
    .................................................. 950
    .................................................. 1000
    
    Simar & Wilson (2007) eff. analysis Number of obs = 348
    (algorithm #2) Number of efficient DMUs = 0
    Number of bootstr. reps = 1000
    inefficient if eff_vrs_o < 1 Wald chi2(9) = 61.58
    twosided truncation Prob > chi2(9) = 0.0000
    
    ------------------------------------------------------------------------------
    Data Envelopment Analysis: Number of DMUs = 348
    Number of ref. DMUs = 348
    input oriented (Farrell) Number of outputs = 4
    variable returns to scale Number of inputs = 2
    bias corrected efficiency measure Number of reps (bc) = 100
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Percentile
      efficiency |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    eff_vrs_o    |
    median_inc~e |   .0022289   .0007628     2.92   0.003     .0007148    .0038124
       keel_oige |  -.0008111   .0002959    -2.74   0.006     -.001378   -.0002127
           typee |  -.0014844   .0104352    -0.14   0.887    -.0212229    .0199318
    municipality |   -.060399   .0178857    -3.38   0.001    -.0979019   -.0261534
           state |  -.2299003   .0614881    -3.74   0.000    -.3518595    -.104871
       linnakool |  -.0594728   .0273929    -2.17   0.030    -.1110802   -.0069498
         maakool |  -.0139055   .0277085    -0.50   0.616    -.0674768    .0412561
         tallinn |  -.1040808   .0282378    -3.69   0.000    -.1574813   -.0491706
           tartu |  -.0839063   .0321504    -2.61   0.009     -.144252   -.0207181
           _cons |   .6742883   .0544838    12.38   0.000       .57141    .7807546
    -------------+----------------------------------------------------------------
          /sigma |   .0805208   .0029744    27.07   0.000     .0734268    .0849844
    ------------------------------------------------------------------------------
    
    .
    . ** Descriptive statistics for the bias-corrected efficiency scores
    . sum eff_vrs_o
    
        Variable |        Obs        Mean   Std. dev.        Min        Max
    -------------+---------------------------------------------------------
       eff_vrs_o |        348    .5496639    .0878639   .2806361    .896414
    
    . log close
    name: <unnamed>
    log: C:\...\Response_SimonaFerraro_statalist.log
    log type: text
    closed on: 21 Feb 2024, 12:39:46
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Last edited by Harald Tauchmann; 21 Feb 2024, 05:57.
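
    A small sketch of what inversion does to the scale of the scores, assuming the textbook relation Shephard = 1/Farrell for input-oriented measures. The data are made up, and farrell and shephard are hypothetical variable names, not output of simarwilson. Note that the bias-corrected scores from the two runs in the log above are evidently not exact reciprocals of each other: the largest non-inverted score is .896414, whose reciprocal is about 1.116, while the smallest inverted score is 1.063821. Inverting bias-corrected scores by hand will therefore not necessarily reproduce what the invert option yields.

```stata
* Hypothetical Farrell input-oriented scores inside (0,1] (made-up values)
clear
set obs 5
generate double farrell = 0.20 + 0.15*_n

* Shephard measure taken as the reciprocal of the Farrell measure (assumption)
generate double shephard = 1/farrell

* Compare the two scales side by side
summarize farrell shephard

* Scores in (0,1] map to values of at least 1 after inversion
assert farrell > 0 & farrell <= 1
assert shephard >= 1
```

    After a run with invert, something like count if eff_vrs_o < 1 would be a quick way to confirm that no score falls on the unexpected side of one.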
