Hi Statalisters,
I am a novice Stata user, working with Stata 14 on Windows 7.
I'm working on a panel data set of all commercial banks in the U.S. for the period 1995 - 2018 (time variable), so I have data at the bank-year level. I created the ID variable from the variables bank name and cert. I have already calculated four bank risk proxies at the bank-year level: Z-Score, NPA (non-performing assets), LLP (loan loss provisions) and LLR (loan loss reserves).
Having calculated the risk proxy Z_score, I would like to run a binary probability model explaining the occurrence of a bank failure (Failure = 1, Active = 0) with the risk proxy (lagged by one year).
I used this command to code my binary outcome variable as "Failure" = 1 and "Active" = 0.
This is the dataset, with 172,431 observations:
I have panel data, so I started with these commands to run the probit regression. [I forgot to add , vce(cluster id), and I think cformat(%09.0g) pformat(%05.0g) sformat(%08.0g) is irrelevant.]
The binary probability model explains the occurrence of a bank failure (Failure = 1, Active = 0) with the Z_score (lagged by one year).
This is the regression result:
Question 1: The estimation took a long time to finish. I'm working with Stata 14 on Windows 7 with 172,431 observations, but is there a way to run it faster?
Question 2: My "guiding paper" assesses the Z-Score model by its pseudo R2. I know that the "normal" probit regression output shows the pseudo R2, and that there is a way to calculate a pseudo R2 for the xtprobit panel data probit regression. I know that the pseudo R2 is stored in e(r2_p), and I found this calculation: https://www.stata.com/support/faqs/s...ics/r-squared/
Unfortunately, I can't work out how to calculate the pseudo R2 for my xtprobit case.
Concern: The regression results are far from those in my guiding paper, and I think I did something wrong. Maybe with the binary dependent variable status?
Thank you very much for your support!
Code:
merge m:1 cert using `dataset1', assert(match master)

    Result                           # of obs.
    -----------------------------------------
    not matched                       674,977
        from master                   674,977  (_merge==1)
        from using                          0  (_merge==2)

    matched                            23,238  (_merge==3)
    -----------------------------------------

. gen byte status = (_merge == 3)
. label define status 0 "Active" 1 "Failed"
. label values status status
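Because status is built from a bank-level m:1 merge on cert, every yearly record of a matched bank receives status = 1, not only the year of failure. A quick diagnostic sketch, assuming id identifies the same banks as cert and that the data can be declared as a panel with xtset:

```stata
* Sketch only: inspect how the outcome is distributed
* Assumption: id and year uniquely identify observations
xtset id year

* Overall share of bank-years coded as failed
tabulate status

* Between/within breakdown: shows whether status ever changes within a bank
xttab status
```

If xttab reports that status never changes within a bank, the outcome is a time-invariant bank characteristic rather than a failure event in a given year.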
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id year status Z_score)
  7 1995 1 -1.4038005
 10 1995 0  -1.434213
 11 1995 0 -1.5771302
 14 1995 0  -1.758422
 16 1995 0 -1.4295077
 21 1995 0  -1.329172
 27 1995 0 -1.3730284
 32 1995 0 -1.3627455
 34 1995 0  -1.908463
 38 1995 0 -1.8048723
 41 1995 0 -1.5905398
 46 1995 0  -1.533159
 47 1995 0  -1.663955
 48 1995 0  -1.518417
 52 1995 0 -1.3701818
 53 1995 0  -1.485621
 56 1995 0   -1.63249
 59 1995 0  -1.476241
 76 1995 0 -1.3577138
 82 1995 0 -1.3661845
 84 1995 0 -1.3885205
 85 1995 0 -1.5949416
 87 1995 0 -2.0597448
 99 1995 0 -2.2821965
101 1995 0 -1.5937258
104 1995 0 -1.6237373
end
format %ty year
Code:
xtset id year, yearly
       panel variable:  id (unbalanced)
        time variable:  year, 1995 to 2018, but with gaps
                delta:  1 year
Code:
xtprobit status Z_score L.year, re
Code:
Fitting comparison model:

Iteration 0:   log likelihood = -22279.067
Iteration 1:   log likelihood = -20346.952
Iteration 2:   log likelihood = -20275.616
Iteration 3:   log likelihood = -20275.347
Iteration 4:   log likelihood = -20275.347

Fitting full model:

rho =  0.0     log likelihood = -20275.347
rho =  0.1     log likelihood = -14462.788
rho =  0.2     log likelihood = -12201.376
rho =  0.3     log likelihood = -10943.278
rho =  0.4     log likelihood = -10144.299
rho =  0.5     log likelihood = -9623.8447
rho =  0.6     log likelihood = -9287.5278
rho =  0.7     log likelihood = -9127.5396
rho =  0.8     log likelihood = -9205.7192

Iteration 0:   log likelihood = -9047.5173
Iteration 1:   log likelihood = -7511.1642
Iteration 2:   log likelihood = -4988.8547
Iteration 3:   log likelihood = -4534.5219
Iteration 4:   log likelihood = -3701.1825
Iteration 5:   log likelihood =  -3659.677  (not concave)
Iteration 6:   log likelihood = -3595.4789
Iteration 7:   log likelihood = -3595.4789  (backed up)
Iteration 8:   log likelihood = -3564.2206
Iteration 9:   log likelihood = -3557.6508
Iteration 10:  log likelihood = -3557.6302
Iteration 11:  log likelihood = -3557.6302

Random-effects probit regression                Number of obs     =    156,147
Group variable: id                              Number of groups  =     14,692

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =       10.6
                                                              max =         23

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(2)      =     111.93
Log likelihood  = -3557.6302                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      status |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Z_score |   .8909937   .0877437    10.15   0.000     .7190193    1.062968
             |
        year |
         L1. |  -.0197096   .0049606    -3.97   0.000    -.0294322   -.0099869
             |
       _cons |   34.67511    9.94973     3.49   0.000     15.17399    54.17622
-------------+----------------------------------------------------------------
    /lnsig2u |   2.519077   .0220374                      2.475884    2.562269
-------------+----------------------------------------------------------------
     sigma_u |   3.523794   .0388276                      3.448509    3.600723
         rho |   .9254684   .0015201                      .9224338    .9283934
------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 3.3e+04                Prob >= chibar2 = 0.000
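The coefficient table above reports L1. under year, so the command as typed lags the time variable rather than the risk proxy. A minimal sketch of the specification described in the text, assuming the intent is a one-year lag of Z_score with standard errors clustered on the bank identifier (these options are an assumption, not necessarily the guiding paper's exact setup):

```stata
* Sketch only: lag the risk proxy instead of the time variable
* Assumption: id and year are the panel and time variables used above
xtset id year

* L.Z_score is the previous year's Z_score within each bank;
* vce(cluster id) requests standard errors clustered by bank
xtprobit status L.Z_score, re vce(cluster id)
```

Because the panel has gaps, L.Z_score is missing whenever the preceding year is absent for a bank, so the estimation sample may shrink further.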
Code:
regress weight length
predict weightp if e(sample)
corr weight weightp if e(sample)
di r(rho)^2
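The FAQ snippet above squares the correlation between observed and fitted values for a linear regression. For xtprobit, one commonly used alternative is McFadden's pseudo R2, 1 - ll/ll_0, computed from the log likelihoods of the fitted model and a constant-only model. A sketch, assuming a constant-only random-effects probit on the same estimation sample is an acceptable null model; the names insample, ll_full and ll_null are hypothetical:

```stata
* Sketch only: McFadden-style pseudo R2 for a random-effects probit
xtset id year

* Full model; keep its log likelihood and estimation sample
xtprobit status L.Z_score, re
scalar ll_full = e(ll)
generate byte insample = e(sample)

* Constant-only random-effects probit on the same observations
xtprobit status if insample, re
scalar ll_null = e(ll)

display "McFadden pseudo R2 = " 1 - ll_full/ll_null
```

Using a pooled constant-only probit as the null model instead would also attribute the panel-level variance to the fit, so the two choices can give very different values; which one the guiding paper uses is not known here.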