Dear Statalisters,
My dataset contains 457 teachers (IDprof) in 41 schools (IDescola) in a panel structure with T=5 (year). However, the panel is strongly unbalanced (550 records for 457 unique teachers).
For each teacher, I have a set of ordered categorical dependent variables related to teaching practices (q010 q012 q013 q014 q015) and a set of explanatory variables related to school and student characteristics ($ControlVar).
Below is part of the dataset, in case you want to try it yourself.
----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(q010 q012 q013 q014 q015 e029 e025 e024 e027 e026 q036 q038 q037 q039 q044 q045 q046) float(DiD time treated) byte grade int year long IDescola
5 4 2 2 4 4 1 1 3 1 1 1 1 1 1 1 1 0 0 1 3 2007 35913005
4 4 3 4 3 2 3 1 1 1 1 1 1 .a 2 1 .a 0 0 0 3 2006 35283685
2 4 4 4 3 2 3 1 1 1 1 1 2 2 3 2 2 0 0 0 3 2007 35283685
4 5 5 5 5 4 3 1 3 1 1 1 1 1 2 1 1 0 0 0 1 2005 35088614
3 4 3 3 4 4 3 1 3 1 1 1 1 2 2 .a 2 0 0 0 1 2005 35088614
5 5 4 5 5 4 3 2 3 1 1 1 1 1 2 1 1 0 0 0 2 2006 35071122
5 5 5 5 5 4 3 2 3 1 1 1 1 1 1 2 1 0 0 0 1 2005 35071122
3 4 3 3 3 1 3 1 3 1 3 1 3 2 2 2 2 0 0 0 1 2005 35083860
5 5 4 5 5 2 3 1 1 1 2 2 3 2 2 3 3 0 1 0 4 2008 35283685
5 4 5 4 5 3 3 1 3 1 .a .a .a .a .a .a .a 0 0 1 3 2007 35018399
5 5 4 3 5 .a 3 1 3 1 2 3 3 3 2 3 3 0 0 0 1 2005 35059122
4 4 .a .a 4 3 1 1 3 1 1 1 .a 1 1 1 1 0 0 1 1 2005 35905446
5 5 5 4 5 4 3 1 3 3 1 1 1 1 1 2 1 0 0 1 2 2006 35901124
.a 5 5 5 5 2 1 1 3 1 .a .a .a .a .a .a .a 0 0 1 2 2005 35042648
5 5 5 5 5 3 1 1 3 1 1 1 1 1 1 1 1 1 1 1 4 2008 35018824
5 4 4 4 4 4 3 1 3 3 1 1 2 1 1 1 1 0 0 0 1 2005 35059237
5 5 5 5 5 4 3 1 3 3 1 1 1 1 1 1 1 0 0 1 1 2005 35901124
2 4 3 2 2 1 3 1 3 1 2 2 2 .a 2 3 3 0 0 0 3 2007 35083860
5 5 5 5 5 4 1 1 3 1 1 1 1 1 1 1 1 0 0 1 2 2006 35924957
5 4 3 3 3 3 1 1 3 1 1 1 1 1 2 1 1 0 0 1 1 2005 35905446
5 5 5 5 5 2 3 1 1 1 1 1 1 1 1 3 3 0 0 0 3 2007 35283685
5 5 1 1 3 .a 3 1 3 1 1 2 1 2 2 3 2 0 1 0 4 2008 35083811
3 3 3 2 4 4 3 2 3 1 1 1 1 1 1 1 1 0 0 0 1 2005 35071122
3 3 3 3 3 4 3 2 3 1 3 2 3 2 2 2 2 0 0 0 2 2006 35071122
4 4 4 4 4 1 3 1 3 1 1 2 2 1 2 2 1 0 0 0 2 2006 35083860
end
label values q010 q001
label values q012 q001
label values q013 q001
label values q014 q001
label values q015 q001
label def q001 2 "Nível 2", modify
label def q001 3 "Nível 3", modify
label def q001 4 "Nível 4", modify
label def q001 5 "Nível 5 (concordo totalmente)", modify
label def q001 .a "[9]Respostas inválidas", modify
label def q001 1 "Nível 1 (discordo totalmente)", modify
label values e029 e029
label def e029 1 "Não tem", modify
label def e029 2 "Quase nunca é usada", modify
label def e029 3 "Usada eventualmente por alunos", modify
label def e029 4 "Há programação regular para uso", modify
label values e024 e023
label values e025 e023
label values e026 e023
label values e027 e023
label def e023 1 "Não Tem", modify
label def e023 2 "Tem, mas não é usado", modify
label def e023 3 "Tem e é usado", modify
label values q036 q036
label values q037 q036
label values q038 q036
label values q039 q036
label values q044 q036
label values q045 q036
label values q046 q036
label def q036 1 "Não impede", modify
label def q036 2 "Em alguma medida", modify
label def q036 3 "Impede muito", modify
label def q036 .a "[9]Respostas inválidas", modify
Listed 25 out of 550 observations
I set up a difference-in-differences (DiD) specification to investigate whether the intervention had any impact on teaching practices.
Code:
* For the foreach functions
global Yvar q010 q012 q013 q014 q015 // Teaching practices
global ControlVar e029 e025 e024 e027 e026 q036 q038 q037 q039 q044 q045 q046 // School and student Features
global cluster ", vce(cluster IDescola)"
gen Y = .

* Estimation
eststo clear
foreach outcome in $Yvar {
    replace Y = `outcome'
    qui oprobit Y DiD time treated i.grade i.year $ControlVar, cluster(IDescola)
    eststo
}
esttab, cells(b(star fmt(3)) p(par fmt(3))) numbers pr2(3) keep(DiD time treated) starl(* 0.1 ** 0.05 *** 0.01) replace

--------------------------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)             (5)
                        Y               Y               Y               Y               Y
                      b/p             b/p             b/p             b/p             b/p
--------------------------------------------------------------------------------------------
Y
DiD                -0.184           0.436*         -0.026           0.062          -0.158
                  (0.554)         (0.072)         (0.926)         (0.813)         (0.643)
time               -0.528           0.681           0.420           0.452           0.415
                  (0.380)         (0.208)         (0.557)         (0.552)         (0.537)
treated             0.250          -0.159           0.251           0.388           0.497*
                  (0.320)         (0.473)         (0.346)         (0.152)         (0.070)
--------------------------------------------------------------------------------------------
N                     430             433             433             436             438
pseudo R-sq         0.047           0.033           0.027           0.039           0.053
--------------------------------------------------------------------------------------------
.
end of do-file
Since I am running several regressions, I am concerned that multiple testing could be a problem. I would therefore like to adjust the p-values using a multiple-hypothesis testing procedure such as Bonferroni, FDR, or Romano-Wolf.
I have found plenty of theoretical material on multiple testing on the internet, but very little information about how to implement it in Stata.
What I would like to do is very simple:
After the esttab call above, I would like to rerun the regressions (the foreach loop) with the multiple-hypothesis testing procedure and save the "corrected" p-values. I would then run esttab again to print the coefficients and p-values from the first estimation together with the "corrected" p-values in a new row (similar to this example).
The idea is simple, but I could not implement it in Stata. Does anyone have an idea?
Any advice would be highly appreciated!
Many thanks in advance.
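To make the request concrete, the simpler corrections can be applied by hand to the five DiD p-values reported in the esttab table above. What follows is only a minimal sketch, not a tested solution for the full data. It re-enters the reported p-values instead of rerunning the regressions, and the variable names p, p_bonf, p_holm and p_bh are made up for illustration. Holm is included as a step-down refinement of Bonferroni, and "FDR" is assumed to mean the Benjamini-Hochberg adjustment. Romano-Wolf is not covered, since it needs resampling of the original regressions.

```stata
* Sketch: hand-computed multiple-testing adjustments for the five DiD
* p-values shown in the esttab output (one p-value per outcome).
clear
input str4 outcome double p
"q010" 0.554
"q012" 0.072
"q013" 0.926
"q014" 0.813
"q015" 0.643
end

local m = _N    // number of hypotheses in the family (here 5)

* Bonferroni: multiply every p-value by m and cap at 1
gen double p_bonf = min(1, p * `m')

* Holm step-down: sort ascending, scale the k-th smallest p-value by
* (m - k + 1), then take a running maximum so the adjusted values
* never decrease as the raw p-values increase
sort p
gen double p_holm = min(1, p * (`m' - _n + 1))
replace p_holm = max(p_holm, p_holm[_n-1]) if _n > 1

* Benjamini-Hochberg (assumed meaning of "FDR"): sort descending, scale
* the p-value of ascending rank k = m - _n + 1 by m/k, then take a
* running minimum starting from the largest p-value
gsort -p
gen double p_bh = min(1, p * `m' / (`m' - _n + 1))
replace p_bh = min(p_bh, p_bh[_n-1]) if _n > 1

sort outcome
list outcome p p_bonf p_holm p_bh, noobs
```

With these inputs, the smallest p-value (0.072 for q012) becomes 0.36 under all three adjustments. To show such values in a new row of the table, one possibility (an assumption, not verified on the full data) would be to attach them to each stored model with estadd scalar and then request them through the stats() option of esttab.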