Independent sample t-test

Omar Shaher

Join Date: Feb 2019
Posts: 164

05 Feb 2021, 18:42

Dear researchers,
I have unbalanced panel data for a set of firms. In 2002 a new standard was issued, and firms then started to adopt it, but not simultaneously: one group of firms adopted in 2005, another in 2003, and so on (i.e., adoption occurred in different years). The following table is an example of my dataset.

Firms	Year	Event	Leverage
A	2000	0	1.23
A	2001	0	0.45
A	2002	0	0.435
A	2003	0	0.675
A	2004	0	0.896
A	2005	1	0.6043
A	2006	1	0.56
A	2007	1	0.5157
A	2008	1	0.4714
B	2000	0	0.4271
B	2001	0	0.3828
B	2002	0	0.3385
B	2003	1	0.2942
B	2004	1	0.2499
B	2005	1	0.2056
B	2006	1	0.1613

As far as I know, if I want to see whether there is a significant difference in leverage before and after the adoption of the standard, the number of observations before the adoption should equal the number of observations after the adoption. Am I correct? If not, can I use the independent-samples t-test, or what should I do?
The code that I have used is:

Code:

ttest Leverage, by(Event)

Please advise.

Many thanks in advance.

Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

05 Feb 2021, 19:16

As far as I know, if I want to see whether there is a significant difference in leverage before and after the adoption of the standard, the number of observations before the adoption should equal the number of observations after the adoption. Am I correct?

No, that's not correct. I have seen other people ask this same question--there is no such requirement and I don't know where people have gotten this idea from. Unlike many common statistical fallacies, this is not, as far as I know, in any textbooks, nor is it part of our folklore.

If not, can I use the independent-samples t-test, or what should I do?

No, the t-test is not adequate here. The reason has nothing to do with number of observations. It's because you have multiple observations within firms, and they are not independent of each other. Moreover, the firms themselves may well differ in leverage even early when none had adopted the new standard, so just lumping them together will produce misleading results. And the firms may also differ on other factors that are associated with leverage as well.

What you actually have here is a generalized difference-in-differences setup. The code you need is:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 firms int year byte event float leverage
"A " 2000 0  1.23
"A " 2001 0   .45
"A " 2002 0  .435
"A " 2003 0  .675
"A " 2004 0  .896
"A " 2005 1 .6043
"A " 2006 1   .56
"A " 2007 1 .5157
"A " 2008 1 .4714
"B " 2000 0 .4271
"B " 2001 0 .3828
"B " 2002 0 .3385
"B " 2003 1 .2942
"B " 2004 1 .2499
"B " 2005 1 .2056
"B " 2006 1 .1613
end

encode firms, gen(firm)
xtset firm year
xtreg leverage i.event i.year, fe

In this model, the coefficient of 1.event is the generalized difference-in-differences estimate of the causal effect of the adoption of the new standard on firms' leverage.
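As a hedged variation on the model above: the observations within a firm are not independent, so one common choice is to cluster the standard errors on firm. The clustering option is an assumption added here for illustration and is not part of the code as originally posted. It is only sensible in the full dataset, since the two-firm example has far too few clusters.

```stata
* Sketch only: same two-way fixed-effects model, with standard errors
* clustered on firm. Clustering is an added assumption, not the posted code.
xtset firm year
xtreg leverage i.event i.year, fe vce(cluster firm)

* Re-display the coefficient of interest with its confidence interval
lincom 1.event
```

-lincom 1.event- simply repeats the 1.event row of the regression table, which is convenient when the table contains many year indicators.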

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
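For illustration, a typical -dataex- call might look like the following. The variable names and the observation range are assumptions taken from the table in the first post; adjust them to the actual dataset.

```stata
* Variable names and the range 1/16 are assumptions for illustration
dataex Firms Year Event Leverage in 1/16
```

The output can be pasted directly into a forum post between code delimiters.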
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#3

06 Feb 2021, 03:43

Omar:
you possibly got the wrong idea (as Clyde wisely highlighted) about equal numbers of before/after observations from so-called mirror studies, where the sample unit (e.g., a patient) acts as its own control before and after the administration of a given (healthcare) procedure. However, mirror studies are often plagued with missing values (e.g., drop-outs); hence, the number of before/after observations is rarely the same.

Kind regards,
Carlo
(Stata 19.0)
Omar Shaher

Join Date: Feb 2019

Posts: 164
#4

06 Feb 2021, 17:33

Dear Clyde and Carlo,

Thank you very much for your answers. Greatly appreciated.

The thing is if I applied the code:

xtreg leverage i.event i.year, fe

This will show me as Clyde mentioned

the coefficient of 1.event is the generalized difference-in-differences estimate of the causal effect of the adoption of the new standard on firms' leverage

However, if I use the above code, I will not be able to see the mean before the adoption and the mean after the adoption, or whether the difference between them is significant.
Is there any way to do that?

If the independent t-test is totally incorrect in my case, can I instead take the average of the observations before the adoption and after the adoption for each firm, and then use the independent t-test on those averages? If this is possible, is there any code that will give me the mean of the observations before and after the adoption for each firm?

I would be very grateful if you could help.

Many thanks in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#5

06 Feb 2021, 18:23

However, if I use the above code, I will not be able to see the mean before the adoption and the mean after the adoption, or whether the difference between them is significant.
Is there any way to do that?

After the -xtreg- command run

Code:

margins event

and you will get the means before and after. As for comparing those means, go back to the -xtreg, fe- output itself and examine the row in the table for 1.event. The coefficient is the difference between the means, and the rest of the row gives you a standard error, a t-statistic, a p-value and a 95% confidence interval. What more can you ask for?
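Put together, the sequence could look like the sketch below. The -tabstat- and -collapse- steps are additions for illustration only: they show unadjusted descriptive means (overall and firm by firm) and are not a substitute for the regression-based comparison. Variable names follow the -dataex- example earlier in the thread.

```stata
* Model-based (adjusted) means before and after adoption
xtreg leverage i.event i.year, fe
margins event

* Unadjusted descriptive means, for comparison only
tabstat leverage, by(event) statistics(mean sd n)

* Firm-by-firm unadjusted means before and after adoption
preserve
collapse (mean) mean_leverage = leverage, by(firm event)
list, sepby(firm)
restore
```

The -preserve- and -restore- pair keeps the original panel intact, because -collapse- replaces the data in memory.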
Omar Shaher

Join Date: Feb 2019

Posts: 164
#6

06 Feb 2021, 19:41

Dear Prof. Clyde,
I can't thank you enough, you have changed my whole understanding of the topic, and after running the codes, it seems everything related to my research question becomes different and makes more sense.

One last question, please.
If I want to graph the trend of leverage for all firms across years, I mean to see how it was before the adoption and how it became after the adoption, is there any code that could help me with this? Or do you think the code below is the correct one to use:

Code:

margins event, at(Leverage = (list of interesting values of Leverage))
marginsplot

I did that, but Stata told me that Leverage was not found in the list of covariates.

Many thanks in advance.

Last edited by Omar Shaher; 06 Feb 2021, 20:18.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#7

07 Feb 2021, 05:39

Omar:
have you already double-checked that -Leverage- was typed correctly (instead of -leverage-)?

Kind regards,
Carlo
(Stata 19.0)
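A probable explanation that does not depend on spelling: the at() option of -margins- can only set values of covariates in the fitted model, and Leverage is the outcome of that model, not a covariate. The sketch below assumes the capitalized variable names used later in the thread, and the year values are purely illustrative.

```stata
* Assumes the model has been fitted with these (capitalized) variable names
xtreg Leverage i.Event i.Year, fe

* at() must name a covariate of the model, for example Year.
* The values 2003 and 2005 are illustrative and must exist in the data.
margins Event, at(Year = (2003 2005))
marginsplot
```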
Omar Shaher

Join Date: Feb 2019

Posts: 164
#8

07 Feb 2021, 07:52

Dear Carlo,

Thanks for the reply. Much appreciated.

Yes, I did that but it doesn't work!!!
Omar Shaher

Join Date: Feb 2019

Posts: 164
#9

07 Feb 2021, 10:39

Dear Prof. Clyde,
I am very interested in the generalized DID with the two-way fixed-effects model. I hope you can answer two main questions, and I will be very grateful to you. Your answers will help me to clearly identify the novelty of my research. I promise these are going to be the last two questions in this post.
Let us assume that I have unbalanced panel data for a set of firms covering the years 2000-2019. In 2010 a new standard was issued, and firms then started to adopt it, but not simultaneously: one group of firms adopted in 2010, another in 2011, another in 2012, and so on. The main variables of interest are growth and leverage, where:
Event coded 1 in the years of adoption, and zero otherwise.
Thus, my main aims are as below:
Examining the impact of Event on leverage, and I can get that from the following code:

Code:

generate id = _n
encode Companyname, gen(COMPANY)
xtset COMPANY Year, yearly
xtreg Leverage i.Event i.Year, fe

And if I want to see the average of Leverage before and after the adoption of the standards, then after running -xtreg- I will use the following command:

Code:

margins Event

So, my first question, is there any way to graph the trend of Leverage for all firms across years to show how Leverage was before the adoption and how it becomes after the adoption?

My second aim is to find the relationship between Leverage as an independent variable, with the growth as a dependent variable, before and after the adoption of the standards, also to see the causal impact of the standards on the relationship between growth and Leverage.
To do that, I have used the following code:

Code:

xtreg growth i.Event##c.Leverage i.Year, fe vce(cluster COMPANY)

As far as I know:

A. The coefficient of Leverage will show me the relationship between Leverage and growth before the adoption of the standards.

B. The coefficient of the Event#Leverage interaction (1.Event#c.Leverage) will show me the incremental effect.

C. The sum of the coefficients in A and B will show me the relationship between Leverage and growth after the adoption of the standards.

Or, after running -xtreg-, I can run the following code to get the relationship between Leverage and growth before and after the adoption of the standards:

Code:

margins Event, dydx(Leverage)

So, my question is this: since I can get three different things from the generalized DID with the two-way fixed-effects model (i.e., the relationship between leverage and growth before the adoption of the standards, the relationship between leverage and growth after the adoption of the standards, and the causal effect of the standards on the relationship between growth and leverage), do you think it is okay to develop three hypotheses, one for each of the above, and to judge all three hypotheses from one model?

Million thanks in advance
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#10

07 Feb 2021, 13:47

So, my first question, is there any way to graph the trend of Leverage for all firms across years to show how Leverage was before the adoption and how it becomes after the adoption?

Code:

margins Year#Event
marginsplot

So, my question is this: since I can get three different things from the generalized DID with the two-way fixed-effects model (i.e., the relationship between leverage and growth before the adoption of the standards, the relationship between leverage and growth after the adoption of the standards, and the causal effect of the standards on the relationship between growth and leverage), do you think it is okay to develop three hypotheses, one for each of the above, and to judge all three hypotheses from one model?

Well, since the causal effect is estimated as the difference between the (adjusted) leverage after and before, these are not three independent hypotheses. Given any two of them, the answer to the third is automatic, so although it can look like you have three hypotheses, really there will only be two. But the most important consideration here is what hypotheses are important? What hypothesis(es) would people care about? There is no ideal number of hypotheses.
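The dependence among the three quantities can be seen directly with -lincom-. This is a minimal sketch that assumes the interaction model and variable names from post #9; the third quantity is just the sum of the first two.

```stata
* Sketch, assuming the model and variable names from post #9
xtreg growth i.Event##c.Leverage i.Year, fe vce(cluster COMPANY)

* (1) Slope of growth on Leverage before adoption (Event = 0)
lincom Leverage

* (2) Change in that slope associated with adoption
lincom 1.Event#c.Leverage

* (3) Slope after adoption = (1) + (2), so it is fixed once (1) and (2) are known
lincom Leverage + 1.Event#c.Leverage

* The two slopes in (1) and (3), obtained in a single step
margins Event, dydx(Leverage)
```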

I won't go on a long rant here, but this is not the kind of situation in which I would test any hypotheses at all. My approach here is instead to present the leverage before and after adoption and the causal effect estimate, all with their confidence intervals. To interpret the results, I compare them not to a null value of zero, but to some threshold of real-world meaningfulness. As you are dealing here in finance/economics, which I know very little about, I can't advise you more specifically than that--I have no sense about what would be a real-world meaningful value of the effect of the policy on leverage. But my interest would be whether the confidence interval for the effect lies fully to one side of that value, or whether it straddles that value. If the latter, then we can say that our study is inconclusive as to whether the policy was meaningfully effective and that more or better data or a sharper model might be needed. If the meaningfulness threshold is outside the confidence interval then we can make a claim that the policy was or was not meaningfully effective (depending on which side of the threshold the confidence interval lies).
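As a rough sketch of that comparison, the confidence limits can be computed from the stored results and checked against a threshold. The threshold value below is a placeholder with no substantive meaning, the scalar names are hypothetical, and the variable names follow post #9.

```stata
* Hypothetical threshold of real-world meaningfulness (placeholder value)
scalar threshold = 0.05

xtreg Leverage i.Event i.Year, fe

* 95% confidence limits for the coefficient of 1.Event
scalar tcrit = invttail(e(df_r), 0.025)
scalar lb = _b[1.Event] - tcrit * _se[1.Event]
scalar ub = _b[1.Event] + tcrit * _se[1.Event]

* Compare the interval with the threshold rather than with zero
if lb > threshold {
    display "Interval lies entirely above the threshold"
}
else if ub < threshold {
    display "Interval lies entirely below the threshold"
}
else {
    display "Interval straddles the threshold: inconclusive"
}
```

Choosing the threshold is a substantive judgment that has to come from the finance literature, not from the data.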
Omar Shaher

Join Date: Feb 2019

Posts: 164
#11

07 Feb 2021, 17:43

Dear Prof. Clyde,

I can't thank you enough.
You have provided me with a lot of invaluable information, which will change the way I think about my research in a very wonderful way. I have learned a lot from you and will not forget it for the rest of my life.

margins Year#Event marginsplot

I should use the above code after the following code:

Code:

xtreg Leverage i.Event i.Year, fe

Overall, it will be like this:

Code:

xtreg Leverage i.Event i.Year, fe
margins Year#Event marginsplot

This in turn will give me the trend of leverage for all firms across years before and after the adoption of the standards, and if I want to examine the trend of another variable, I can replace Leverage in the above code with that variable.
Am I correct?
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#12

07 Feb 2021, 17:55

Almost. -marginsplot- is a separate command that belongs on its own line.
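Written out with one command per line, and assuming the variable names used in this thread, the sequence would be:

```stata
xtreg Leverage i.Event i.Year, fe
margins Year#Event
marginsplot
```

To look at a different outcome, replace Leverage in the -xtreg- line and rerun all three commands, since -margins- and -marginsplot- always work from the most recent results.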
Omar Shaher

Join Date: Feb 2019

Posts: 164
#13

07 Feb 2021, 18:44

Yes, sure, it should be in a separate command.

I am very grateful to you Prof. Clyde,
I will never forget your help. Greatly appreciated.

My deepest respect to you.
Omar Shaher

Join Date: Feb 2019

Posts: 164
#14

08 Feb 2021, 05:43

Dear Prof. Clyde,
I apologize again for posting again here and for asking again, but it was my curiosity that drove me to show you my result.
Could you please devote some of your valuable time to see the below graph because I have a question, please.

After running the following codes:

Code:

xtreg Leverage i.Event i.Year, fe
margins Year#Event
marginsplot

Q1: Can I comment like this:
The trend of leverage before and after the adoption of the standards was decreasing across years for all firms, however, the trend of leverage after the adoption is higher, and here when we say the trend of leverage that means the level of leverage for all firms across years after the adoption of the standards is higher, but the overall trend is decreasing.
Q2: how come there is a blue line (Event=0) in both years 2018 and 2019 if all firms have adopted the standards in 2018.

Attached Files
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#15

08 Feb 2021, 11:14

The trend of leverage before and after the adoption of the standards was decreasing across years for all firms, however, the trend of leverage after the adoption is higher, and here when we say the trend of leverage that means the level of leverage for all firms across years after the adoption of the standards is higher, but the overall trend is decreasing.

Correct. However, the difference between the two curves is very small compared to the changes over time.

how come there is a blue line (Event=0) in both years 2018 and 2019 if all firms have adopted the standards in 2018.

-marginsplot- graphs the output of -margins-. -margins- calculates the predicted outcomes according to the regression command. So the point on the blue curve in 2019 is your regression's prediction of what leverage would have been in 2019 if the policy had not already been adopted by everyone.
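One simple way to see which points on the graph are backed by actual observations is to cross-tabulate the data. This is only a descriptive check, assuming the variable names used in this thread.

```stata
* Count observations by year and adoption status.
* Cells with zero observations (e.g. Event = 0 in the last years) are
* combinations for which -margins- reports a model-based prediction only.
tabulate Year Event
```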
