Year Fixed effect xtlogit

Ihab Man

Join Date: Jul 2020

Posts: 56
#1

03 Oct 2020, 17:15

Dear All
I have seen some posts about multiple fixed effects, but I could not find a solution for my case, and I hope you can help me with my questions.
I have a panel data set of 500 companies from 11 countries over a regular period (2000-2010), with 9 explanatory variables (not dummies); my dependent variable is a dummy (0/1). I did not use xtlogit, fe because many observations were dropped. I then tried xtlogit, re with vce(cluster CompanyID).
When I run the command (xtlogit Dep IndVars i.Year i.Country, re vce(cluster CompanyID)), Stata tells me:

note: 2001.Year != 0 predicts failure perfectly
note: 2010.Year omitted because of collinearity
What is happening here, and how can I fix it? When I do not include i.Year, no such notes appear.
Also, when I lag my independent variables by one year by writing L.Var1, do I need to sort the data first because it is panel data, or is the result the same with and without sorting? Thank you so much.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

03 Oct 2020, 17:56

The first note is telling you that in all observations of the estimation sample, the dependent variable is 1 in every observation where year == 2001. Logistic regression coefficients are estimated by maximum likelihood, and when a certain variable can exactly predict the dependent variable in every case, the maximum likelihood estimate of its coefficient is (plus or minus) infinity. In other words, the estimation could not possibly converge. So Stata (other statistical packages do the same thing) checks for this ahead of time and resolves the problem by removing that variable and all of the observations for which exact prediction takes place. So the first thing you need to do is determine whether this is a problem or whether to just let it be. You have to ask yourself why Dep always = 1 when Year = 2001. If this is supposed to happen, then there is no problem: the regression without year 2001 is the appropriate regression and you don't need to do anything. If it is not supposed to be the case that Dep is always 1 when Year = 2001, then it means that your data is wrong. In that case you will have to get corrected data.
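
To see which situation applies, a quick check along the following lines may help. It is only a sketch and assumes the variables are literally named Year and Dep, as in #1; adjust the names to your data.

```stata
* Cross-tabulate the outcome by year. A year whose row contains a zero
* cell is one where the year indicator predicts the outcome perfectly.
tabulate Year Dep, missing

* Count each outcome value within 2001; if one of these counts is zero,
* that explains the "predicts ... perfectly" note.
count if Year == 2001 & Dep == 1
count if Year == 2001 & Dep == 0
```

Note that the estimation sample can be smaller than the full data set if any regressor has missing values, so the counts in the regression may differ somewhat from this table.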

The second note, about 2010.Year being omitted because of collinearity, also may or may not actually be a problem. Normally, when you use indicator ("dummy") variables, whether explicitly or through factor-variable notation as here, one level of the variable, the baseline value, is omitted. Most of the time that is enough to eliminate collinearity in the data. But if one of your independent variables corresponds to, for example, Year > 2005, or Year < 2007, or some similar condition, then that imposes new collinearity with the year variables, and one more has to be omitted. So again, the question is whether you have a variable that is supposed to be like that. If you do, you can, again, accept the regression results as they are, or, if you prefer, you can rerun the regression omitting that variable (which will not change the results of anything other than the year variables). If you don't have any variable that is supposed to work that way, then, again, the problem is that your data are wrong and you need to get corrected data.

I'm also concerned about your reason for using -re- instead of -fe-. If your attempt at -fe- led to "many observations dropped," that suggests that you have a large number of companies for which your dependent variable is always 0 in all years, or always 1 in all years. The correct conclusion from this is that your data provide very little information about the relationship between your independent variables (or any other variables) and your dependent variable. -xtlogit, fe- is kind enough to, in effect, tell you as much, whereas -xtlogit, re- will go ahead and try to give you an answer anyway. But if your data really just aren't informative, you are wasting your own time trying to analyze this question in them. You either need to reconsider your question or get better data.
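
If you want to see how many companies have no within-company variation in the outcome, something like the following sketch could be used. The variable names Company and Dep are assumptions taken from this thread, and dep_min, dep_max, no_variation, and one_per_company are hypothetical helper names.

```stata
* Minimum and maximum of the outcome within each company
bysort Company: egen byte dep_min = min(Dep)
bysort Company: egen byte dep_max = max(Dep)

* A company whose minimum equals its maximum never changes outcome
generate byte no_variation = (dep_min == dep_max)

* Tag one row per company so the table counts companies, not rows
egen byte one_per_company = tag(Company)
tabulate no_variation if one_per_company
```

The companies with no_variation == 1 are the ones that -xtlogit, fe- cannot use.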

In light of the above three paragraphs, I am inclined to think it is highly likely that your data is either incorrect, or, if it is correct, that it nevertheless probably isn't the right data to answer your research question. I can't say that with greater certainty because I have not seen the data, and I don't know what the variables are. But all three of these things point in the same direction: the data are probably wrong or insufficient for the question. In any case, none of these three situations would be dealt with by changing your code. All of these are either OK as they are, or they are signs that your data is in error.

Concerning your question about lagged variables, the data do need to be sorted by time within panel to use the lag operators. However, -xtset- (or -tsset-) does this sorting automatically. So unless you have changed the sort order of your data between running -xtset- and running -xtlogit-, there is no need to add a sort command. If you have changed the sort order, then you should run -sort Company Year- before using the lag variables.
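
As a minimal sketch of that workflow, assuming the variables are named Company, Year, Dep, and Var1 (Var1_lag1 is a hypothetical name used only for inspection):

```stata
* Declare the panel structure; xtset also sorts by Company and Year.
* Company must be numeric here; a string identifier could be encoded
* first, e.g. with: egen CompanyID = group(Company)
xtset Company Year

* The lag operator can be used directly in the model ...
xtlogit Dep L.Var1 i.Year, re vce(cluster Company)

* ... or materialized as a new variable so the lags can be inspected
generate Var1_lag1 = L.Var1
list Company Year Var1 Var1_lag1 in 1/12, sepby(Company)
```

The first year of each company has no previous observation, so its lag is missing and that observation drops out of the estimation sample.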
Ihab Man

Join Date: Jul 2020

Posts: 56
#3

03 Oct 2020, 18:40

Dear Clyde Schechter,
Thank you so much for your reply; I appreciate it very much.
Well, the thing is that in my case Dep is always 0, not 1, when Year = 2001.
I have 500 companies, and only 70 of them have a financial instrument. For example, company 1 has instruments only in 2005, so in Excel, before importing the file into Stata, I coded it like this:
Company 1 equals 1 in year 2005 and 0 in the remaining years, and so on; some companies have instruments in 2005 and 2007, and so forth. In total I have 120 instruments (I mean 121 values equal to 1), and 500 - 70 = 430 companies equal 0 in all years.
When I imported the data into Stata, the dependent variable from Excel was named Y.
I created a dummy like this:
generate X = 0
replace X = 1 if Y == 1
Is what I did here correct, given your suggestion that the data may not be correct?
I used re rather than fe because roughly 428 companies were dropped and only about 71 remained, something like that. That is why I was not sure what had happened. What is your comment now that I have explained this? Thank you so much.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#4

03 Oct 2020, 20:40

Well, the thing is that in my case Dep is always 0, not 1, when Year = 2001.

OK. But it's the same story. Year 2001 should be excluded from your analysis. You can either drop those observations yourself, or just re-run as before and, as Stata told you in the note, it will get rid of them for you.

I have 500 companies, and only 70 of them have a financial instrument. For example, company 1 has instruments only in 2005, so in Excel, before importing the file into Stata, I coded it like this:
Company 1 equals 1 in year 2005 and 0 in the remaining years, and so on; some companies have instruments in 2005 and 2007, and so forth. In total I have 120 instruments (I mean 121 values equal to 1), and 500 - 70 = 430 companies equal 0 in all

OK, but you don't explain whether having a financial instrument is your dependent variable or an independent variable. I'm guessing it's the dependent. If there are 430 companies where the dependent variable is 0 in all years, then those companies would tell you nothing about the relationship to any other variable on a within-company basis. The key thing to remember is that a fixed-effects regression is a within-company regression. Differences between companies play no role in fixed-effects regression. That is why those 430 drop from -xtlogit, fe-. They have no relevant information. Now, a random-effects regression is not a within-company estimator; differences between companies do matter, so these companies can still be informative about how your dependent variable relates to the independent variables across companies. So you need to think about your research goal: is it a within-company relationship you seek to study, or an across-companies relationship? If it's the former, then you must stick to the fixed-effects estimator and accept the fact that most of your data is not useful. If it is an across-companies relationship you are trying to study, then you should have used -re- in the first place, and you still can.

When I imported the data into Stata, the dependent variable from Excel was named Y.
I created a dummy like this:
generate X = 0
replace X = 1 if Y == 1

Well, it seems like this code does nothing other than make X a copy of Y (or perhaps a copy of Y except that X is 0 when Y is neither 0 nor 1, e.g. a missing value). I'm not sure what the point is.
Ihab Man

Join Date: Jul 2020

Posts: 56
#5

04 Oct 2020, 02:09

Dear Clyde Schechter,
Thank you so much again.
Yes, it is the dependent variable.
What I want to find out is which of the independent variables lead a company to decide to hold or buy these instruments; in other words, I want to know the determinants of these instruments. Unfortunately, previous research on this question is rare, and the papers do not explain whether they used fixed or random effects. They just say logit with standard errors clustered at the company level, with and without year and country fixed effects. Do you have any idea now, based on this? I am confused now, and I want to thank you again for your advice to think this through again.
Regarding X and Y, you said you are not sure what the point is. I thought that when I import the variable Y from Excel, Stata needs to identify it as a dummy through its own command rather than through Excel. Is that correct? Thank you so much.
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#6

04 Oct 2020, 15:01

Well, "logit with clustering standard error at company level" means not using -xtlogit- at all. That means -logit DepVar IndVars, vce(cluster Company)-. (And one could include i.Year in there if desired.)

The choice of the proper model depends on the specifics of the research question and your understanding of the real-world data-generating process. You haven't said enough about that for anyone to give you concrete advice here. Even with fuller information about that, the input you seek needs to come from somebody in finance or economics, as the question is as much substantive as it is statistical. So I'm not the right person to help out here.

As for the thing with X and Y, once you import data from Excel (or any other source) to Stata, Stata will have created a variable that looks like the source. You can easily enough see that by running -tab X Y, miss- and you will see what the relationship between them is.
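
For instance, a sketch of that comparison, assuming both X and Y are still in the data set under those names:

```stata
* Cross-tabulate the two variables, showing missing values as well
tabulate X Y, missing

* Count observations where they disagree; with the generate/replace code
* from #3 this can only happen where Y is missing or is not 0 or 1
count if X != Y
```

If the count is zero, X is an exact copy of Y and you can use Y directly.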