Dear Stata-Community,
I am struggling to set up a logistic regression model in Stata. I am working with a highly unbalanced dataset of firm-level observations, and the dependent variable is a binary firm-level variable. Because nation and year fixed effects may both matter, I would be grateful for your advice on how to specify the model.
Here's a brief summary of my dataset:
NATION | YEAR | DEP. VAR. (0 or 1) | VAL. DEP. VAR. | IND. VAR. | VAL. IND. VAR. |
Nation A | 2009 | Variable Dep | Value 1 | Variable Ind | Value 1 |
Nation A | 2011 | Variable Dep | Value 2 | Variable Ind | Value 2 |
Nation B | 2009 | Variable Dep | Value 3 | Variable Ind | Value 3 |
Nation B | 2010 | Variable Dep | Value 4 | Variable Ind | Value 4 |
Nation B | 2011 | Variable Dep | Value 5 | Variable Ind | Value 5 |
Nation B | 2012 | Variable Dep | Value 6 | Variable Ind | Value 6 |
Nation B | 2013 | Variable Dep | Value 7 | Variable Ind | Value 7 |
Nation C | 2012 | Variable Dep | Value 8 | Variable Ind | Value 8 |
Nation D | 2011 | Variable Dep | Value 9 | Variable Ind | Value 9 |
Nation D | 2012 | Variable Dep | Value 10 | Variable Ind | Value 10 |
Nation D | 2013 | Variable Dep | Value 11 | Variable Ind | Value 11 |
Despite conducting multiple tests, I am struggling to identify the most suitable model setup. I have explored several options, including:
- logit DEPVAR INDVAR, vce(cluster CommonIdentifier_NATION-YEAR)
- logit DEPVAR INDVAR i.NATIONDUMMY i.YEARDUMMY
- logit DEPVAR INDVAR i.NATIONDUMMY, vce(cluster YEAR)
- logit DEPVAR INDVAR i.YEARDUMMY, vce(cluster NATION)
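Declaring the panel structure with nation as the panel variable and year as the time variable failed:
xtset NATION YEAR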
repeated time values within panel
r(451);
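To check what triggers this error, here is a minimal sketch on made-up toy data. The values and the helper names n_obs and NATION_ID are purely illustrative and are not my real data. It assumes, as in my dataset, that several firm-level rows share the same NATION and YEAR:

```stata
* Toy data (made up for illustration): several firm-level rows per nation-year
clear
input str1 NATION int YEAR byte DEPVAR float INDVAR
"A" 2009 1 0.5
"A" 2009 0 1.2
"A" 2011 1 0.7
"B" 2009 0 0.3
"B" 2010 1 0.9
"B" 2010 1 1.1
end

* Count how many rows share each NATION-YEAR pair
bysort NATION YEAR: generate long n_obs = _N

* Summarise how many rows are surplus copies of a NATION-YEAR pair
duplicates report NATION YEAR

* xtset needs a numeric panel variable, so encode the string first
encode NATION, generate(NATION_ID)

* Capture the return code instead of stopping on the error
capture xtset NATION_ID YEAR
display "xtset return code: " _rc
```

Any row with n_obs greater than 1 belongs to a NATION-YEAR pair that occurs more than once, which is what xtset objects to when YEAR is declared as the time variable.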
Even after declaring only the panel variable with "xtset NATION", I continued to face issues:
xtlogit DEPVAR INDVAR, fe
note: multiple positive outcomes within groups encountered.
1,991 (group size) take 1,635 (# positives) combinations results in numeric overflow;
computations cannot proceed
r(1400);
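To get a feeling for the size of the number mentioned in this message, here is a small sketch that computes the logarithm of "1,991 take 1,635" instead of the number itself. The scalar names are my own, and treating the largest double as the relevant overflow limit is my assumption:

```stata
* Natural log of C(1991, 1635), via log-factorials so nothing overflows
scalar ln_comb = lnfactorial(1991) - lnfactorial(1635) - lnfactorial(356)

* Same quantity on a base-10 scale
scalar log10_comb = ln_comb / ln(10)
display "log10 of the number of combinations: " log10_comb

* Largest value a Stata double can hold, for comparison
display "log10 of the largest double: " log10(maxdouble())

* The direct calculation returns missing because the result is too large
display comb(1991, 1635)
```

If the first figure exceeds the second, the count cannot be stored as a double, which would be consistent with the overflow reported above.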
I would greatly appreciate concrete guidance on which model setup is appropriate for this data structure, so that I can specify the regression correctly and draw valid conclusions from the data.
Thank you very much for your support.
Kind regards,
Michael