Dear Fellow Researchers
Please excuse my unprofessional wording in advance - it's my first post.
I am currently working on my Bachelor thesis and intend to publish the work in a B/C journal. Naturally, a journal article sets higher standards for empirical work than a normal Bachelor thesis. I am therefore hoping to find answers here to questions that nobody on my campus, neither professor nor PhD student, has managed to answer!
Let me first describe my data set. I am empirically investigating the effect of diversification on the fund performance of private equity firms. Diversification is measured by a continuous fractional variable (an adapted Herfindahl-Hirschman Index). The dependent variable, fund performance, is also measured as a fraction between 0 and 1. I use several control variables.
Following contemporary research, I used a GLM. After a Modified Park Test, I am now working with the Poisson model (poisson depvar indepvar, vce(robust)).
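For concreteness, the model looks roughly like the sketch below. The variable names are placeholders rather than my actual names: depvar is fund performance, diversification is the index, and ctrl1 and ctrl2 stand in for the controls.

```stata
* Placeholder variable names; ctrl1 and ctrl2 are hypothetical controls
* GLM form with Poisson family and log link, robust standard errors
glm depvar diversification ctrl1 ctrl2, family(poisson) link(log) vce(robust)

* The same model through the poisson command, as quoted above
poisson depvar diversification ctrl1 ctrl2, vce(robust)
```

As far as I understand, both commands fit the same mean function, so the coefficients should agree.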
QUESTION 1: Goodness of fit
The Modified Park Test (MPT) recommends the Poisson distribution. The kernel density graph looks similar (for lambda=1). Linktest is okay. Poisson regression assumes that the mean and the variance of the dependent variable are equal. This, however, is not the case: the mean is around 3 times higher than the variance. Therefore, I tested with "estat gof"; the result is prob > chi2 = 1, which makes me very suspicious. How come the goodness of fit is that high? Next, I tested for over-dispersion with "nbreg". The results seem okay: alpha was not significantly different from zero, which indicates that over-dispersion is not a concern. That is all I have tested so far. This cannot be it, right?
Do you have any hints on how I can properly assess the quality of my dataset and regression results?
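In case it helps, the checks described above correspond roughly to the following sequence. This is a sketch with placeholder variable names (ctrl1 and ctrl2 are hypothetical controls), not a copy of my do-file.

```stata
* Fit the Poisson model, then run the goodness-of-fit test right away
poisson depvar diversification ctrl1 ctrl2
estat gof

* Compare the raw mean and variance of the outcome
summarize depvar, detail

* Specification check on the squared linear prediction
poisson depvar diversification ctrl1 ctrl2
linktest

* Negative binomial fit; the output includes a test of alpha = 0
nbreg depvar diversification ctrl1 ctrl2
```

I refit the Poisson model before linktest only to make sure each postestimation command uses the intended estimates.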
QUESTION 2: Robustness Test
How can I properly assess robustness? What is common practice in research?
QUESTION 3: Working with Predictions
Working with GLM makes predictions a little hard to interpret, even after "irr". I am looking for a way to display the predictions of the depvar against 2 indepvars in 3D. Is it possible to save the predictions of the Poisson regression for these 3 variables and plot them with "surface"?
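What I have in mind is roughly the following sketch. The names depvar_hat, ctrl1 and ctrl2 are placeholders, and the use of twoway contour as a built-in stand-in for a 3D surface is my own assumption; I am not sure it is the right approach.

```stata
* Fit the model and save the predicted mean, exp(xb), as a new variable
poisson depvar diversification ctrl1 ctrl2, vce(robust)
predict depvar_hat, n

* Plot the predictions (z) against two regressors (y, x) as a contour map
twoway contour depvar_hat diversification ctrl1
```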
QUESTION 4: Interaction Terms
Now things get juicy. I have read several articles about the difficulties of using interaction terms in GLM. I found these articles rather contradictory and not very practical. Do you have any hints on how to use interaction terms? I built an interaction term between a dummy variable and the diversification variable. The interaction term is significant and is sort of the crown jewel of my work. Can I simply build an interaction term as interactionvar=dummyvar*diversification?
Thank you very much for your answers. If you need any more info, please let me know and I will answer right away!
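To make the question concrete, here is a sketch of the hand-built term next to the factor-variable form, which as far as I understand specifies the same model. ctrl1 and ctrl2 are hypothetical controls; the other names are the placeholders used above.

```stata
* Hand-built product of the dummy and the continuous diversification index
generate interactionvar = dummyvar * diversification
poisson depvar diversification dummyvar interactionvar ctrl1 ctrl2, vce(robust)

* Same specification in factor-variable notation:
* ## adds both main effects and their interaction
poisson depvar i.dummyvar##c.diversification ctrl1 ctrl2, vce(robust)
```

I do not know whether one form is preferable for interpreting the interaction, which is part of what I am asking.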
Best regards from Switzerland!