Hi all -
I'm running into an issue where a reviewer would like us to check whether our data contain influential observations that may be driving our results. They suggested using dfbeta; however, as other Stata users have noted before, this approach does not work with panel data. We are running a Hausman-Taylor regression (xthtaylor) because our data come from social media and we are trying to account for the endogenous effects of the algorithm that displays content to users.
Are there any workarounds that would let us assess the influence of outliers on our regression results? I've tried running dfbeta after a simple OLS regression, saving the values, creating a dummy variable that flags observations where the dfbeta values are greater than 2/sqrt(n), and then including this dummy variable in our Hausman-Taylor regressions, but I don't believe that's the best way to go about this.
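For concreteness, the workaround described above might look roughly like the sketch below. The variable names (y, x1, x2, influential) are placeholders rather than the actual variables in our data, and the sketch flags an observation when the absolute value of any dfbeta exceeds the cutoff, which is the usual convention.

```stata
* Sketch only: y, x1, x2 and influential are hypothetical names.
* Pooled OLS, ignoring the panel structure.
regress y x1 x2

* Store the conventional cutoff 2/sqrt(n) from the estimation sample.
local cutoff = 2/sqrt(e(N))

* In recent Stata versions, dfbeta after regress creates one variable
* per regressor, named _dfbeta_1, _dfbeta_2, ...
dfbeta

* Flag observations where any dfbeta exceeds the cutoff in absolute value.
* The !missing() check matters because Stata treats missing as very large.
generate byte influential = 0
foreach v of varlist _dfbeta_* {
    replace influential = 1 if abs(`v') > `cutoff' & !missing(`v')
}

* The influential dummy is then added to the regressor list of xthtaylor.
```

Because the dfbeta values come from pooled OLS rather than from the Hausman-Taylor estimator, the flag only approximates influence on the model we actually report.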
Any suggestions of what we could try instead?
Thanks in advance for your help!