Dear all,
I have analyzed news articles with the text analysis tool LIWC-22. LIWC counts how often words from various categories (such as emotional expressions and cognitive processes) occur and reports each category as a percentage of the total word count (e.g. a value of 7.5 in the category "moral" means that 7.5% of all words in the text are associated with moral discussions). The values of any variable can therefore range from 0 to 100. My dataset does not have a panel structure.
My research aim is to find out how categories like AI affect categories like moral or negative emotions.
My problem is that some variables, such as moral, have too many zero values, so standard transformations like log or sqrt don't work for me. See an overview below:
Code:
```
. summarize moral

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       moral |     22,339    .2066476     .402975          0       7.09

. tabulate moral if moral <=0.1

      Moral |
 Legitimacy |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |     11,854       93.61       93.61
        .02 |          8        0.06       93.67
        .03 |         28        0.22       93.90
        .04 |         49        0.39       94.28
        .05 |         67        0.53       94.81
        .06 |         90        0.71       95.52
        .07 |        143        1.13       96.65
        .08 |        185        1.46       98.11
        .09 |        239        1.89      100.00
------------+-----------------------------------
      Total |     12,663      100.00
```
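Note that the tabulate output is restricted to values of at most 0.1, so the 93.61% is the share of zeros within that subset, not overall. From the counts shown, 11,854 of 22,339 observations are zero, which is roughly 53%. A minimal sketch of how the overall share could be checked directly, assuming the variable is named moral as above (the scalar name n_zero is only illustrative):

```stata
* Count the exact zeros and store the result
count if moral == 0
scalar n_zero = r(N)

* Count all non-missing observations and display the percentage of zeros
count if !missing(moral)
display "Share of zeros: " %5.2f 100 * n_zero / r(N) "%"
```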
**Questions:**
1. Which transformation methods would be best suited for my dependent variable "moral" to perform an OLS regression?
2. Are there alternative modeling techniques that you would recommend for such a distribution?
Thank you very much for your help!