  • LIWC-22 Output with Zero Overdispersion

    Dear all,

    I have analyzed news articles with the text analysis tool LIWC-22. LIWC counts how often words from various categories, such as emotional expressions and cognitive processes, occur in a text, and reports each category as a percentage of the total word count (e.g., a value of 7.5 in the category "moral" means that 7.5% of all words in the text are associated with moral discussions). The value of any variable can therefore range from 0 to 100. My dataset does not have a panel structure.
    My research aim is to find out how categories such as AI affect categories such as moral or negative emotions.

    My problem is that some variables, such as moral, have too many zero values (11,854 of the 22,339 observations, about 53%, are exactly zero), so standard transformations such as log or sqrt do not work for me. See the overview below:

    Code:
    . summarize moral
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
           moral |     22,339    .2066476     .402975          0       7.09
    
    
    . tabulate moral if moral <=0.1
    
          Moral |
     Legitimacy |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |     11,854       93.61       93.61
            .02 |          8        0.06       93.67
            .03 |         28        0.22       93.90
            .04 |         49        0.39       94.28
            .05 |         67        0.53       94.81
            .06 |         90        0.71       95.52
            .07 |        143        1.13       96.65
            .08 |        185        1.46       98.11
            .09 |        239        1.89      100.00
    ------------+-----------------------------------
          Total |     12,663      100.00

    **Questions:**
    1. Which transformation methods would be best suited for my dependent variable "moral" to perform an OLS regression?
    2. Are there alternative modeling techniques that you would recommend for such a distribution?

    Thank you very much for your help!

  • #2
    Poisson estimation?


    • #3
      Hello George,

      I was thinking about zero-inflated Poisson or zero-inflated negative binomial models, but they are not suitable for non-integer values. Or am I wrong here?


      • #4
        That is not a problem. Some type of Poisson regression with robust standard errors would do.
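
        A minimal sketch of what that could look like in Stata, assuming a single predictor named ai (a hypothetical variable name; only moral appears in the output above). poisson accepts a non-integer outcome and prints a note that the interpretation of a noncount dependent variable is your responsibility. The vce(robust) option requests robust standard errors, so inference does not rely on the outcome actually being Poisson distributed:

        ```stata
        * Assumption: ai is the LIWC percentage for the AI category (hypothetical name).
        * moral is the outcome from the summarize output, in percent of total words.

        * Poisson regression with robust (sandwich) standard errors.
        * The model is E[moral | ai] = exp(b0 + b1*ai), so zeros in moral are allowed.
        poisson moral ai, vce(robust)

        * Average marginal effect of ai on the expected value of moral,
        * expressed in the same units as moral (percentage points of word count).
        margins, dydx(ai)
        ```

        The coefficient on ai is on the log scale: a one-unit increase in ai multiplies the expected value of moral by exp(b1). The margins line is optional and only translates that into the original units.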


        • #5
          OK, just to make sure I understand: do you mean a zero-inflated Poisson model?
          Last edited by Eli Sine; 03 Jul 2024, 11:23.


          • #6
            Read this:
            HTML Code:
            https://statisticalhorizons.com/zero-inflated-models/
