  • Panel Analysis with Index Dependent Variable

    Hi there. I have a question that I hope will be simple for someone to answer.

    I am working with panel data (2 waves only) that looks at 10 methods of how people engage in their communities (voting, community meetings, etc.). Each of these 10 methods is a binary variable that indicates whether they engaged in that way or not (0=no; 1=yes). I have created a composite index variable that summarizes the total amount of a person's engagement; for example, if the respondent selected "yes" for 3 methods of engagement, their index score is 3. The minimum index score is 0, the maximum index score is 10, and all scores are non-negative integers. I want this index score to be my dependent variable in a regression model with a handful of predictive independent variables.

    What regression method should I use for this panel data, given that the dependent variable is the summary index described above?

  • #2
    Without knowing your specific research questions it is hard to give a specific answer. That said, the three most common treatments of a variable like this (not necessarily in this order) would be:

    1. As if it were a continuous variable (e.g. using -regress-)
    2. As a binomial variable (e.g. -glm- with family(binomial 10))
    3. As an ordinal response variable (e.g. using -ologit-)

    And several other treatments are possible, including using -gsem- with a latent variable indicated by the 10 methods instead of a summative index.

    Which of these is most suitable for your purposes depends on what your purposes are. And to an extent it also depends on the distribution of the variable in your data.
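
    A minimal sketch of the three treatments listed above, for illustration only. The variable names (engage_index, opinion1, opinion2, id) are hypothetical and do not come from the original post. Clustering on the respondent identifier is an assumption made here because each person appears in two waves; it is not something the thread prescribes.

    ```stata
    * Hypothetical names: engage_index (0-10 index), opinion1 opinion2 (predictors),
    * id (respondent identifier, two rows per person, one per wave).

    * 1. Treat the index as if it were continuous
    regress engage_index opinion1 opinion2, vce(cluster id)

    * 2. Treat the index as the number of "yes" answers out of 10 binomial trials
    glm engage_index opinion1 opinion2, family(binomial 10) link(logit) vce(cluster id)

    * 3. Treat the index as an ordinal response
    ologit engage_index opinion1 opinion2, vce(cluster id)
    ```

    Before choosing among them, a quick -tabulate engage_index- shows how the scores are distributed, which is the consideration raised above.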

    Last edited by Clyde Schechter; 21 Dec 2015, 11:41.



    • #3
      Thank you so much for the response. Some follow-up clarification and questions:

      The specific research question would be: "What is the effect of different race-related opinions (IVs) on the index of community engagement (DV mentioned above)?"

      In terms of distribution, I expect a large share of the DV index scores to be 0, so I am questioning the use of linear regression. I also hadn't thought of treating the index as ordinal, since it has an absolute 0 (tied to 0 engagement in the activities that comprise the total index score) and the difference between any 2 values has a clear meaning.

      Would Poisson regression make sense?



      • #4
        Given that there is an absolute ceiling of 10 because there were only 10 opportunities, the Poisson is a bit of a misspecification--using -glm- with a binomial 10 family could be closer to the data generating process. But if there are very few responses near the upper end of that range, then the misspecification with Poisson will be small. Poisson regression is a viable candidate in that case. It is a very flexible model and can be used in many circumstances even when, as in your case, the outcome measure isn't, strictly speaking, a count variable. (It can even be used to great advantage with truly continuous non-negative variables that have skew distributions.) You might find, however, that you encounter a lot of over-dispersion (particularly if the zero spike is really, really big) and then you might need to consider alternatives such as negative binomial, or a zero-inflated model to deal with that.
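
        The candidates mentioned here could be sketched roughly as follows. The variable names (engage_index, opinion1, opinion2, id) are hypothetical, the cluster-robust standard errors are an assumption to account for two waves per respondent, and the variance-to-mean ratio is only a crude first look at over-dispersion, since it ignores the predictors.

        ```stata
        * Hypothetical names: engage_index (0-10 index), opinion1 opinion2 (predictors),
        * id (respondent identifier).

        * Poisson regression
        poisson engage_index opinion1 opinion2, vce(cluster id)

        * Crude over-dispersion check: a ratio well above 1 suggests
        * the variance exceeds what a Poisson model implies
        summarize engage_index
        display "variance/mean = " r(Var)/r(mean)

        * Alternatives if over-dispersion looks substantial
        nbreg engage_index opinion1 opinion2, vce(cluster id)
        zip engage_index opinion1 opinion2, inflate(opinion1 opinion2) vce(cluster id)
        ```

        The choice of variables inside inflate() is itself a modeling decision; reusing the same predictors there is just a placeholder.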

        With the responses heaping on zero, treating this as a continuous variable is probably a bad idea.

        I agree that treating this as an ordinal variable does discard some information, but I wouldn't rule it out entirely. The zero point is, indeed, absolute. But I'm not so sure that differences between response levels have a clear meaning. That is true if you feel confident that each activity contributing to the index is of equal salience to the underlying construct of community engagement. Only you or other experts in your discipline can decide that--but that is definitely a question you should ponder if you haven't already. If you conclude that there is reason to think that some of the indexed variables are more salient than others, then an ordinal approach might actually be better. Or you might want to construct a weighted index and use that for a Poisson or similar model.
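
        For the weighted-index idea, a sketch under purely illustrative assumptions: the item names (voted, meeting, petition), the weights, and the predictor names are all hypothetical, and only three of the ten items are shown. Any real weights would have to come from subject-matter judgment.

        ```stata
        * Hypothetical 0/1 items and made-up weights, shown for three items only
        generate w_index = 2*voted + 1.5*meeting + 1*petition

        * Poisson accepts a non-negative, non-integer outcome;
        * cluster-robust standard errors are an assumption here (two waves per id)
        poisson w_index opinion1 opinion2, vce(cluster id)
        ```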



        • #5
          This is extremely helpful thought partnership. Thank you! I will think on this and get back to you if I have any other questions. Very much appreciated.
