  • Proper method for analyzing trends in GSS data

    I would like to use GSS data to examine the relationship between two variables over time. The GSS is a longitudinal survey consisting of different respondents each wave. Waves are 1 year apart, then 2 years apart beginning in the 1990s. The questions reflect policy preferences, and I'm curious whether these preferences are predictive of each other. It's not really a cause-and-effect relationship, so I don't see a need for lagged variables. However, the data are going to be correlated over time, so I think I'll need to account for this autocorrelation. One question is dichotomous and the other is ordinal (3 categories).

    I'm having trouble identifying the proper method for the project. My initial thought was a time-series model such as ARIMA/ARMAX. However, the data are individual level, so ARIMA would require me to collapse the data from each wave to the mean score on each of these questions. This would seem to throw away a great deal of useful information, including the variability in responses within each wave. Alternatively, I considered the xt suite of commands, but the data are not panel data (unless the panels are conceptualized as waves). I know I'm missing an obvious and easy solution.
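
    To make the information-loss concern concrete, here is a minimal sketch in Python (standard library only). The records, the variable names, and the helper `collapse_by_wave` are all hypothetical and are not GSS data. Both invented waves collapse to identical means, even though the two items are unrelated in the first wave and perfectly aligned in the second.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level records: (year, support_a, support_b).
# support_a is dichotomous (0/1); support_b is ordinal (1-3).
records = [
    (1990, 1, 1), (1990, 0, 3), (1990, 1, 3), (1990, 0, 1),
    (1991, 1, 3), (1991, 1, 3), (1991, 0, 1), (1991, 0, 1),
]

def collapse_by_wave(rows):
    """Reduce individual rows to one (mean_a, mean_b) pair per year."""
    by_year = defaultdict(list)
    for year, a, b in rows:
        by_year[year].append((a, b))
    return {
        year: (mean(a for a, _ in vals), mean(b for _, b in vals))
        for year, vals in sorted(by_year.items())
    }

collapsed = collapse_by_wave(records)
for year, (mean_a, mean_b) in collapsed.items():
    print(year, mean_a, mean_b)
```

    After collapsing, each wave contributes a single point per item, so anything about how the two items co-vary among respondents within a wave is no longer visible.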

  • #2
    Actually, the GSS now has both a repeated cross-sectional component (different respondents in each year) and a longitudinal, or panel, component. You need to decide whether to use one component, the other, or both. If you use the repeated cross-sectional component, there is no need for XT-type models, but if you use only the panel component you need to account for the nested data structure in some way. If you combine the two sample components, things get a bit trickier.

    There is an edited volume of papers, all or almost all based on the repeated cross-sectional component, that will give you a great deal of insight into how to analyze these data. See Peter V. Marsden (ed.), Social Trends in American Life: Findings from the General Social Survey. Princeton University Press, 2012.

    You should note that the GSS has a complex survey design, the details of which have changed over time. A GSS codebook appendix contains a description of the design. You will want to pay close attention to it and then use Stata's survey analysis features when you estimate models.
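
    As a rough illustration of why the design matters, the sketch below (Python, standard library only) computes a weighted share per year from invented rows. The weights, the values, and the helper `weighted_share_by_year` are hypothetical. It covers point estimates only: design-based standard errors also need the strata and PSU information, which is what Stata's `svyset` declaration and `svy:` prefix are for.

```python
from collections import defaultdict

# Hypothetical rows: (year, support (0/1), sampling weight).
rows = [
    (1990, 1, 2.0), (1990, 0, 1.0), (1990, 0, 1.0),
    (1992, 1, 1.0), (1992, 1, 3.0), (1992, 0, 4.0),
]

def weighted_share_by_year(data):
    """Weighted proportion supporting the policy in each year."""
    num = defaultdict(float)  # sum of weight * response
    den = defaultdict(float)  # sum of weights
    for year, y, w in data:
        num[year] += w * y
        den[year] += w
    return {year: num[year] / den[year] for year in sorted(den)}

shares = weighted_share_by_year(rows)
print(shares)
```

    In this invented example the unweighted shares are 1/3 and 2/3, while the weighted shares are both 0.5, so an apparent trend can come from the weighting alone.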
    Richard T. Campbell
    Emeritus Professor of Biostatistics and Sociology
    University of Illinois at Chicago


    • #3
      I've seen that they have three 3-wave panels and have considered doing something with these as well. However, the first of the panels begins in 2006, and I'd like to look at trends in these variables reaching back through the 1980s and 1990s.

      I'm thinking about two analyses: one of the panels and another of the trends. With the panels the obvious choice is a multilevel model; I'm very familiar with these. The problem is that I still don't know what command or suite of commands to use for the trend analysis. There has to be something better than ARIMA, which would require me to aggregate the data by wave.


      • #4
        I don't think it is an ARIMA-type problem. What you appear to be assuming is that individual-level errors in your equations will be correlated over time, presumably because macro-level variables are excluded from your models. You could, of course, introduce macro-level variables into your models, e.g., the unemployment rate or an equality measure, and then treat your cases as nested in year. The question then, I guess, is how to specify the covariance structure of the random effects, at which point, frankly, I am beyond my depth.
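
        The data layout this implies can be sketched as follows (Python, standard library only). This shows only the structure, not a fitted model. The unemployment figures, the field names, and the helpers `attach_macro` and `group_by_year` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical year-level covariate (values invented for illustration).
unemployment = {1990: 5.6, 1991: 6.8}

# Hypothetical individual rows: (year, support_a, support_b).
people = [(1990, 1, 3), (1990, 0, 1), (1991, 1, 2)]

def attach_macro(rows, macro):
    """Copy the year-level value onto every respondent from that year."""
    return [
        {"year": yr, "a": a, "b": b, "unemp": macro[yr]}
        for yr, a, b in rows
    ]

def group_by_year(rows):
    """Level-2 units are years; level-1 units are the respondents in them."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["year"]].append(row)
    return dict(groups)

nested = group_by_year(attach_macro(people, unemployment))
print({yr: len(members) for yr, members in nested.items()})
```

        Every respondent in a given year shares the same macro value, so the year is the grouping unit for any random effect.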
        Richard T. Campbell
        Emeritus Professor of Biostatistics and Sociology
        University of Illinois at Chicago


        • #5
          The hypothesis is essentially that opinions on two dissimilar policies are, in fact, connected. I wish to examine this over time to see whether trends in support for these policies are predictive of one another. Public opinion on any given issue is going to be relatively stable from year to year, hence my concern about autocorrelation. For example, public opinion on things like gay marriage, gun control, and affirmative action changes over time but is very stable from one year to the next. This is another reason why I am not overly excited about using the 3-wave panel data: most of the variation in these measures over that short time span is, I think, going to reflect cross-sectional variation and not temporal variation, and it's the latter that I'm interested in.

          I agree that ARIMA is not a great fit because it would require me to aggregate by year/wave and then model the mean over time. The immediate alternative would seem to be some sort of multilevel model in which cases are nested within year. However, to use the autoregressive error structure in the XT commands I have to have both a time variable and a panel variable, and in this conceptualization the time variable (year) would itself be the panel variable. So that clearly won't work. I feel there must be a class of models that would suit this sort of question. Is this something that could be done with growth curves, perhaps?
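
          One descriptive option that avoids collapsing to wave means is to estimate the association between the two items within each wave and then look at how that estimate moves across years. The sketch below (Python, standard library only) does this with Goodman-Kruskal gamma for a dichotomous by 3-category ordinal table. The records and function names are hypothetical, and the sketch ignores survey weights and sampling error.

```python
from collections import defaultdict

# Hypothetical records: (year, a in {0, 1}, b in {1, 2, 3}).
records = [
    (1990, 1, 1), (1990, 0, 3), (1990, 1, 3), (1990, 0, 1),
    (1991, 1, 3), (1991, 1, 3), (1991, 0, 1), (1991, 0, 1),
]

def gamma_2x3(table):
    """Goodman-Kruskal gamma for a 2 x 3 table (rows a=0/1, columns b=1..3)."""
    concordant = discordant = 0
    for j in range(3):
        for k in range(3):
            pairs = table[0][j] * table[1][k]
            if k > j:        # higher a goes with higher b
                concordant += pairs
            elif k < j:      # higher a goes with lower b
                discordant += pairs
    total = concordant + discordant
    return (concordant - discordant) / total if total else float("nan")

def gamma_by_year(rows):
    """Cross-tabulate a by b within each year, then compute gamma per year."""
    tables = defaultdict(lambda: [[0, 0, 0], [0, 0, 0]])
    for year, a, b in rows:
        tables[year][a][b - 1] += 1
    return {year: gamma_2x3(t) for year, t in sorted(tables.items())}

gammas = gamma_by_year(records)
print(gammas)
```

          With these invented rows the two waves have identical means on both items, yet gamma is 0 in the first wave and 1 in the second, which is the kind of change an aggregated series cannot show.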
