  • Pooled cross sectional data vs cross sectional data

    Hi all,

    I am writing an academic paper and am looking to investigate the impact of one's ethnicity on income using the UK Labour Force Survey.
    I am aware of the difference between pooled cross-sectional data and panel data, and panel data analysis cannot be done with this data set.
    So, I am deciding between pooling the cross sections and running separate cross-sectional regressions for each of several years.
    Can anyone explain the advantages and disadvantages of each, please? Also, what robustness checks could I do for simple cross-sectional data?

    Thank you

  • #2
    Jorvan:
    welcome to this forum.
    1) the main difference is that running separate regressions per year means having one intercept per regression (which is questionable). Moreover, with a single pooled OLS you can investigate the effect of time (if any), entered either as a categorical or as a continuous variable (see -fvvarlist- notation).
    2) the postestimation routine doesn't differ between the two approaches: -estat ovtest-; -estat hettest-; and, if really interested, -estat vif-.
    If your data come from a survey, see also -svy-.
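
    A minimal sketch of the two points above, assuming an already pooled dataset. The variable names lnincome (log income), ethnicity, year, age, female and pwt (a sampling weight) are hypothetical and do not come from the Labour Force Survey files:

    ```stata
    * Pooled OLS with year entered as a categorical regressor
    * (factor-variable notation: i. = categorical, c. = continuous)
    regress lnincome i.ethnicity i.year c.age i.female

    * Joint test of the year indicators
    testparm i.year

    * Postestimation checks named above
    * Ramsey RESET test for omitted nonlinear terms
    estat ovtest
    * Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
    estat hettest
    * Variance inflation factors
    estat vif

    * Alternative: year entered as a continuous (linear trend) regressor
    regress lnincome i.ethnicity c.year c.age i.female

    * If the data come from a survey: declare the design, then use the svy prefix
    * (only a weight is declared here; a real design may also need PSUs and strata)
    svyset [pweight = pwt]
    svy: regress lnincome i.ethnicity i.year c.age i.female
    ```

    The same -estat- commands can be run after each single-year -regress- as well, which is what "doesn't differ between the two approaches" amounts to in practice.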
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Originally posted by Carlo Lazzaro
      Jorvan:
      welcome to this forum.
      1) the main difference is that running separate regressions per year means having one intercept per regression (which is questionable). Moreover, with a single pooled OLS you can investigate the effect of time (if any), entered either as a categorical or as a continuous variable (see -fvvarlist- notation).
      2) the postestimation routine doesn't differ between the two approaches: -estat ovtest-; -estat hettest-; and, if really interested, -estat vif-.
      If your data come from a survey, see also -svy-.

      I would love to run a pooled cross-sectional analysis instead of separate cross-sectional regressions, but I think it may be too complex with the data sets available: I don't know how to merge the different years without losing the ability to tell which year each observation comes from, if that makes sense.
      If you know how to do this, please let me know, as it would make my investigation more interesting.
      Thank you
      Last edited by Jorvan Virk; 17 Apr 2020, 19:35.

      • #4
        Jorvan:
        don't you have a time identifier in each file?
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          Originally posted by Carlo Lazzaro
          Jorvan:
          don't you have a time identifier in each file?

          Yes, I do, but I am not sure how to merge the datasets so that the years can still be identified in a regression.
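
          One way to combine yearly files so that every observation stays identifiable by year is -append-, which stacks observations (unlike -merge-, which matches them). This is only a sketch: the file names lfs_2015.dta to lfs_2019.dta, the years themselves and the variable name year are hypothetical.

          ```stata
          * Start from the first yearly file and tag its observations
          use "lfs_2015.dta", clear
          generate int year = 2015

          * Stack the remaining years underneath, tagging each new batch
          foreach y of numlist 2016/2019 {
              append using "lfs_`y'.dta"
              replace year = `y' if missing(year)
          }

          * Check the number of observations per year, then save the pooled file
          tabulate year
          save "lfs_pooled.dta", replace
          ```

          If each file already carries a time identifier under the same variable name, the -generate- and -replace- steps are unnecessary and -append- alone is enough. Variables only stack correctly when they have the same name in every file, so any that are named differently across years would need renaming first.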

          • #6
            Originally posted by Carlo Lazzaro
            Jorvan:
            don't you have a time identifier in each file?

            I do. However, I am writing up and want to explain why I have not carried out a POOLED cross-sectional regression (not panel) and have instead opted for 5 separate yearly regressions. So, what are the disadvantages of pooled cross-sectional regressions?

            • #7
              Jorvan:
              the main difference is that in a pooled OLS you get one intercept only, whereas you have 5 intercepts if you run 5 one-year regressions.
              Whether it makes sense to run 5 separate linear regressions instead of a pooled OLS depends, I guess, on the methodological habits of each research field.
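
              The contrast can be laid out as a sketch, again assuming a pooled dataset with the hypothetical variables lnincome, ethnicity, age and year:

              ```stata
              * Five separate regressions: every coefficient, including the
              * intercept, is free to differ by year
              bysort year: regress lnincome i.ethnicity c.age

              * Pooled OLS: one intercept and one set of slopes for all years
              regress lnincome i.ethnicity c.age

              * Pooled OLS with year indicators: year-specific intercepts,
              * common slopes
              regress lnincome i.ethnicity c.age i.year

              * Fully interacted pooled model: year-specific intercepts and slopes
              regress lnincome i.year##i.ethnicity i.year##c.age

              * Joint test of whether the ethnicity coefficients differ across years
              testparm i.year#i.ethnicity
              ```

              The fully interacted model reproduces the point estimates of the separate yearly regressions, although its standard errors rest on a single residual variance for all years. The -testparm- line shows what pooling offers that separate regressions do not: a formal test of whether the ethnicity coefficients change over time.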
              Kind regards,
              Carlo
              (Stata 19.0)
