  • Help with Combining Datasets

    Hello,

    I am pretty new to Stata and have never merged datasets before. I am trying to combine multiple months (24 months, from January 2020 to December 2021) of the Labour Force Survey from Statistics Canada so that in the combined dataset the observations for occupations listed in variable NOC_40 can be differentiated by month. Given the way the Labour Force Survey is sampled, however, the number of respondents changes from one cycle of the survey to the next. I'm not sure if that affects things.

    Basically, what I am trying to do is run a multinomial logistic regression with months (or groups of months) as the independent variables and a four-category occupational grouping, recoded from NOC_40, as the dependent variable. The goal is to see how the probability of belonging to a particular occupational group changes over time.

    I have tried to follow a few different tutorials for merging, but the observations appear to be the same before and after the merge, so clearly I am doing something wrong.

    Thanks!

  • #2
    Hi Mister, you'd do well to post an example of your data using the dataex command as the FAQ says. That way we can just look at your data without fussing over websites and downloading everything ourselves. It's also good to add what you tried in terms of code rather than "it didn't work," so we can give specific pointers.
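
    For what it's worth, a minimal sketch of what that could look like, assuming one of the monthly files is already open in Stata. NOC_40 is the only variable named in the question; the 20-observation range is an arbitrary choice for illustration.

    ```stata
    * dataex ships with recent Stata releases; on older ones it can be
    * installed from SSC with:  ssc install dataex
    * Show the first 20 observations of NOC_40 as a copy-and-paste data example
    dataex NOC_40 in 1/20
    ```

    The output can be pasted straight into a post, and anyone reading it can run it to recreate that slice of the data.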

    • #3
      This sort of longitudinal data is most often combined using the append command to stack the data together end-to-end in a "long" layout, with each observation corresponding to a single respondent in a single month, as it now does, rather than trying to merge the data from each respondent into a "wide" layout with one observation per respondent.
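
      A rough sketch of that stacking approach follows. It rests on assumptions that cannot be checked without seeing the data: the file names (lfs_2020m1.dta and so on) and the variable survey_month are hypothetical, and the files are assumed to sit in the current working directory.

      ```stata
      * Sketch only: file names and survey_month are hypothetical
      clear
      tempfile combined
      save `combined', emptyok          // start from an empty dataset

      forvalues y = 2020/2021 {
          forvalues m = 1/12 {
              use "lfs_`y'm`m'.dta", clear
              * Tag every observation with the month its file covers
              generate int survey_month = ym(`y', `m')
              append using `combined'
              save `combined', replace
          }
      }

      use `combined', clear
      format survey_month %tm
      tabulate survey_month             // expect 24 months with differing counts
      ```

      Because append stacks observations rather than matching them, it does not matter that the number of respondents differs from month to month.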

      The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data. You should try to achieve what you need with the data organized in a long layout, and seek the help of Statalist in doing so. It is much easier, for example, to compare the second observation to the first, the third to the second, and so on, than it is to compare the second variable to the first, the third to the second, etc.
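
      As an illustration of how directly a long layout feeds into estimation, the model described in #1 might be sketched as below. This assumes the stacked data contain a month identifier and a four-category recode of NOC_40; the names survey_month and occ_group are hypothetical, as is the coding of the categories.

      ```stata
      * Sketch only: occ_group and survey_month are hypothetical variable names
      * Multinomial logit of occupational group on month indicators
      mlogit occ_group i.survey_month

      * Predicted probability, by month, of the category coded 1
      * (assumes occ_group takes the value 1 for one of its categories)
      margins survey_month, predict(outcome(1))
      ```

      Repeating the margins call with a different outcome() value would give the corresponding probabilities for the other groups.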

      You may find it helpful to read the "Introduction to xt commands" section of the Stata Longitudinal-Data/Panel-Data Reference Manual PDF, which is included in your Stata installation and accessible from Stata's Help menu. This should give you some idea of how Stata organizes this sort of data.

      With that said, not much more can be said without a better understanding of your data. Your question really isn't clear without more detail. The Statalist FAQ provides advice on effectively posing your questions, posting data, and sharing Stata output.
