Question regarding multiple observations

Keith Richardson

Join Date: Mar 2020

Posts: 8
#1

04 Mar 2020, 13:29

I will try to explain my issue as well as I possibly can. However, I am by no means an expert, so I was hoping some of you experts out there could help me resolve it.

I have a survey dataset with approx. 5.000 observations/respondents. Each respondent has been assigned to his/her municipality, and there are 91 municipalities in total. I have added a new variable, population density, to my dataset. This variable shows the population density of each municipality in 2013. However, it only contains 91 observations (one for each municipality). It uses the same municipality ID as my survey data, but with only one observation per municipality.

Now my question is: how do I make my new variable, population density, match my other variables in terms of observations? I have a unique identifier for each municipality but multiple observations from each. So I basically want the population density to be filled in for every observation from each municipality.

I have one variable which contains the unique ID/number of the municipality for each respondent. There are multiple respondents from the same municipality.

For example:

Municipality ID | Gender | Municipality ID | Population density (for each municipality)
101             | M      | 101             | 1.776
101             | M      | 104             | 564
101             | F      |                 |
104             | M      |                 |
104             | F      |                 |
….              | …..    | …..             |
(approx. 5.000 obs. in total for the first two columns; only 91 obs. in total, one for each municipality, for the last two)

Is there any smart way to do this other than having to do it manually? For example, by repeating the population density for municipality 101 for each observation from that municipality.

Your help is much appreciated.
Mike Lacy

Join Date: Apr 2014

Posts: 2404
#2

04 Mar 2020, 13:37

What you describe is termed a merge in the statistical package world. See -help merge-. The variable(s) that link observations from two files are known as a key. From the point of view of your respondent data set, what you have is an m:1 merge, that is, many persons match to one municipality in your other file.
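
A minimal sketch of what such an m:1 merge might look like, assuming the 91 municipality densities are saved in their own file. The file names (respondents.dta, density.dta) and variable names (muni_id, pop_density) are hypothetical placeholders, not taken from your data; the key variable needs to have the same name in both files.

```stata
* All file and variable names below are hypothetical placeholders.
* First confirm the municipality file has exactly one row per municipality.
use "density.dta", clear
isid muni_id

* Load the respondent file and attach the density to every respondent.
use "respondents.dta", clear
merge m:1 muni_id using "density.dta", keepusing(pop_density)

* _merge == 3 marks respondents that found a matching municipality.
tabulate _merge
```

If the tabulation shows anything other than matched observations, it is worth checking which municipality IDs failed to match before dropping _merge.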
Keith Richardson

Join Date: Mar 2020

Posts: 8
#3

04 Mar 2020, 14:21

Thanks, I was able to figure it out by following your instructions.