Creating new observation which is the average of 2 existing observations

Benthe Vrijsen

Join Date: Apr 2022

Posts: 11
#1

06 Apr 2022, 05:11

Hi there,
I am trying to perform an m:m merge on 2 datasets at the district-year level. The total number of variables is 111.
In the Master dataset, there is a district called Narowal.
In the using dataset, this district has been split up into Narowal 1 and Narowal 2.
This obviously gives me trouble in performing the merge.
Would there be a possibility to create a new observation in the using dataset called Narowal, which takes the average values of all variables of Narowal 1 and Narowal 2?
Or is there any other way to handle this issue?
Thanks!
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

06 Apr 2022, 12:36

Code:

clonevar district2 = district
replace district2 = "Narowal" if inlist(district, "Narowal 1", "Narowal 2")
collapse (mean) variables_to_be_averaged, by(district2 year)

This code is untested because no example data was provided. It relies on assumptions about the nature of your data that are educated guesses. But if those educated guesses are wrong, we will have both wasted our time. So, in the future, when asking for help with code, use the -dataex- command to show example data, and eliminate the guessing.

If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
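To make the collapse approach above concrete, here is a minimal sketch on made-up data. The variables x1 and x2, the district Lahore, and all values are invented for illustration; the sketch also assumes that every numeric variable other than year should be averaged, which may not hold in the real data. With 111 variables, building the list with -ds- avoids typing it out.

```stata
* Toy data, invented purely for illustration
clear
input str12 district int year double(x1 x2)
"Narowal 1" 2000 10 1
"Narowal 2" 2000 20 3
"Lahore"    2000  5 7
end

* Recode the two split districts to a common name
clonevar district2 = district
replace district2 = "Narowal" if inlist(district, "Narowal 1", "Narowal 2")

* Collect all numeric variables, then drop the by-variable year from the list
* (assumption: everything numeric except year is to be averaged)
ds, has(type numeric)
local tomean `r(varlist)'
local byvars year
local tomean : list tomean - byvars

* One observation per district2-year, holding the simple mean of each variable
collapse (mean) `tomean', by(district2 year)
list, clean
```

In this toy example the Narowal row ends up with x1 = 15 and x2 = 2, and Lahore is unchanged. Run the block as a whole from a do-file, since the local macros disappear between separately executed chunks. Note that this is an unweighted mean; whether that is the right way to combine Narowal 1 and Narowal 2 depends on what the variables measure.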

All of that said, it is a near certainty that YOU SHOULD NOT BE PERFORMING AN M:M MERGE. Why StataCorp has not eliminated this trap for the unwary escapes me--they are well aware that it produces nonsense results in almost all situations and they say as much in the user manuals. I have been using Stata almost daily since 1994 and in my entire experience I have only once encountered a situation where what -merge m:m- does would produce a usable result. It just makes a data salad.

The fact that you think you should be using an m:m merge implies that either your data sets are wrong or you just don't understand them, or you do not understand what -merge- does. If you want help figuring out what you should be doing, with your data, post back in a new thread, use -dataex- to show example data from both data sets, and explain carefully how you want to pair up the observations in them.
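As a rough sketch of the alternative, assuming that district and year together are meant to identify observations uniquely in both files once Narowal has been collapsed: verify that with -isid-, then use a 1:1 merge. The data, the variables x1 and y, and the tempfile name below are all hypothetical.

```stata
* Stand-in for the using data after collapsing Narowal 1 and Narowal 2
clear
input str12 district int year double x1
"Lahore"  2000  5
"Narowal" 2000 15
end
isid district year          // exits with an error if the key is not unique
tempfile using_collapsed
save `using_collapsed'

* Stand-in for the master data
clear
input str12 district int year double y
"Lahore"  2000 100
"Narowal" 2000 200
end
isid district year

* With a unique key on both sides, a 1:1 merge is the appropriate form
merge 1:1 district year using `using_collapsed'
tabulate _merge
```

If -isid- fails in either real data set, that is the signal to investigate why the key is duplicated before merging, rather than to reach for m:m.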
Benthe Vrijsen

Join Date: Apr 2022

Posts: 11
#3

07 Apr 2022, 05:20

Hi Clyde,
thank you for your elaborate reply and advice!
I created a new thread about the merging issue here.