Why has the bico variable now disappeared from the master data set? This makes the data sets even less compatible for merging.
Here's the problem you are facing. Your master data set's observations are indexed by an exporter, importer, (bico?), year, and month. In the using data set, the observations are indexed by an exporter, importer, bico (for sure), year, and product category (name/number). To put these together, without aggregating up the data, you would need some rules that decide which month in the master data gets matched with which product in the using data. This seems highly implausible, and I suspect impossible to do even if it seemed to make sense.
Perhaps the solution is to aggregate up the data in the using data set to get one observation per exporter-importer-bico-year combination by averaging in some way the tariffs on the different product categories. (I've already checked: the tariffs differ across product categories even when the exporter, importer, bico, and year are all the same.) So you would need to decide on how to weight the different product categories (perhaps by volume traded--which you would have to get from yet another data set as it doesn't appear here?) in some way to make this work.
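For what it's worth, here is a minimal sketch of how that aggregation might look. I am assuming the tariff variable in the using data set is called tariff (a hypothetical name, since I don't know what yours is called) and, as a first pass, I take a simple unweighted mean across product categories. As elsewhere, I also assume bico is still in the master data set.

```stata
* Sketch only: -tariff- is a hypothetical variable name; substitute your own.
use using_data, clear

* One observation per exporter-importer-bico-year:
* unweighted mean of the tariff across product categories.
collapse (mean) tariff, by(exporter importer bico year)

tempfile tariffs_by_year
save `tariffs_by_year'

* Each year's averaged tariff is attached to every month of that year.
use master_data, clear
merge m:1 exporter importer bico year using `tariffs_by_year'
```

If you do obtain trade volumes and bring them into the using data set first, say as a variable named trade_volume (again hypothetical), the -collapse- line could become -collapse (mean) tariff [aweight = trade_volume], by(exporter importer bico year)- to get a volume-weighted average instead.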
Another possibility is to match every product in the using data set to every month in the master data set, having matching values for exporter, importer, (bico?) and year. That will make the combined data set very large, with the observations in it identified by unique combinations of exporter, importer, year, month, and product group. Is that what you want? If so, that code is:
Code:
use master_data, clear
joinby exporter importer bico year using using_data
Note: In this code I assume that bico really is still in your master data set. If it's not, just remove it from the -joinby- command.
Again, I don't know if either of these approaches will get you what you want. But I don't see any other possibilities.