Hello all,
I have two datasets
1. Containing data on invoices paid by different companies (180,000 observations)
2. Time series financial information per company (17,000 observations)
Firstly these data sets require to be merged by company id, and secondly by closest date from the invoice date to the date of the financial information. The financial information is generally reported monthly, at the end of the month, but this is not always the case. Invoice dates are distributed randomly over the month.
The dates corresponding to the financial information are not an exact match to the invoice dates and so cannot be directly matched. As you can tell by the number of observations, some observations of financial information will be used for multiple invoices. I would like the data to be merged in a way that the invoice date is matched to the closest corresponding date in the financial dataset, either before or after the invoice date. I would also like the option to set limits on this matching process. For example, if there is no financial information available for either two months before or after the invoice date, I would prefer the data not be merged.
Any advice how to proceed would be much appreciated.
Let me know if you require any other information.
I have two datasets
1. Containing data on invoices paid by different companies (180,000 observations)
2. Time series financial information per company (17,000 observations)
Firstly these data sets require to be merged by company id, and secondly by closest date from the invoice date to the date of the financial information. The financial information is generally reported monthly, at the end of the month, but this is not always the case. Invoice dates are distributed randomly over the month.
The dates corresponding to the financial information are not an exact match to the invoice dates and so cannot be directly matched. As you can tell by the number of observations, some observations of financial information will be used for multiple invoices. I would like the data to be merged in a way that the invoice date is matched to the closest corresponding date in the financial dataset, either before or after the invoice date. I would also like the option to set limits on this matching process. For example, if there is no financial information available for either two months before or after the invoice date, I would prefer the data not be merged.
Any advice how to proceed would be much appreciated.
Let me know if you require any other information.
Comment