I am trying to join a set of weather variables, identified by longitude (longnum), latitude (latnum) and date (bdate), to individual-level data by longitude (longnum), latitude (latnum) and birthdate (bdate). There are 61158 files, each with a unique combination of longitude and latitude, and so 61158 joins. I broke the process down into batches. Some of the files joined successfully. However, not all of the resulting joined datasets contain observations, which means that the join is not succeeding for every file.
The data type for longnum and latnum in heat_*.dta files is float in format %9.0g and bdate is int in format %td.
I am using Stata 18.0.
Upon checking manually, I find that the combinations of latnum, longnum and bdate exist in both datasets, but they are not getting joined. No observations have longnum, latnum, or bdate missing in any of the datasets.
My code is:
Code:
clear all
set more off
global cdd "/Users/tahre/cdd"
global cdd_dhs "/Users/tahre/cdd_dhs"
forvalues i = 35001/36000 {
    disp `i'
    use "$cdd/heat_`i'.dta", clear
    joinby bdate longnum latnum using "/Users/tahre/Data/allvars_SA_clean.dta"
    save "$cdd_dhs/merged_`i'.dta", replace
}
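To make the symptom reproducible without my files, here is a minimal sketch with made-up data. The two small datasets are hypothetical stand-ins for one heat_*.dta file and for allvars_SA_clean.dta, and the coordinate and date values are invented. It uses joinby's unmatched(both) option, which keeps unmatched observations and creates _merge, so the result can be tabulated instead of silently coming back empty. I have not run this against the real data.

```stata
clear
* hypothetical stand-in for allvars_SA_clean.dta (double coordinates, float date)
input double longnum double latnum float bdate
28.35 -25.75 14000
end
tempfile individuals
save `individuals'

clear
* hypothetical stand-in for one heat_*.dta file (float coordinates, int date)
input float longnum float latnum int bdate
28.35 -25.75 14000
end

* keep unmatched observations from both sides so non-matches are visible
joinby bdate longnum latnum using `individuals', unmatched(both)
tabulate _merge
```

In the real loop, the same option could be added to the joinby line and _merge tabulated for each file to see which grid points fail to match.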
In the allvars_SA_clean.dta file: longnum and latnum are type double in %10.0g format, while bdate is type float in %td.
I tried to change the data type and format in allvars_SA_clean.dta to match the heat_*.dta with the following code. However, this did not solve the problem.
Code:
use allvars_SA_clean.dta, clear
*format longitude and latitude to match gridpoints
format %9.0g latnum
format %9.0g longnum
recast int bdate
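One thing I am unsure about: as far as I understand, format only changes how a value is displayed, not how it is stored, so longnum and latnum are still double after the code above. A minimal sketch of why that might matter, using invented variable names (x_double, x_float) and an invented coordinate value:

```stata
clear
set obs 1
generate double x_double = 28.35
generate float  x_float  = 28.35

* the float holds a rounded version of 28.35, so the two stored values differ
display x_double[1] == x_float[1]

* rounding the double to float precision makes the two values comparable
display float(x_double[1]) == x_float[1]
```

If this is the cause, something like replace longnum = float(longnum) (and the same for latnum) in allvars_SA_clean.dta might be needed before the join, but that is an assumption on my part rather than something I have confirmed.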
Is there a way to make sure all the joins are successful?