Hi, I am stuck writing a loop that both appends and merges in the same loop.
I have the following files, which are each 250 MB in size and have identical variable names and types.
1) I have named them "1.dta", "2.dta", "3.dta", ..., "784.dta".
2) I have another file, "PEad_ret_test2.dta".
File "1.dta" has a single identifier variable named "localid", which has a value of 1; file "2.dta" has the same identifier variable "localid" with a value of 2, and so on. Furthermore, the other variable I will be merging on is "date".
File "PEad_ret_test2.dta" has all the identifiers, i.e., localid from 1 to 784.
3) I want to merge all the files in (1) with (2) and get a final file, "company_N&S.dta".
What I want to do is write a loop in which, on each pass, Stata picks up a file such as "1.dta", drops certain data, merges it with "PEad_ret_test2.dta", and saves the result to "company_N&S.dta". It should then move on to "2.dta", merge it with "PEad_ret_test2.dta", and so on.
I also want to keep only a limited number of variables in the final file, in order to keep its size small.
To test my loop, I have tried the following code for just "1.dta" and "2.dta".
The problem with this code is that each pass of the loop overwrites the file saved by the previous pass.
Code:
clear all
cd "D:\test"
foreach num of numlist 1/2 {
    use `num'.dta // this step with use the "1.dta" and then "2.dta" in this case.
    keep if datatype_1 ==1
    drop date // dropping this because this is the incorrect date
    clonevar date = date_stata // this is the correct date
    merge m:m localid date using "D:\test\PEad_ret_test2.dta", force
    keep ticker dayofweek folder_number assetcode localid date_stata date
    keep if _merge==3
    save company_N&S.dta, replace
}
For example, by the end of this loop, I only have "2.dta" merged with "PEad_ret_test2.dta". The contents of "1.dta" are no longer there after the merge.
Is there any way to solve this issue?
Thanks
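For reference, one possible way around the overwrite is sketched below. It is untested and rests on some assumptions: each pass appends the results accumulated so far before saving, so nothing is lost, and the final file is written once after the loop. The tempfile name `combined` and the local `first` are hypothetical names introduced only for this illustration. The sketch also swaps `merge m:m` for `merge m:1`, which assumes that localid and date together uniquely identify the rows in "PEad_ret_test2.dta"; if they do not, that line would need to change.

```stata
clear all
cd "D:\test"

* Hypothetical helper names: `combined' holds the accumulated results,
* `first' flags the first pass (when there is nothing to append yet).
tempfile combined
local first = 1

foreach num of numlist 1/2 {
    use "`num'.dta", clear
    keep if datatype_1 == 1
    drop date                      // the incorrect date
    clonevar date = date_stata     // the correct date

    * Assumption: localid and date uniquely identify rows in the using file.
    * keep(match) keeps only matched rows, the same as keeping _merge == 3.
    merge m:1 localid date using "PEad_ret_test2.dta", keep(match) nogenerate
    keep ticker dayofweek folder_number assetcode localid date_stata date

    * From the second pass on, add the results of all earlier passes.
    if `first' == 0 {
        append using `combined'
    }
    save `combined', replace
    local first = 0
}

* After the last pass, the data in memory hold every file's results.
save "company_N&S.dta", replace
```

This would need to be run as a single do-file, since the tempfile and the local macros only exist for the duration of that run. For all 784 files, the numlist would become 1/784.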