Dear all, I am facing a problem that arises from a large data volume and limited computer capacity (8 GB of RAM).
I want to merge two datasets using the following code:
Code:
merge 1:1 nfhs whhid lineno using resident.dta
The first dataset, merge1.dta, has about 0.2 million observations:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double nfhs str12 whhid byte lineno
3 " 28 1 1" 2
3 " 28 1 1" 4
3 " 28 1 2" 2
3 " 28 1 2" 5
3 " 28 1 4" 3
3 " 28 1 6" 5
3 " 28 1 7" 2
3 " 28 1 8" 3
3 " 28 1 8" 5
3 " 28 1 8" 6
3 " 28 1 10" 4
3 " 28 1 11" 2
3 " 28 1 13" 1
3 " 28 1 14" 2
3 " 28 1 14" 4
3 " 28 1 14" 6
3 " 28 1 14" 7
3 " 28 1 16" 2
3 " 28 1 20" 3
3 " 28 1 22" 2
3 " 28 1 24" 2
3 " 28 1 25" 2
3 " 28 1 26" 2
3 " 28 1 26" 3
3 " 28 1 27" 2
3 " 28 1 28" 2
3 " 28 1 30" 2
3 " 28 1 30" 3
3 " 28 1 31" 2
3 " 28 1 31" 5
3 " 28 1 31" 7
3 " 28 1 32" 4
3 " 28 1 32" 6
3 " 28 2 1" 2
3 " 28 2 2" 2
3 " 28 2 2" 3
3 " 28 2 3" 2
3 " 28 2 3" 4
3 " 28 2 5" 2
3 " 28 2 6" 2
3 " 28 2 7" 2
3 " 28 2 8" 2
3 " 28 2 9" 2
3 " 28 2 14" 2
3 " 28 2 14" 4
3 " 28 2 16" 4
3 " 28 2 18" 2
3 " 28 2 19" 2
3 " 28 2 20" 2
3 " 28 2 21" 4
end
Listed 50 out of 214688 observations
The second dataset, resident.dta, has around 8 million observations:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double nfhs str12 whhid byte lineno
2 " 2 1 3" 19
2 " 2 1 12" 6
2 " 2 1 21" 4
2 " 2 1 30" 36
2 " 2 1 39" 25
2 " 2 1 48" 5
2 " 2 1 57" 37
2 " 2 1 66" 31
2 " 2 1 75" 23
2 " 2 1 84" 36
2 " 2 1 93" 28
2 " 2 1 102" 12
2 " 2 1 111" 6
2 " 2 1 120" 44
2 " 2 1 129" 11
2 " 2 1 138" 18
2 " 2 1 147" 7
2 " 2 1 156" 4
2 " 2 1 165" 27
2 " 2 1 174" 2
2 " 2 1 183" 2
2 " 2 1 192" 28
2 " 2 1 201" 26
2 " 2 1 210" 38
2 " 2 1 219" 32
2 " 2 1 228" 30
2 " 2 1 237" 14
2 " 2 1 255" 11
2 " 2 1 264" 38
2 " 2 2 5" 1
2 " 2 2 10" 15
2 " 2 2 15" 17
2 " 2 2 20" 41
2 " 2 2 25" 22
2 " 2 2 30" 29
2 " 2 2 35" 25
2 " 2 2 40" 35
2 " 2 2 45" 22
2 " 2 2 50" 35
2 " 2 2 55" 32
2 " 2 2 60" 46
2 " 2 2 65" 25
2 " 2 2 70" 28
2 " 2 2 75" 41
2 " 2 2 80" 24
2 " 2 2 85" 2
2 " 2 2 90" 22
2 " 2 2 95" 35
2 " 2 2 100" 44
2 " 2 2 105" 42
end
Listed 50 out of 8070791 observations
The computer freezes when I run the merge command above. Is it possible to first drop some observations from the 8-million-row dataset based on the nfhs variable and then merge? Does anyone have advice on how to do this?
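For illustration, the filtering idea described above could be sketched as follows. This is only a sketch under assumptions: it supposes that merge1.dta contains only nfhs == 3 (as the listed sample suggests, though the full file may differ), and resident_nfhs3.dta is a hypothetical name for the reduced file. The `if` qualifier on `use` keeps only the matching rows while reading from disk, so the full 8-million-row file never has to be held in memory.

```stata
* Check which nfhs values actually occur in the smaller file
* (reads only the nfhs variable, so this is cheap)
use nfhs using merge1.dta, clear
tabulate nfhs

* Read only the matching rows of the large file.
* ASSUMPTION: merge1.dta contains nfhs == 3 only; adjust the condition
* to whatever the tabulation above reports.
use if nfhs == 3 using resident.dta, clear

* resident_nfhs3.dta is a hypothetical name for the reduced file
save resident_nfhs3.dta, replace

* Merge the smaller file with the reduced file
use merge1.dta, clear
merge 1:1 nfhs whhid lineno using resident_nfhs3.dta
```

If only a few variables from resident.dta are needed, listing them before `if` in the `use` command (for example `use nfhs whhid lineno if nfhs == 3 using resident.dta`) would reduce memory use further.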