Hello all,
Any help would be very appreciated!!
My question in a nutshell:
How can I create a variable 'farmer', as shown in Data Set 2, using the information in Data Set 1? The variable should distinguish between non-apple farmers (like HH2), apple farmers (like HH1 and HH4), and non-farmers (like HH3).
(End goal is to compare income among non-apple farmers, apple farmers and other households in Data Set 2.)
Sample Data:
Data Set 1:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str5 A str7 B str6 C
"hh_id" "product" "amt_kg"
"1" "apples" "10"
"1" "bananas" "30"
"1" "pears" "40"
"2" "oranges" "50"
"2" "grapes " "60"
"2" "bananas" "70"
"4" "grapes " "90"
"4" "apples" "100"
"4" "pears" "100"
end
Data Set 2:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str5 A str6(B C)
"hh_id" "income" "farmer"
"1" "1000" "2"
"2" "1500" "1"
"3" "2000" "0"
"4" "3000" "2"
end
More information:
I have two data sets from a household survey. The first one contains only farming data, and lists products households produce with multiple rows for each household. (See Data Set 1 above)
The second data set contains all households, including non-farming households, along with the income of each. How do I generate a variable that equals 0 if the household doesn't farm, 1 if it farms but not apples, and 2 if it farms apples (as in the example above)?
I’m not sure how to get this information from one data set to another to generate the variable 'farmer' without doing it by hand for the rest of the households.
I think the answer might lie in generating a list with the command 'levelsof'. This is as far as I got, but it didn't work: there's no error message, but it doesn't replace any data points.
Code:
gen apple_farmer=1 if product=="apples"
levelsof hh_id if apple_farmer==1, local(level)
foreach level of local hh_id {
    replace apple_farmer=1 if hh_id=r(level)
}
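For what it's worth, the loop above most likely does nothing because `foreach level of local hh_id` iterates over a local macro named hh_id, which was never defined and is therefore empty, so the body never runs (and `hh_id=r(level)` would need `==` in any case). One possible way around the loop altogether is to reduce Data Set 1 to one row per household and merge it into Data Set 2. The following is only a sketch under some assumptions: it assumes hh_id is numeric and that the variables are named hh_id, product, amt_kg and income as in the description; the names `farmers` and `farmer_lbl` are made up for illustration; and the coding follows the one stated in the question (0 = does not farm, 1 = farms but not apples, 2 = farms apples).

```stata
* Data Set 1 (assumed layout): one row per household-product, farmers only
clear
input hh_id str7 product amt_kg
1 "apples"  10
1 "bananas" 30
1 "pears"   40
2 "oranges" 50
2 "grapes"  60
2 "bananas" 70
4 "grapes"  90
4 "apples"  100
4 "pears"   100
end

* Flag apple rows; strtrim() guards against stray blanks such as "grapes "
gen byte apples = (strtrim(product) == "apples")

* One row per household: the max of the flag is 1 if any product is apples
collapse (max) apples, by(hh_id)
gen byte farmer = 1 + apples      // 1 = farms, no apples; 2 = farms apples
keep hh_id farmer

* Hypothetical temporary file holding the household-level result
tempfile farmers
save `farmers'

* Data Set 2 (assumed layout): one row per household, including non-farmers
clear
input hh_id income
1 1000
2 1500
3 2000
4 3000
end

* Households missing from Data Set 1 stay unmatched and get farmer = 0
merge 1:1 hh_id using `farmers', keep(master match) nogenerate
replace farmer = 0 if missing(farmer)

label define farmer_lbl 0 "non-farmer" 1 "farmer, no apples" 2 "apple farmer"
label values farmer farmer_lbl

* Compare income across the three groups
tabstat income, by(farmer) statistics(mean n)
```

Because `tempfile` creates a local macro, the block needs to be run in one go from a do-file rather than line by line. With real data, the two `input` blocks would be replaced by `use` commands for the actual files.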