Hi All,
This might be a simple question, but I wanted to clarify how to handle missing versus 0 values in a numerical variable when the question simply did not apply to the respondent. Suppose there are 100 farmers in my sample: 60 who grow only crop A and 40 who grow only crop B. In my dataset, the yield variable for crop B shows a missing value (".") for farmers who grow crop A, and vice versa. As an example, the table below lists 4 farmers from the sample.
In this case, if I included both yield variables in a regression, a lot of data would be lost, because each farmer has a missing value for the crop they do not grow. Would it be correct to replace "." with 0, given that these are not really missing values and the question simply did not apply to that farmer?
Farmer | Crop grown | Yield A | Yield B |
1 | A | 5 | . |
2 | A | 10 | . |
3 | B | . | 9 |
4 | A | 8 | . |
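To make the problem concrete, here is a minimal Python sketch of the four rows above. It is only an illustration: the names `yield_a`, `yield_b`, `complete_cases` and `zero_filled` are made up for this post, and `None` stands in for ".". It counts how many rows survive if any row with a missing yield is dropped, and shows what the data would look like after the replacement I am asking about.

```python
# Hypothetical in-memory copy of the four example rows; None stands in for ".".
farmers = [
    {"farmer": 1, "crop": "A", "yield_a": 5, "yield_b": None},
    {"farmer": 2, "crop": "A", "yield_a": 10, "yield_b": None},
    {"farmer": 3, "crop": "B", "yield_a": None, "yield_b": 9},
    {"farmer": 4, "crop": "A", "yield_a": 8, "yield_b": None},
]

YIELD_COLUMNS = ("yield_a", "yield_b")


def complete_cases(rows, columns):
    """Keep only rows that have no missing value in any of the given columns."""
    return [r for r in rows if all(r[c] is not None for c in columns)]


def zero_filled(rows, columns):
    """Return copies of the rows with missing values in `columns` set to 0."""
    return [
        {**r, **{c: 0 if r[c] is None else r[c] for c in columns}}
        for r in rows
    ]


usable = complete_cases(farmers, YIELD_COLUMNS)
filled = zero_filled(farmers, YIELD_COLUMNS)

print(len(usable))                                  # rows left with "." kept as missing
print(len(complete_cases(filled, YIELD_COLUMNS)))   # rows left after replacing "." with 0
```

With "." kept as missing, none of the four rows is complete on both yield variables; after the replacement all four are. Whether 0 is a sensible value to put there is exactly what I am unsure about.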
Thanks!