Hi everyone,
I have an outcome variable, 'expenditure', which I believe may be non-linear in nature, so I am running a log-linear regression after taking the log of the variable. However, observations where total expenditure is 0 are turned into missing values (.), which of course makes sense, as the log of 0 is undefined. I am unsure what to do in this situation. Should I just run the regression with those observations treated as missing, even though they are not actually missing? Or should I replace them with 0?
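To make the issue concrete, here is a minimal sketch in plain Python. The expenditure values and the helper `log_or_missing` are made up for illustration and are not taken from the actual dataset. It only shows how zero values end up as missing after the log transform and what share of observations a regression on the logged variable would then drop.

```python
import math

# Hypothetical expenditure values (made up for illustration, not real data).
expenditure = [120.0, 0.0, 35.5, 0.0, 980.0, 12.0]

def log_or_missing(x):
    """Return log(x) for positive x, else None (standing in for a missing value)."""
    return math.log(x) if x > 0 else None

log_expenditure = [log_or_missing(x) for x in expenditure]

# Observations that a regression on log(expenditure) would silently drop.
n_dropped = sum(v is None for v in log_expenditure)
share_dropped = n_dropped / len(expenditure)

print(f"dropped {n_dropped} of {len(expenditure)} observations "
      f"({share_dropped:.1%})")
```

Checking this share on the real data is a quick way to see how much of the sample the logged regression actually uses.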
After running the regression with those values treated as missing, the coefficient I got was highly significant, with a reported p-value of 0! I am not sure whether this should concern me: is it realistic to get a p-value of 0 when I am using real-world data on more than 10,000 households? I wonder whether my regression is unreliable because of the missing values in my data (which, as I mentioned earlier, should not really be missing).
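On the p-value itself, a small sketch in plain Python may help separate "exactly 0" from "displayed as 0". The test statistic below is a hypothetical value, and the normal approximation to the t distribution is an assumption that is reasonable only because the sample is large. The sketch shows that a tiny but positive p-value prints as 0.000 once it is rounded to three decimals.

```python
import math

# Hypothetical test statistic (made up; not from the actual regression).
t_stat = 5.0

# Two-sided p-value under a normal approximation, which is an assumption
# that is reasonable for large samples such as 10,000+ households.
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

# Regression output is typically rounded to a few decimals.
displayed = f"{p_value:.3f}"

print(p_value > 0, displayed)
```

So a displayed p-value of 0 usually means "smaller than the rounding precision" rather than a true zero.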
TIA!