Hello everyone,
I am currently running a convergence regression and want to include a variable for the accessibility of a region. My data contain several accessibility variables, all of which measure driving time by car, in minutes, to the nearest highway access, train station, airport, or one of several types of agglomeration center. However, some values are zero, indicating that there is on average no driving time.
The model is a log-log model: all variables except the dummy variables have been log-transformed. However, when I take the log of an accessibility variable that contains zeros, those observations become missing values.
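To make the problem concrete, this is roughly what happens in Stata. It is only a sketch: ACC_MAJOR is an assumed name for the raw driving time behind lnACC_MAJOR, and the lines are meant to be run from a do-file.

```stata
* Sketch only: ACC_MAJOR is an assumed name for the raw driving time in minutes
count if ACC_MAJOR == 0                // regions with zero driving time
generate lnACC_MAJOR = ln(ACC_MAJOR)   // ln(0) is undefined, so these regions get a missing value (.)
count if missing(lnACC_MAJOR)          // observations that would drop out of the regression
```

If ACC_MAJOR itself has no missing values, the two counts should be the same, which means the regions lost are exactly the ones with zero driving time.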
My question is, what should I do?
A. Do nothing and leave the missing values out, but that would mean that the most accessible regions are excluded from the regression.
B. Replace all the missing values with 0, i.e. "replace lnACC_MAJOR = 0 if (lnACC_MAJOR == .)", but that does not seem like good practice to me.
C. Not convert the accessibility variables to logarithms at all.
D. Other, .....
Additionally, I want to include commuter flows. I am trying to use the net flow, which is negative for some regions (higher outflow than inflow of commuters), and negative values cannot be converted into logs. What should I do here?
Thanks in advance.