Hello everyone,
I am asking for support with the following tasks. I received a dataset that includes the variables years of education, years of experience, years of experience squared, number of hours worked, and several dummy variables (black, hispanic, union, married).
Firstly, I need to choose which variables should be included in my regression of logwage (the log of the hourly wage rate). Do I just reason logically here? E.g. more school --> higher wage rate --> include the school variable? Or should I just plot each variable against logwage?
Secondly, we need to investigate the correlations between the variables. What do these correlations tell me when deciding which variables to include?
Which variables could make good interaction terms, and how do I interpret them?
Finally, we have to create different models and compare them with our basic model. Could you please give me hints about what could be changed in these models?
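To make the second task concrete, the correlation check could be sketched roughly as below. This is only a minimal illustration: the numbers are invented toy values, not the actual dataset, and the column names `educ`, `exper` and `exper_sq` are placeholder assumptions for years of education, years of experience and experience squared.

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented toy values, NOT the real dataset; the names are placeholders.
data = {
    "educ": [12, 16, 10, 14, 12, 18, 11, 13],
    "exper": [10, 4, 20, 8, 15, 2, 25, 12],
}
# Experience squared is built from experience, as in the dataset description.
data["exper_sq"] = [e ** 2 for e in data["exper"]]

# Pairwise correlation matrix stored as a dict keyed by (row, column).
names = list(data)
corr = {(a, b): pearson(data[a], data[b]) for a in names for b in names}

# Print one row per variable.
for a in names:
    print(a.ljust(9), " ".join(f"{corr[(a, b)]:6.2f}" for b in names))
```

Experience and its square are strongly correlated by construction, so a high value for that pair is expected and is not by itself a reason to drop one of them.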
Thank you very much,
TS