Hi everyone!
I am currently struggling with my dataset and the multiple regression I would like to run, as there are certain assumptions that have to be met first (listed below).
The problem I'm facing is that I have only worked with SPSS before and have no clue how to check these assumptions in Stata. At the moment I'm rather devastated, as I'm not even sure this is the right way to test my data statistically.
For example, I looked at the scatterplots between the DV and one of the IVs, but as the IVs are all dummy variables the plots looked weird.
I did some research and read that Stata has all the tools to check the assumptions, but I don't know where to start or how to check each assumption without making a mistake.
I have added a picture of the variables. There are two more dummy variables, but I guess the approach will be the same for them.
I would be glad for any advice, as it is hard to find good sources on regressions with dummy variables.
- Assumption: You should have independence of observations (i.e., independence of residuals), which you can check in Stata using the Durbin-Watson statistic.
- Assumption: There needs to be a linear relationship between (a) the dependent variable and each of your independent variables, and (b) the dependent variable and the independent variables collectively. You can check for linearity in Stata using scatterplots and partial regression plots.
- Assumption: Your data needs to show homoscedasticity, which is where the variances along the line of best fit remain similar as you move along the line. You can check for homoscedasticity in Stata by plotting the studentized residuals against the unstandardized predicted values.
- Assumption: Your data must not show multicollinearity, which occurs when you have two or more independent variables that are highly correlated with each other. You can check this assumption in Stata through an inspection of correlation coefficients and Tolerance/VIF values.
- Assumption: There should be no significant outliers, high leverage points or highly influential points, which represent observations in your data set that are in some way unusual. These can have a very negative effect on the regression equation that is used to predict the value of the dependent variable based on the independent variables. You can check for outliers, leverage points and influential points using Stata.
- Assumption: The residuals (errors) should be approximately normally distributed, which you can check in Stata using a histogram (with a superimposed normal curve) and Normal P-P Plot, or a Normal Q-Q Plot of the studentized residuals.
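From what I have found so far, the checks listed above seem to map roughly onto the Stata commands below. This is only a sketch of my current understanding, not something I have verified on my data: `y`, `d1`, `d2`, `d3` and `t` are placeholder names I made up for the DV, three of the dummy IVs and an ordering variable, and the new variables `rstu`, `lev` and `cook` are likewise just illustrative.

```stata
* Sketch only: y, d1, d2, d3 and t are placeholder names, not my real variables.

* Durbin-Watson needs the data declared as a time series first;
* as far as I understand it is only meaningful if the observations have a natural order (t).
tsset t

* Fit the multiple regression with the dummy variables as predictors
regress y d1 d2 d3

* Independence of residuals
estat dwatson

* Linearity: added-variable (partial regression) plots for each predictor
avplots

* Homoscedasticity: residuals vs. fitted values, plus the Breusch-Pagan test
rvfplot, yline(0)
estat hettest

* Multicollinearity: VIF/tolerance and pairwise correlations of the predictors
estat vif
correlate d1 d2 d3

* Outliers, leverage and influence
predict rstu, rstudent      // studentized residuals
predict lev, leverage       // leverage (hat) values
predict cook, cooksd        // Cook's distance
lvr2plot                    // leverage vs. squared residuals

* Normality of residuals: histogram with normal curve, Q-Q and P-P plots
histogram rstu, normal
qnorm rstu
pnorm rstu
```

If this is roughly right, my remaining question is how to read the linearity plots when every predictor only takes the values 0 and 1.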