Hello gang
This might be kind of a long thread, but here goes. I'm running a probit regression on survey data related to the effects of ads and content articles, and I'm struggling to be sure of how to interpret the results. The dependent variable is essentially the question: "Do you agree/disagree that the ads make you more positive towards [company xyz]?" and it is the variable Q16_engagementr5.
My first probit regression output is this:

Which looks fine. The data is coded such that the baseline is a "negative" answer: "strongly disagree/disagree" or "No" = 1, "strongly agree/agree" or "Yes" = 2, and "neither"/"I don't know" = 3. This output does not tell me much beyond the significance of the coefficients and the direction in which each variable affects Y = 1, read off the signs (+/-) of the coefficients. So I run the margins command:

Let's take Q16_engagementr3 (which is the question: "Do you find the ads to be original?") as an example: it has a p-value of 0.001 in the probit regression, indicating that the coefficient is significant.
Question 1: In interpreting the results on Q16r3, am I correct in saying that if a respondent finds the ads to be original (Q16r3 = "Yes"), the probability that the respondent says the ads make him/her more positive towards company xyz (Q16r5 = "Yes") is ~5.5 percentage points higher than if the respondent did NOT find the ads to be original?
Now, if I run margins on some of the different variables, I get this result:

I now see that Q16_engagementr7 has gone from a p-value of around 0.9 to 0.000.
Question 2: Does this mean that if you answer the "Yes" or "Positive" alternative on Q16r7, the probability of saying that the ads make you more positive towards company xyz (Y = Q16r5) is ~72.6%?
Q3: Why is the p-value for Q16_engagementr7 so high when I run the probit regression and the margins, dydx(*) command, but not when I just run the margins command on the different variables? (significant vs. insignificant?)
Also, I ran some goodness-of-fit tests and would very much appreciate any input/criticism of how I interpret the results:

Q4: Does the classification cutoff >= 0.5 mean that if any fitted probability Pr(Y=1 | X = "Yes" or "Positive") is greater than or equal to 50%, then the observation is classified as a success? And does the "correctly classified" result of 97.75% indicate the model's ability to correctly predict the outcomes under that rule? Does it say anything about the predictive power of the model, or is this just about the classification?

Q5: I've read that the ROC curve measures how well the model discriminates between "success" and "not success". What does that mean in lay terms, exactly? Is it related to the P >= 0.5 cutoff from the table above?

Q6: This test clearly shows that there aren't significant differences between the 10 subgroups in the data, and I can (safely) assume that I have a model that is correctly specified for my data --> Is the score of almost 1 something to be critical of here? Are there any other interpretations I can make of this?
Q7: Any other input or advice you could give me on showing the predictive power of the model, or any other tests I could run, would be greatly appreciated.