Assumptions checking

ALKEBSEE RADWAN

Join Date: Mar 2019
Posts: 240

17 Jan 2022, 09:45

Hello everyone,
I want to check the model assumptions when my dependent variable is categorical (values 1, 2, 3, 4).
Here is an example of my data:

Code:

[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input double(code year) float log_ID_pay double disclosure_quality
  2299 2011   11.0021 4
  2086 2014   11.0021 3
   998 2008  10.47107 2
  2458 2014    10.859 3
  2041 2005  9.769957 2
   735 2017 11.289782 2
  2746 2016 10.308952 4
  2772 2016 11.407565 3
  2069 2014 11.275303 2
   998 2013   11.0021 3
  2200 2016 11.512925 3
   998 2017 10.785876 3
  2321 2016 10.532096 3
   998 2015 10.308952 3
  2299 2017   11.0021 4
   735 2010   11.0021 3
300106 2014 10.308952 3
   735 2014 11.289782 2
  2041 2015   11.0021 3
  2696 2014 10.819778 3
   998 2016 10.397506 3
  2679 2013 10.819778 3
  2679 2012 10.596635 3
300087 2014 10.322198 3
   998 2010 10.778956 3
   798 2015 11.289782 2
300106 2011 10.596635 3
  2321 2015 10.819778 3
  2477 2017 11.695247 2
300087 2017   11.0021 3
   592 2012 11.184422 3
   798 2010 10.414313 3
  2772 2015 10.778956 3
  2069 2008   11.0021 3
300087 2012 10.596635 3
   735 2012 10.778956 2
   798 2011 11.289782 3
   735 2016 10.596635 3
  2696 2012 10.819778 3
   798 2013 11.289782 2
300498 2016 11.226243 3
  2086 2017 10.714417 3
  2200 2011 11.138232 1
  2234 2015 10.819778 2
   592 2004 10.308952 3
  2234 2013 10.819778 3
  2069 2015 11.289782 1
  2477 2014 11.652687 3
  2679 2015 10.534094 1
  2696 2016  9.852194 3
  2234 2017 10.819778 3
  2041 2016   11.0021 3
   769 2004 10.491274 2
  2299 2014  10.79446 4
300189 2015 11.289782 2
  2458 2013  9.903487 3
300189 2016 11.289782 1
  2234 2011 10.308952 3
  2746 2015 10.596635 3
  2069 2012 11.289782 3
  2086 2015   11.0021 3
   798 2014 10.596635 3
   592 2015 11.184422 3
  2696 2017   11.0021 3
  2041 2011   11.0021 3
  2086 2009 10.308952 3
300189 2011 10.341743 3
   592 2013 11.184422 3
  2477 2011   11.0021 3
300106 2012 10.596635 3
   798 2017 11.184422 2
   769 2002 10.491274 2
  2458 2017 10.778956 3
  2458 2011 10.596635 3
   713 2008 10.819778 3
300189 2013 11.289782 3
300106 2015 10.308952 3
  2200 2014 11.512925 3
  2041 2006  10.12663 2
300498 2015 10.463623 3
300094 2015   11.0021 2
   798 2004 10.819778 2
  2234 2016 10.308952 3
  2200 2015 11.225244 3
   998 2004 10.778956 3
   998 2006 10.555813 3
   998 2011 10.203592 3
  2321 2013 10.819778 3
   592 2008 10.532096 2
300189 2017 11.289782 1
   592 2017 11.184422 3
   713 2015 10.819778 3
  2069 2016 10.965436 2
  2299 2016 10.714417 4
  2299 2012 10.308952 3
  2299 2010 10.943765 4
  2069 2009   11.0021 3
  2299 2015   11.0021 3
  2234 2012 10.819778 3
  2041 2010 10.853213 3
end
[/CODE]

So what test of residuals can I use for this dependent variable?

I have used a P-P plot, and the plot looks like a snake.

Please help.

ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#2

17 Jan 2022, 09:49

Please look at the plot of my model.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

17 Jan 2022, 09:57

What's the model you fitted? What command did you issue? I guess you're treating the categorical variable as if it were measured or counted.

If the outcome variable is discrete, some multimodality in the residuals is only to be expected. You can't expect a close fit to a normal distribution.

Your plot is not a P-P plot. It is a normal quantile plot (normal probability plot, normal scores plot, probit plot). The term quantile-quantile plot or Q-Q plot could be used.
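
To see the difference between the two kinds of plot, here is a minimal sketch. It assumes the auto dataset shipped with Stata rather than the poster's data, and the variable name res_demo is made up for illustration. pnorm draws a P-P (standardized normal probability) plot and qnorm draws a normal quantile plot of the same residuals.

```stata
* Illustration only: auto.dta stands in for the poster's data
sysuse auto, clear

* Treat the ordered categorical rep78 as if it were a measured outcome
regress rep78 price weight

* res_demo is a hypothetical name for the residuals
predict double res_demo, residuals

* P-P plot: empirical against theoretical cumulative probabilities
pnorm res_demo

* Normal quantile (Q-Q) plot: observed against expected normal quantiles
qnorm res_demo
```

Because the outcome takes only a few distinct values, both plots should show stepped or wavy bands rather than a straight line.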

Last edited by Nick Cox; 17 Jan 2022, 10:12.
ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#4

17 Jan 2022, 10:05

Firstly, thank you for replying.

Secondly, I used OLS regression. My dependent variable is disclosure quality, which has four levels: 4 = excellent, 3, 2, and 1 = weak.

Thirdly, you said "You can't expect a close fit to a normal distribution". Does that mean I am safe if I include this plot as a check of the model assumptions (a test of the residuals)?

Please explain.
ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#5

17 Jan 2022, 10:08

Sorry, I used the following code:

Code:

predict resid_dq, residuals
qnorm resid_dq
Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

17 Jan 2022, 10:11

Thanks for the details, but you haven't answered my question fully. What predictor variables did you use?

Otherwise there is no "safe" here independently of who is judging this and what criteria they will use. I won't be examining or reviewing your work but I would want a serious discussion of why you are using plain regression rather than say ordinal logit.

OLS is an estimation procedure, not a model flavour.

Here's an analogue of what you may have done which you can reproduce. The last trick of showing the outcome values on your normal quantile plot may help your interpretation.

Code:

sysuse auto
regress rep78 price weight
predict res, res
qnorm res
qnorm res, ms(none) mla(rep78) mlabpos(0)
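
For comparison, the ordinal logit alternative mentioned above could be sketched as follows. This is only an illustration on the same auto data, not a recommendation for the poster's model; the names pr1 to pr5 and pr_total are made up. With the poster's variables the analogous call would presumably start with ologit disclosure_quality log_ID_pay, followed by the control variables, which are not shown in this thread.

```stata
* Illustration only: auto.dta stands in for the poster's data
sysuse auto, clear

* Ordered logit treats rep78 as ordered categories, not as a measured scale
ologit rep78 price weight

* One predicted probability per outcome category
* (rep78 takes the values 1 to 5 in this dataset)
predict double pr1 pr2 pr3 pr4 pr5, pr

* Sanity check: the five probabilities should sum to 1 for every observation
egen double pr_total = rowtotal(pr1 pr2 pr3 pr4 pr5)
summarize pr_total
```

There are no residuals of the regress kind to plot here; the predicted probabilities are the natural quantities to inspect.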
ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#7

17 Jan 2022, 10:37

My dependent variable is disclosure quality, and my independent variable is the logarithm of executive compensation, plus several control variables.
After running the OLS regression, I used the commands mentioned above.

I actually don't understand what a predictor is. Maybe I am not fully familiar with this test.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#8

17 Jan 2022, 10:41

Predictor variables as I use the term are any variables other than the outcome variable. You may want to distinguish independent and control variables for your readers but regress doesn't care.
ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#9

17 Jan 2022, 10:43

Dear @Nick Cox,
I have run the code you gave above and got the following plot.
Does this plot suggest that the model is trustworthy, based on the residual check?
The curve in this example looks similar to the one in my plot.
ALKEBSEE RADWAN

Join Date: Mar 2019

Posts: 240
#10

17 Jan 2022, 10:53

One more thing:
I need to add a title to the following command:

Code:

qnorm resid_dq, ms(none) mla(disclosure_quality) mlabpos(0)

How can I do that, please?
Nick Cox

Join Date: Mar 2014

Posts: 35698
#11

17 Jan 2022, 11:07

Use whatever you used for #2 to get a title (a title() option, presumably), but it's better to use a shorter title.

Sorry but I can't say anything different about "trustworthy" than about "safe".
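
On the title question, a minimal sketch, again assuming the auto data and a made-up residual name res_t. qnorm accepts the usual twoway options, so title() can be added after the marker label options. The title text here is only an example.

```stata
* Illustration only: auto.dta stands in for the poster's data
sysuse auto, clear
regress rep78 price weight

* res_t is a hypothetical name for the residuals
predict double res_t, residuals

* Show the outcome value in place of each marker and add a short title
qnorm res_t, ms(none) mla(rep78) mlabpos(0) title("Residuals: normal quantile plot")
```

With the poster's variables, the same option would be appended to qnorm resid_dq, ms(none) mla(disclosure_quality) mlabpos(0).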
