Which regression? Continuous dependent variable between 0 and 1

Helmut Siegfried

Join Date: Jul 2018

Posts: 9
#1

01 Jul 2018, 13:14

Dear Statalist community,

I have two different dependent variables for which I need to run regressions.

My questions are:
- Which regression is most suitable in each of the two cases?
- What is the best follow-up literature?

I am new to Statalist, so any feedback, including on my posting style, is appreciated.

The original sample provided around 340 observations for each dependent variable. (For a graphical analysis of the distributions, see the pictures attached at the end of the post.)

Unfortunately, including several independent variables reduced the sample size to 29 observations for each dependent variable.

1. The first dependent variable is continuous, with a lower bound of 0 and an upper bound of 1. An excerpt:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float DepVar_1
.009060956
.007225433
.007398274
.00996016
.008196721
.008309846
.03448276
.00947672
.023762377
.0046189376
end

2. The second dependent variable is also continuous, with a lower bound of 0 but, in contrast, no upper bound. An excerpt:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float DepVar_2
13.19481
11.045174
15.830535
15.94977
13.91649
18.7512
12.101977
12.70576
9.620095
10.507575
end

I am really grateful for any kind of support!
Helmut

Attachments:

Last edited by Helmut Siegfried; 01 Jul 2018, 13:31.
Richard Williams

Join Date: Apr 2014

Posts: 4945
#2

01 Jul 2018, 13:51

First off, if you are going from 340 cases down to 29, I would seriously re-assess my data and model. Why are so many cases missing? Is there one variable in particular that has a huge amount of missing data (MD)? Would multiple imputation be an option? For basic and advanced MD techniques, see

https://www3.nd.edu/~rwilliam/stats3/MD01.pdf

https://www3.nd.edu/~rwilliam/stats3/MD02.pdf
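
As a rough first check on where the cases are lost, something like the following can help. This is only a sketch on simulated data: the variable names y, x1 and x2 and the 10% response rate for x2 are made up for illustration and are not taken from the dataset in this thread.

```stata
* Simulated stand-in data: 340 rows, with x2 observed for only about 10% of them
clear
set seed 12345
set obs 340
generate y  = runiform()
generate x1 = rnormal()
generate x2 = rnormal() if runiform() < 0.10

* Count missing values per variable (only variables with missings are listed)
misstable summarize y x1 x2

* Show which variables tend to be missing together
misstable patterns y x1 x2

* Number of complete cases that a regression of y on x1 and x2 would use
egen byte nmiss = rowmiss(y x1 x2)
count if nmiss == 0
```

If one variable turns out to account for most of the loss, it may be worth asking whether the model really needs it, or whether imputation is defensible for it.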

For your DV that is bounded between 0 and 1, some sort of fractional regression model may be appropriate. For some options, see

https://www3.nd.edu/~rwilliam/stats3...onseModels.pdf
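
A minimal sketch of a fractional logit with fracreg (available in Stata 14 and later) might look like this. The data are simulated and the predictors x1 and x2 are hypothetical placeholders, so treat it as an illustration of the syntax rather than a recommended specification.

```stata
* Simulated stand-in data: a response strictly between 0 and 1
clear
set seed 12345
set obs 340
generate x1 = rnormal()
generate x2 = rnormal()
generate y  = invlogit(-4 + 0.3*x1 - 0.2*x2 + rnormal(0, 0.5))

* Fractional logit: models the conditional mean of y with a logit link
* (fracreg reports robust standard errors by default)
fracreg logit y x1 x2

* Average marginal effects on the conditional mean
margins, dydx(*)

* Fitted conditional means stay inside the unit interval
predict yhat
summarize y yhat
```

Replacing logit with probit in the fracreg call gives the fractional probit variant.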

OLS regression may be OK for the 2nd DV. Hard to say without knowing more about it, but a lower bound of 0 does not automatically concern me.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Helmut Siegfried

Join Date: Jul 2018

Posts: 9
#3

02 Jul 2018, 02:16

Dear Richard,

Thank you for the quick reply and your feedback. I will certainly look into your recommendations.
The problem with the sample size originates from merging two datasets, where only a few observations in the second dataset have identifiers that match the first.
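
One way to see how many observations survive such a merge is to tabulate the _merge variable that merge creates. The sketch below uses two simulated datasets; the identifier name id, the 1:1 relationship and the sizes are assumptions for illustration only. Because it relies on tempfile, it should be run as one do-file rather than line by line.

```stata
* First (hypothetical) dataset: 340 observations with the outcome
clear
set seed 12345
set obs 340
generate id = _n
generate y  = runiform()
tempfile first
save `first'

* Second (hypothetical) dataset: only 29 identifiers, all present in the first
clear
set obs 29
generate id = _n * 10
generate x1 = rnormal()
tempfile second
save `second'

* Merge and inspect which observations matched
use `first', clear
merge 1:1 id using `second'
tabulate _merge

* Only matched observations have x1, so only they enter a regression on x1
count if _merge == 3
```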

Again, thank you very much!
Helmut
Nick Cox

Join Date: Mar 2014

Posts: 35432
#4

02 Jul 2018, 03:47

For your second case I would check out Poisson regression. You don't have response values very close to zero but whenever a negative prediction would be absurd there can be value in avoiding that. The functional form can make more sense anyway. The response doesn't have to be a count. That's a common myth.

See e.g. https://blog.stata.com/2011/08/22/us...tell-a-friend/ (and some of the discussion).
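
As an illustration of the syntax only, Poisson regression with robust standard errors on a positive, non-integer response can be sketched as follows. The data are simulated and the names y2 and x1 are hypothetical.

```stata
* Simulated stand-in data: a positive, continuous response
clear
set seed 12345
set obs 340
generate x1 = rnormal()
generate y2 = exp(2.5 + 0.1*x1 + rnormal(0, 0.2))

* Poisson regression with robust standard errors; y2 need not be a count
poisson y2 x1, vce(robust)

* Fitted means are exp(xb), so they can never be negative
predict mu, n
summarize y2 mu

* Linear regression for comparison
regress y2 x1
```

Stata prints a note that the dependent variable has noninteger values; the model is still fitted.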
Helmut Siegfried

Join Date: Jul 2018

Posts: 9
#5

02 Jul 2018, 05:10

Dear Nick,

Thank you for this valuable additional feedback!

Helmut
Helmut Siegfried

Join Date: Jul 2018

Posts: 9
#6

10 Jul 2018, 07:00

Dear Richard and Nick,

Please forgive me for this follow-up question.

I understand that fractional probit/logit use maximum likelihood estimation which has great properties for large samples.

Is fracreg also superior to regress in a small sample setting like mine?

Again, thank you very much for your response. Your help is greatly appreciated.

Helmut
Nick Cox

Join Date: Mar 2014

Posts: 35432
#7

10 Jul 2018, 07:32

I don't see sample size as an issue here. I wouldn't single out any of these methods as particularly good or bad for large or small samples.
Helmut Siegfried

Join Date: Jul 2018

Posts: 9
#8

10 Jul 2018, 10:34

As usual, thanks for your valuable insights!