How to regress a dependent variable that takes any value from -1 to +1?

Bill Bryant

Join Date: Apr 2020

Posts: 14
#1

05 May 2020, 15:50

I have a dataset with n=1,000 survey respondents who were asked to rate their preferences for various goods on a scale of 0-100 points, separately for each good.

I created a new variable that assigns "weighted points" to each good. The variable is weighted_points_A = (points given to good A)/(sum of all points given to all goods). It is therefore a share, and I created an identical one for every good in the survey (i.e. good B, C, D, etc.).

I then created another variable, diff_pref = weighted_points_A / weighted_points_B; the purpose is to look at the proportional difference in preferences between goods A and B. Obviously this variable takes values from -1 to 1.

I now want to use this variable (diff_pref) as the dependent variable in my regressions. What's the best econometric model (and Stata command) to do this? GLM?

Alternatively, I thought about taking the absolute value of this variable and then introducing a dummy that takes the value 1 when the difference is negative, to control for the sign (which I lose when taking the absolute value), but I'm not sure how good this is.

Any tips?

Thank you
Tags: dependent var, regression
Nick Cox

Join Date: Mar 2014

Posts: 35433
#2

05 May 2020, 16:36

Scale by (difference + 1) / 2 and apply logit GLM with binomial family and robust standard errors.
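
A minimal Stata sketch of this suggestion, not a definitive specification. diff_pref is the variable from #1; diff01 is a name made up here for the rescaled outcome, and income stands in for whatever regressors are actually used.

```stata
* Sketch only: diff01 and income are assumed names, not from the original data.
* Rescale diff_pref from [-1, 1] to [0, 1].
generate double diff01 = (diff_pref + 1) / 2

* GLM with binomial family, logit link, and robust standard errors.
glm diff01 income, family(binomial) link(logit) vce(robust)
```

Stata will note that the outcome has noninteger values; that is expected for a fractional response.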
3 likes
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2121
#3

05 May 2020, 20:40

I agree with Nick. Or, you can use the command fracreg after transforming the way Nick suggests.
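
A hedged sketch of the fracreg route, again assuming a rescaled outcome called diff01 and a hypothetical regressor income. The margins step is an addition here, shown as one possible way to get effects on the scale of the conditional mean.

```stata
* Sketch only: diff01 and income are assumed names.
* Skip the next line if diff01 already exists.
generate double diff01 = (diff_pref + 1) / 2

* Fractional logit; fracreg reports robust standard errors by default.
fracreg logit diff01 income

* Average marginal effect of income on the conditional mean of diff01.
margins, dydx(income)
```

Because diff_pref = 2*diff01 - 1, an effect on the mean of diff01 corresponds to twice that effect on the mean of diff_pref.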
2 likes
Joseph Coveney

Join Date: Apr 2014

Posts: 4374
#4

06 May 2020, 01:11

How about a multivariate linear regression of each of the original preference scores in a single, omnibus model (mvreg or manova)? You could then do contrasts (score differences) in terms of the original metric afterwards. I don't know what the best or most popular econometric model would be, but the proposed multistep transformation, coupled with independent pairwise binomial regression models, seems to throw away information, for example when a respondent is unenthusiastic about all of the alternatives (zero or near-zero scores for all of the goods) or is very enthusiastic (100 or near-100 scores) about all of them.
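
One way this could look in Stata, as a rough sketch. The variable names points_A and points_B (the raw 0-100 scores) and income are assumptions for illustration; the thread does not name them.

```stata
* Sketch only: points_A, points_B, and income are assumed variable names.
* Joint model of the original scores for goods A and B.
mvreg points_A points_B = income

* Does income shift the two scores by different amounts?
test [points_A]income = [points_B]income

* Estimated difference in the income slopes, in original score points.
lincom [points_A]income - [points_B]income
```

The contrast is on the 0-100 metric, so a respondent who scores both goods near 0 and one who scores both near 100 remain distinguishable in the model.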
2 likes
Nick Cox

Join Date: Mar 2014

Posts: 35433
#5

06 May 2020, 02:23

I don't disagree in any sense with @Joseph Coveney's thoughtful advice. I was responding more to the thread title than to the detail of #1. Supposing that

diff_pref = weighted_points_A / weighted_points_B

really means

diff_pref = weighted_points_A - weighted_points_B

there is still a puzzle about whether there are yet other such measures for other pairs of weights, and indeed a query about how many different goods are concerned here. But the difference is naturally not reversible, and loses a lot of detail that could be important. The difference between my enthusiasm for two statistics texts and that for watching two football games could be similar, but my enthusiasm is not comparable.
1 like
Bill Bryant

Join Date: Apr 2020

Posts: 14
#6

06 May 2020, 12:10

Thanks everyone for your thoughtful comments.

The goods are comparable, so I'm not comparing apples and oranges. The variable "diff_pref" is indeed diff_pref = weighted_points_A - weighted_points_B ; sorry it was a typo, thanks for pointing that out Nick!

The approach you suggested (transforming the variable and then fitting a GLM logit) works, but I am a bit worried I am losing some valuable information in the process.

Let's take two respondents, John and Mary. John assigned 80 points to both goods A and B (i.e., he's indifferent). Mary assigned 50 points to A and 60 to B. The variables' values will be the following:

weighted_points_A is [80/(80+80)] 0.5 for John and [50/(50+60)] about 0.455 for Mary.
weighted_points_B is 0.5 for John and about 0.545 for Mary.

diff_pref will be 0 for John and about -0.091 for Mary.

When I transform this variable by doing [(difference + 1) / 2], the values become 0.5 for John and about 0.455 for Mary.
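
The arithmetic for this example can be checked directly with display. This is only a sketch, and it assumes that A and B are the only two goods, so that the sum of all points is just A + B.

```stata
* John: 80 points to A and 80 to B. Mary: 50 to A and 60 to B.
* Assumption: only two goods, so each share is points / (A + B).
display "John diff_pref:  " %6.4f (80/(80+80) - 80/(80+80))
display "Mary diff_pref:  " %6.4f (50/(50+60) - 60/(50+60))

* Rescaled to [0, 1] via (difference + 1) / 2.
display "John rescaled:   " %6.4f ((80/(80+80) - 80/(80+80)) + 1) / 2
display "Mary rescaled:   " %6.4f ((50/(50+60) - 60/(50+60)) + 1) / 2
```

With only two goods the shares sum to 1, so the rescaled value equals weighted_points_A itself.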

Let's say I want to know how preferences for goods A and B change as income changes. I regress the transformed dependent variable on income and find that this is significant. How should I interpret the sign and the coefficient of the variable income? Am I losing some important information by using the transformed dependent variable?

(Thank you all so much again for your comments)