Regression model for first-differenced outcome and time-invariant regressors in a panel dataset

Zachary Abel

Join Date: Apr 2023

Posts: 6
#1

29 May 2024, 11:18

Dear StataList,

I have a two-wave panel dataset. Wave 1 asked respondents whether they would do something (e.g., buy a new product), and wave 2 asked whether they had actually done it (e.g., bought the product).
The other variables are largely time-invariant, though some change between waves. They include age, marital status, income quartile within the country, gender, and a number of the usual sociodemographic variables.

I have included an example of the data below. At present it takes intention from wave 1 and all other variables from wave 2.

I am interested in two things:
a) How well does the survey forecast actual purchases, using intention to purchase alongside sociodemographic predictors? For this I plan to run a logistic regression of purchase on intention and my other regressors.
b) More interestingly (I think), what characteristics are common to those who switch opinion versus those who maintain it? I am not interested in comparing the characteristics of those who intend to purchase with those who actually purchase.
Instead, I am interested in modelling the four cells of the transition matrix (1,1; 1,2; 2,1; 2,2), where each pair is (x, y), x is intention (1 = I will purchase, 2 = I won't purchase), and y is behaviour (1 = purchased, 2 = did not purchase).
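A minimal sketch of (a) and (b), using the variable names from the -dataex- example in this post. It assumes that Intent and Actual are coded 0/1 with 1 meaning "yes" (the example data use 0/1 rather than the 1/2 coding described above), and the name transition is made up for illustration. With only a handful of example rows the models may not converge, so treat this as the shape of the analysis rather than something to run on the excerpt.

```stata
* (a) Forecasting: logit of actual purchase on stated intention plus covariates
logit Actual i.Intent c.Age i.Female i.Country i.Incomequartile i.Maritalstatus
estat classification        // share of purchases correctly classified
lroc, nograph               // area under the ROC curve

* (b) Four-state outcome built from intention (wave 1) and behaviour (wave 2)
*     1 = no intent, no purchase    2 = no intent, purchased
*     3 = intent, no purchase       4 = intent, purchased
generate byte transition = 1 + 2*Intent + Actual
label define transition 1 "No/No" 2 "No/Yes" 3 "Yes/No" 4 "Yes/Yes"
label values transition transition
tabulate Intent Actual      // check the cell counts before modelling

* Multinomial logit on the four states, with "No/No" as the base outcome
mlogit transition c.Age i.Female i.Country i.Incomequartile i.Maritalstatus, baseoutcome(1)
```

States 2 and 3 are the switchers, so their coefficients (relative to the base outcome) are the ones that speak to question (b).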

Regressing on the transition matrix is effectively a first-difference model of the outcome; however, the time-invariant regressors would be lost if I went with this option.
My present thinking is to use the wave 2 data and, where a variable does vary over time, capture its change from wave 1 (e.g., change in income quartile) as an additional variable.
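A sketch of how such change variables could be built in the wide layout of the example data. Incomequartile_w1 is a hypothetical variable holding the wave 1 quartile; it is not in the example data, and d_Incomequartile and income_changed are illustrative names.

```stata
* Hypothetical: Incomequartile_w1 = income quartile reported in wave 1
* Signed change in quartile between waves (missing if either wave is missing)
generate int d_Incomequartile = Incomequartile - Incomequartile_w1

* Simple indicator for any change, left missing where the change is unknown
generate byte income_changed = (d_Incomequartile != 0) if !missing(d_Incomequartile)

* Check how much over-time variation there actually is
tabulate Incomequartile_w1 Incomequartile
tabulate income_changed
```

Either the signed change or the indicator can then enter the model alongside the wave 2 levels.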

I have been looking at multinomial logit and nested logit models; the primary difference between them is the IIA (independence of irrelevant alternatives) assumption.
Given that the same choice (to purchase or not to purchase) exists regardless of the intention at time t=1, I do not see the nested logit model as appropriate (the options are not mutually exclusive).

However, I am also somewhat skeptical of the IIA assumption in the multinomial logit model (mlogtest gives different results under each test, as Freese and Scott Long have said), although in theory this model should give me what I am looking for: the importance of each characteristic for each state in the transition matrix. Ultimately, I want to identify whether certain characteristics make people more likely to switch opinion, either in general or in a specific direction.
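For reference, a sketch of one way to check IIA by hand with official Stata commands: a Hausman test that compares the full model with one estimated after dropping an outcome. The variable transition is an illustrative name for the four-state outcome, built here on the assumption that Intent and Actual are coded 0/1; the covariate list is shortened to keep the example small.

```stata
* Four-state outcome: 1 = No/No, 2 = No/Yes, 3 = Yes/No, 4 = Yes/Yes
capture drop transition     // in case it already exists
generate byte transition = 1 + 2*Intent + Actual

* Full model
mlogit transition c.Age i.Female, baseoutcome(1)
estimates store full

* Same model with one outcome (here state 4) removed
mlogit transition c.Age i.Female if transition != 4, baseoutcome(1)
estimates store restricted

* Under IIA the coefficients common to both models should not differ systematically
hausman restricted full, alleqs constant

* With SPost installed, the tests mentioned above can be requested after the full model:
* mlogtest, hausman smhsiao
```

The result can depend on which outcome is dropped, which is consistent with the tests disagreeing.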

If anyone has any thoughts, advice, or could point me in a helpful direction - I'd be very appreciative.

Thanks!
Zac

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte(Intent Actual Age Female Country Incomequartile Maritalstatus)
0 0 38 1 2 1 1
0 0 34 1 1 4 1
0 1 62 0 2 1 1
0 0 68 1 1 2 1
1 0 66 0 2 4 0
0 0 52 0 3 3 0
1 0 49 0 3 2 1
0 1 65 1 2 2 0
1 0 68 1 2 4 0
0 1 67 0 1 3 1
0 0 19 0 3 3 1
0 1 45 0 2 2 0
1 0 58 0 2 1 0
1 1 21 1 3 2 0
0 0 51 1 2 3 0
0 1 69 0 1 3 0
0 1 44 0 1 4 1
0 0 48 1 2 1 0
1 0 53 1 3 4 0
end