Quantile regression with categorical explanatory variables

Hugo Ferrer

Join Date: May 2022

Posts: 4
#1


12 Nov 2024, 07:46

Hi everyone!

I am posting some questions here because I have not been able to find answers to them, despite everything already posted on quantile regression and related topics.

I begin with what I want to study. Given the increasing demand for organic foods, mainly due to health and environmental concerns, I want to study whether the sociodemographic characteristics of households have a different impact on households with higher shares of organic/bio food products in their annual shopping basket than on those with lower shares, and in which product groups these differences are observed. This information would be useful for designing targeted policies and strategies aimed at promoting organic food consumption among certain sociodemographic groups and/or for certain product types.

Now I describe my data. I have a sample of roughly 1800 households observed annually over 4 years (2016-2019), with data on the share of ecological/bio food products in their annual shopping basket for several food groups (e.g., fruits, meats, grains, dairy) and regions (e.g., Region 1, Region 2, Region 3,...). The panel is unbalanced.

My endogenous variable (ECOsh) is the share of ecological/bio products in the annual shopping basket of households. It is continuous but bounded between 0 and 1.

The explanatory variables are the sociodemographic characteristics of households, which are recorded as categorical variables and one dummy variable. For example: age of shopper (< 30; 30 <= x < 60; >= 60); household members (1-2, 3, 4+); activity status (employed/unemployed); number of children at home (0, 1, 2, 3+); annual income level (< 20000; 20000 <= x < 30000; 30000 <= x < 40000; 40000 <= x < 50000; >= 50000). The preliminary statistical analysis points to clear heterogeneity in the data across each sociodemographic variable.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(EXPsh id_foodgroup id_region id_activity id_hhsize id_incomelevel id_age id_numchild) int Año
.3711231 1 4 1 1 1 2 1 2019
.8186947 2 5 1 3 2 1 1 2017
.50880945 2 5 1 2 3 3 0 2019
.3795943 4 1 1 2 2 3 1 2016
.1115448 5 5 1 2 2 1 1 2018
.0852778 5 7 2 2 3 1 0 2019
.3037719 5 7 1 1 2 2 1 2019
.02551874 3 4 1 3 1 2 3 2018
.123654897 3 6 2 1 2 2 1 2017
.066647395 4 3 2 1 1 1 1 2016
.004051853 5 3 1 3 3 1 2 2017
end

My questions are the following:

1. Is a panel quantile approach adequate for the purpose of the study? I looked up the Hao and Naiman (2007) manual and saw an example with income, a categorical variable (education groups by years of schooling) and a dummy variable (race group) in a cross-section context, but found nothing in a panel context.

2. Given that the analysis is conducted by food group, could the xtqreg approach be applied here? The number of years is small, only 4. Or would it be more appropriate to conduct a qreg analysis year by year?

3. If the answer to question 2 is positive, would it then be necessary to apply a jackknife correction via bootstrap?

4. Given that the endogenous variable is a rate, would it be advisable to use it in logs rather than treat it as a rate?

5. If the answer to question 2 is negative, and hence quantile regression is not the best approach, what alternative model might work for this type of data to answer the main question? Note that the variability in the explanatory variables may be minimal over the sample period.

Thank you in advance!

Best,
Hugo
Tags: categorical, panel data, Quantile Regression, regression, xtqreg
Joao Santos Silva

Join Date: Apr 2014

Posts: 2982
#2

12 Nov 2024, 22:58

Dear Hugo Ferrer,

With T=4, I would not recommend a fixed effects panel approach. Maybe you just want to use pooled QR with clustered standard errors (using qreg2)?

Best wishes,

Joao
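
A minimal sketch of what the suggestion above could look like in Stata, not a definitive specification. It assumes that qreg2 (a community-contributed command) is installed from SSC, that the dependent variable is named EXPsh as in the -dataex- excerpt, and that a household identifier exists in the full data. The name hh_id is hypothetical, since no household identifier appears in the posted excerpt. The choice of quantiles is illustrative, and xi: is used to expand the categorical variables into dummies in case factor-variable notation is not accepted.

```stata
* Sketch only: pooled quantile regression with standard errors
* clustered by household, estimated separately for each food group.

* qreg2 is community-contributed; install it once from SSC.
ssc install qreg2

* hh_id is a HYPOTHETICAL household identifier (not in the posted excerpt).
levelsof id_foodgroup, local(groups)

foreach g of local groups {
    * Illustrative quantiles; choose those relevant to the study.
    foreach q in 0.25 0.50 0.75 {
        display as text "Food group `g', quantile `q'"
        xi: qreg2 EXPsh i.id_age i.id_hhsize i.id_activity ///
            i.id_numchild i.id_incomelevel                  ///
            if id_foodgroup == `g',                         ///
            quantile(`q') cluster(hh_id)
    }
}
```

Clustering on the household allows for correlation between the repeated observations of the same household across the four years, which is the reason for pooling with clustered standard errors rather than estimating household fixed effects with such a short panel.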
Hugo Ferrer

Join Date: May 2022

Posts: 4
#3

13 Nov 2024, 02:04

Thank you, Joao Santos Silva. I'll give it a try!

Best,
Hugo