Modelling proportions over time?

Brendan Churchill

Join Date: Oct 2016

Posts: 41
#1

06 Jun 2018, 20:53

Dear Statalist users,

I am looking at young people's market and non-market activities over time: employed FT, employed PT, unemployed, NILF, study, etc. I have longitudinal data spanning sixteen years. I want to see what proportion of all young people's market and non-market activities is accounted for by, say, full-time employment in 2001, 2009 and 2016. I also want to see how different socio-demographic variables (e.g. partnership, presence of children) change these proportions. How do I best go about doing this? Fixed effects doesn't seem appropriate. Fractional logistic regression, perhaps using a series of dummies, is another option, but that doesn't seem to work so well with longitudinal data. Any advice would be greatly appreciated.

Data example

[CODE]
clear
input float activity_nostudy int year float(partnerships children)
1 2016 0 1
7 2014 1 1
5 2013 0 1
5 2011 0 1
5 2012 0 1
10 2007 1 1
10 2010 0 1
10 2008 0 1
3 2015 1 1
10 2009 0 1
3 2005 0 0
3 2014 1 1
10 2012 1 1
3 2009 0 0
1 2008 0 0
1 2011 1 0
3 2010 1 0
3 2013 1 1
3 2016 1 1
3 2015 1 1
10 2012 0 1
7 2003 0 1
10 2001 0 1
10 2004 0 1
10 2002 0 1
1 2003 1 0
1 2012 0 0
1 2005 1 0
1 2008 1 0
1 2009 1 0
1 2010 1 0
1 2013 0 0
1 2004 1 0
1 2016 0 0
1 2011 0 0
3 2002 1 0
1 2007 1 0
1 2006 1 0
1 2015 0 0
1 2014 0 0
1 2013 0 0
1 2014 0 0
1 2009 0 0
1 2016 0 0
1 2008 0 0
1 2010 0 0
1 2011 0 0
1 2015 0 0
1 2012 0 0
7 2015 0 0
5 2016 0 0
7 2014 0 0
5 2013 0 0
5 2012 0 0
3 2001 1 1
3 2008 0 0
3 2011 0 0
3 2012 1 1
7 2009 0 0
10 2015 1 1
10 2014 1 1
10 2016 1 1
3 2013 1 1
3 2010 0 0
3 2011 1 0
3 2015 1 1
10 2014 1 1
3 2016 1 1
10 2012 1 1
10 2013 1 1
3 2016 0 0
3 2015 0 0
3 2014 0 0
3 2013 0 0
1 2001 0 0
3 2002 0 0
1 2003 0 0
1 2010 0 0
1 2012 0 0
1 2014 0 0
1 2007 0 0
1 2008 0 0
1 2011 0 0
5 2009 0 0
1 2013 0 0
1 2016 0 0
1 2015 0 0
1 2013 1 0
3 2009 0 0
1 2012 1 0
3 2010 0 0
1 2011 0 0
1 2016 1 0
1 2014 1 0
1 2015 1 0
1 2016 1 0
1 2011 0 0
1 2013 1 0
1 2015 1 0
1 2014 1 0
end
label values activity_nostudy activity
label def activity 1 "[1] Employed, full-time", modify
label def activity 3 "[3] Employed, part-time", modify
label def activity 5 "[5] Unemployed", modify
label def activity 7 "[7] Not in the labour force", modify
label def activity 10 "[10] Home-making / caring", modify
[/CODE]

Thanks as always
Brendan
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#2

07 Jun 2018, 01:29

What you have is a categorical dependent variable, not a fraction. A person in 2016 is recorded as either employed full-time, or employed part-time, etc. The person is not recorded as 60% employed full-time, 30% employed part-time, etc. In principle the latter is possible, e.g. the percentage of days in a year that the person is employed full-time. In practice, I suspect that for most persons there is not enough variability to make that worthwhile, which is probably the reason that I haven't seen it done. Regardless, that information is not in your data, so fractional models are not relevant to you.

This is not my field, but you will probably need some variation on a multinomial logit that takes the time aspect into account.
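As a rough illustration of that suggestion (a sketch, not a recommendation of a final specification), a pooled multinomial logit with year dummies and standard errors clustered on the person could look like the following. The person identifier `id` is hypothetical, since the data example does not include one, and `margins` reporting all outcomes by default assumes Stata 14 or later.

```stata
* Sketch only: pooled multinomial logit with year dummies.
* -id- is a HYPOTHETICAL person identifier; it is not in the data example.
mlogit activity_nostudy i.year i.partnerships i.children, ///
    baseoutcome(1) vce(cluster id)

* Average predicted probability of each activity in the years of interest
margins, at(year = (2001 2009 2016))

* The same predicted probabilities, separately by partnership status
margins partnerships, at(year = (2001 2009 2016))
```

The predicted probabilities from `margins` sum to one across activities within each year, so they can be read as model-based proportions. Adding `i.year##i.partnerships` would let the association with partnership differ across years.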

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Brendan Churchill

Join Date: Oct 2016

Posts: 41
#3

07 Jun 2018, 04:16

Thanks Maarten for your response.

Yes, they're categorical there, but I have also been trying to model them as a series of dummy variables. If you were to run an OLS regression on each of the dummy variables as separate models, then the intercept of each model would represent the proportion in that activity, and the intercepts would sum to 100%. Obviously you can do this with a pooled regression, but I'm worried about violating the assumptions of OLS when using longitudinal data. Any ideas?
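To make the dummy-variable idea concrete, here is a minimal sketch. It assumes a person identifier called `id`, which is hypothetical and not in the data example, and it uses cluster-robust standard errors as one possible way of allowing for repeated observations on the same person. The intercepts are on a 0 to 1 scale, so they sum to 1 rather than 100.

```stata
* Sketch only. -id- is a HYPOTHETICAL person identifier.
* One indicator per activity category: act1, act2, ... in sorted order of the codes
tabulate activity_nostudy, generate(act)

* Intercept-only linear probability model: _cons is the sample share of the first category
regress act1, vce(cluster id)

* Shares of all categories at once, by year, with cluster-robust standard errors
proportion activity_nostudy if inlist(year, 2001, 2009, 2016), ///
    over(year) vce(cluster id)
```

Once covariates are added to the separate regressions, each intercept is the share in the reference group only, and nothing keeps the fitted values between 0 and 1.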
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#4

07 Jun 2018, 07:27

Brendan: I agree with Maarten. I suggest you look at some form of multinomial logit model, e.g. one with random intercepts. Cf. example 41g in the [SEM] manual.
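A minimal sketch of a random-intercept multinomial logit in `gsem`, written for the variables in the data example, might look like this. The person identifier `id` and the latent variable names `RI3`, `RI5`, `RI7` and `RI10` are assumptions made for illustration, and the exact syntax should be checked against the manual example before use.

```stata
* Sketch only. -id- is a HYPOTHETICAL person identifier.
* Category 1 (employed full-time) is left out as the base outcome; each other
* category gets its own person-level random intercept.
gsem (3.activity_nostudy  <- i.year i.partnerships i.children RI3[id])   ///
     (5.activity_nostudy  <- i.year i.partnerships i.children RI5[id])   ///
     (7.activity_nostudy  <- i.year i.partnerships i.children RI7[id])   ///
     (10.activity_nostudy <- i.year i.partnerships i.children RI10[id]), ///
     mlogit
```

With four correlated random intercepts, estimation can be slow on sixteen waves of data. Using a single shared random intercept across equations is a simpler, more restrictive starting point.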

In the economics literature, there are also panel data multinomial logit models, such as the following (which also has further references):
Cai, L., Mavromaras, K. and Sloane, P. (2018). Low Paid Employment in Britain: Estimating State-Dependence and Stepping Stone Effects. Oxford Bulletin of Economics and Statistics, 80(2), 0305–9049. doi: 10.1111/obes.12197