  • Modelling proportions over time?

    Dear Statalist users,

    I am looking at young people's market and non-market activities over time: employed full-time (FT), employed part-time (PT), unemployed, not in the labour force (NILF), study, etc. I have longitudinal data spanning sixteen years. I want to see what proportion of all young people's activities each category (say, employed FT) accounts for in 2001, 2009 and 2016. I also want to see how different socio-demographic variables (e.g. partnership, presence of children) change these proportions. How do I best go about doing this? Fixed effects doesn't seem appropriate. Fractional logistic regression on a series of dummies is a possibility, but that doesn't seem to work so well with longitudinal data. Any advice would be greatly appreciated.

    Data example

    input float activity_nostudy int year float(partnerships children)
    1 2016 0 1
    7 2014 1 1
    5 2013 0 1
    5 2011 0 1
    5 2012 0 1
    10 2007 1 1
    10 2010 0 1
    10 2008 0 1
    3 2015 1 1
    10 2009 0 1
    3 2005 0 0
    3 2014 1 1
    10 2012 1 1
    3 2009 0 0
    1 2008 0 0
    1 2011 1 0
    3 2010 1 0
    3 2013 1 1
    3 2016 1 1
    3 2015 1 1
    10 2012 0 1
    7 2003 0 1
    10 2001 0 1
    10 2004 0 1
    10 2002 0 1
    1 2003 1 0
    1 2012 0 0
    1 2005 1 0
    1 2008 1 0
    1 2009 1 0
    1 2010 1 0
    1 2013 0 0
    1 2004 1 0
    1 2016 0 0
    1 2011 0 0
    3 2002 1 0
    1 2007 1 0
    1 2006 1 0
    1 2015 0 0
    1 2014 0 0
    1 2013 0 0
    1 2014 0 0
    1 2009 0 0
    1 2016 0 0
    1 2008 0 0
    1 2010 0 0
    1 2011 0 0
    1 2015 0 0
    1 2012 0 0
    7 2015 0 0
    5 2016 0 0
    7 2014 0 0
    5 2013 0 0
    5 2012 0 0
    3 2001 1 1
    3 2008 0 0
    3 2011 0 0
    3 2012 1 1
    7 2009 0 0
    10 2015 1 1
    10 2014 1 1
    10 2016 1 1
    3 2013 1 1
    3 2010 0 0
    3 2011 1 0
    3 2015 1 1
    10 2014 1 1
    3 2016 1 1
    10 2012 1 1
    10 2013 1 1
    3 2016 0 0
    3 2015 0 0
    3 2014 0 0
    3 2013 0 0
    1 2001 0 0
    3 2002 0 0
    1 2003 0 0
    1 2010 0 0
    1 2012 0 0
    1 2014 0 0
    1 2007 0 0
    1 2008 0 0
    1 2011 0 0
    5 2009 0 0
    1 2013 0 0
    1 2016 0 0
    1 2015 0 0
    1 2013 1 0
    3 2009 0 0
    1 2012 1 0
    3 2010 0 0
    1 2011 0 0
    1 2016 1 0
    1 2014 1 0
    1 2015 1 0
    1 2016 1 0
    1 2011 0 0
    1 2013 1 0
    1 2015 1 0
    1 2014 1 0
    end
    label values activity_nostudy activity
    label def activity 1 "[1] Employed, full-time", modify
    label def activity 3 "[3] Employed, part-time", modify
    label def activity 5 "[5] Unemployed", modify
    label def activity 7 "[7] Not in the labour force", modify
    label def activity 10 "[10] Home-making / caring", modify

    Thanks as always
    Brendan

  • #2
    What you have is a categorical dependent variable, not a fraction. A person in 2016 is recorded as either employed full-time, employed part-time, etc. The person is not recorded as 60% employed full-time, 30% employed part-time, etc. In principle the latter is possible, e.g. as the percentage of days in a year that the person is employed full-time. In practice, I suspect that for most persons there is not enough variability to make that worthwhile, which is probably the reason that I haven't seen it done. Regardless, that information is not in your data, so this discussion is not relevant to you.

    This is not my field, but you will probably need some variation on a multinomial logit that takes the time aspect into account.
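
    A minimal sketch of one such variation: a pooled multinomial logit with standard errors clustered on the person. It assumes a person identifier, here called id, which is hypothetical because the posted data example has no such variable. Clustering is only a simple way to acknowledge the repeated observations; it is not a panel estimator.

    ```stata
    * ASSUMPTION: id is a hypothetical person identifier (not in the posted data)
    mlogit activity_nostudy i.year i.partnerships i.children, baseoutcome(1) vce(cluster id)

    * Average predicted probability of full-time employment (outcome 1) by year
    margins year, predict(outcome(1))

    * The same probability by partnership status and by presence of children
    margins partnerships children, predict(outcome(1))
    ```

    Repeating the margins calls with outcome(3), outcome(5) and so on gives the other modelled proportions; within any one year they sum to one.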
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    • #3
      Thanks Maarten for your response.

      Yes, they're categorical there, but I have also been trying to model them as a series of dummy variables. If you were to run an OLS regression on each of the dummy variables as a separate model with no covariates, then the intercept of each model would be the proportion in that activity, and the intercepts would sum to 100%. Obviously you can do this with a pooled regression, but I'm worried about violating the assumptions of OLS with longitudinal data. Any ideas?
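
      For concreteness, the dummy-variable idea might be sketched as below. The activity codes are taken from the value labels in the data example and are assumed to be the only categories present; the act* variable names are made up for illustration.

      ```stata
      * One 0/1 indicator per activity code, then an intercept-only regression for one year
      foreach k in 1 3 5 7 10 {
          generate byte act`k' = (activity_nostudy == `k') if !missing(activity_nostudy)
          regress act`k' if year == 2016
      }
      * Each _cons is the 2016 sample proportion in that activity; the five sum to 1
      ```

      In a pooled fit over all years, adding vce(cluster id), with id a hypothetical person identifier, would be one common way to allow for within-person correlation in the standard errors.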

      • #4
        Brendan: I agree with Maarten. I suggest you look at some form of multinomial logit model, e.g. one with random intercepts. Cf. example 41g in the [SEM] manual.
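
        A rough sketch of what a random-intercept multinomial logit could look like with gsem. The person identifier id is an assumption (the posted data example does not include one), and this is a simplified illustration rather than a reproduction of the manual's example.

        ```stata
        * ASSUMPTION: id is a hypothetical person identifier (not in the posted data)
        * M1[id] is a person-level random intercept shared across the outcome equations
        gsem (i.activity_nostudy <- i.year i.partnerships i.children M1[id]), mlogit
        ```

        With five outcomes and sixteen years of data, a model like this may be slow to converge.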

        In the economics literature, there are also panel data multinomial logit models, such as the following (which also has further references):
        Lixin Cai, Kostas Mavromaras and Peter Sloane (2018). Low Paid Employment in Britain: Estimating State-Dependence and Stepping Stone Effects. Oxford Bulletin of Economics and Statistics, 80(2), ISSN 0305–9049. doi: 10.1111/obes.12197
