
  • Comparing means of subgroups

    Hi all,

    I am trying to build logits to measure gender employment gaps among lone parents in Canada in Feb 2020, in the Labour Force Survey data. I've created a dummy variable for employment (1=employed, or absent; 0=unemployed or not in labour force), and want to measure the difference (gap) in mean employment rates between subgroups.

    What syntax can I use to capture the difference in proportions of employment between male and female lone parents? How can I compare this gender gap among lone parents, e.g., within two sets of (ordinal) sub-groups: 1) older or younger child (<6 or 6-12) and 2) education?

    Please find below an excerpt from dataex:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float lfs byte sex float(loneyg edu)
    0 1 . 1
    1 1 . 1
    1 1 . 0
    1 1 . 2
    1 1 . 1
    0 1 . 2
    1 1 . 0
    1 1 . 1
    1 1 . 1
    1 1 . 0
    1 1 . 1
    1 1 . 2
    1 1 . 1
    1 1 . 2
    1 1 . 2
    0 1 . 0
    1 1 . 0
    1 1 . 0
    1 1 . 1
    1 1 . 0
    1 1 . 1
    1 1 . 2
    1 1 . 2
    0 1 . 1
    1 1 . 1
    0 1 . 0
    1 1 . 1
    1 1 . 0
    1 1 . 1
    1 1 . 2
    1 1 . 1
    0 1 . 1
    1 1 . 1
    1 1 . 1
    1 1 . 0
    1 1 . 0
    1 1 . 1
    1 1 . 1
    1 1 . 1
    0 1 . 0
    1 1 . 0
    1 1 . 2
    1 1 . 2
    1 1 . 1
    1 1 . 0
    1 1 . 0
    0 1 . 0
    1 1 . 1
    1 1 . 1
    0 1 . 0
    1 1 . 1
    1 1 . 0
    1 1 . 2
    1 1 . 0
    1 1 . 1
    1 1 . 0
    1 1 . 1
    1 1 . 1
    1 1 . 0
    1 1 . 1
    1 1 1 1
    1 1 . 2
    1 1 . 0
    0 1 . 0
    1 1 . 1
    0 1 . 0
    1 1 . 2
    1 1 . 1
    1 1 . 1
    1 1 . 1
    1 1 . 1
    1 1 . 2
    1 1 . 2
    1 1 . 0
    1 1 . 2
    1 1 . 2
    1 1 . 1
    1 1 . 1
    1 1 . 1
    0 1 . 0
    1 1 . 2
    1 1 . 2
    1 1 . 0
    1 1 . 1
    0 1 . 1
    1 1 . 1
    1 1 . 1
    1 1 . 0
    1 1 . 2
    1 1 . 1
    1 1 . 1
    1 1 . 1
    1 1 . 2
    1 1 . 0
    0 1 . 2
    1 1 . 0
    1 1 . 2
    1 1 . 2
    1 1 . 1
    1 1 . 0
    end
    label values lfs lfs
    label def lfs 0 "not", modify
    label def lfs 1 "Employed", modify
    label values sex SEX
    label def SEX 1 "Male", modify
    label values loneyg loneyg
    label def loneyg 1 "Lone parents, yg child", modify
    label values edu edu
    label def edu 0 "(<)HS", modify
    label def edu 1 "some uni/college deg/trades", modify
    label def edu 2 "BA degree+", modify
    Last edited by Alex McIntosh; 01 Apr 2022, 17:09. Reason: Edited to exclude mention of analysis "over time"

  • #2
    Well, your example data is not suitable for this kind of analysis. You have no variable indicating children's age, and all of the participants are male. There is also no time variable, so there is no way to speak of what is happening "over time." On the assumption that these problems do not plague your full data set, and calling the variable indicating children's age children_age, you would do something like this:

    Code:
    logistic lfs i.sex##i.edu##i.children_age if loneyg == 1
    margins edu#children_age, dydx(sex)
    Here I do not deal with the "over time" aspect of your question because I don't want to try to guess what you have in mind in the absence of better information.

    As an aside, you have this variable loneyg which is coded 1/missing. While this is often useful and commonly done in spreadsheets, this is a setup for errors in Stata. Dichotomous variables should almost always be coded 1 = true, 0 = false. Before you go astray, I suggest you recode loneyg accordingly.
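
    If loneyg really is coded 1/missing, and missing means "false" rather than "not applicable", a minimal sketch of the recode might look like this. The toy data and the new variable name loneyg01 are made up purely for illustration:

```stata
* Toy data standing in for the real variable (illustrative only)
clear
input float loneyg
1
.
.
1
end

* (loneyg == 1) evaluates to 1 when true and 0 otherwise,
* including when loneyg is missing
generate byte loneyg01 = (loneyg == 1)

* Cross-check the old coding against the new one
tabulate loneyg loneyg01, missing
```

    If missing instead marks people who are outside the group of interest, it is probably safer to leave those observations missing and restrict the sample with an if condition.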

    • #3
      I'm sorry about my lack of proficiency with dataex. I'm clearly a novice in many aspects of Stata.

      I was originally trying to use these logits with an appended dataset (Feb to May 2020, with a variable "survmnth" for month), but I realized I first have to build meaningful logits for any given month, and so am trying to start a bit simpler. (I'm also perusing Chapter 18 on programming, in the documentation).

      The variable "loneyg" is meant to capture if the child of a lone parent is 0=<6 years old, or 1=6-12. So it is 0/1, but I'm not sure if this is a misuse of a dichotomous variable. In any case, this group is only n=1,887 for Feb, in a sample of n=45,708, so most cases are "missing" in my bungled dataex. In the analytical sample I'm trying to build (from the full, appended dataset) they are 6,814 / 168,792.

      When I use loneyg in the suggested code (with the appended dataset, from February to May, i.e. "2" to "5" in the variable survmnth):

      Code:
      logistic lfs i.sex##i.edu##i.loneyg
      margins edu#loneyg, dydx(sex)
      marginsplot
      I get: [attached image: marginsplot output]

      I am wondering what syntax I need to stratify these marginal effects of sex by subgroups (like education, younger or older child), but with the x-axis corresponding to survmnth?
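
      One possible approach, offered only as a sketch, is to add survmnth to the interaction and tell marginsplot which variable goes on the x-axis. The data below are simulated so the commands run on their own; they are not LFS data, and the fully interacted model is an assumption that may be too sparse for small cells in the real sample:

```stata
* Simulated stand-in data (NOT the LFS data), just so the commands run
clear
set seed 12345
set obs 4000
* Months Feb-May coded 2 to 5; loneyg: 0 = child <6, 1 = child 6-12
generate byte survmnth = runiformint(2, 5)
generate byte sex      = runiformint(1, 2)
generate byte edu      = runiformint(0, 2)
generate byte loneyg   = runiformint(0, 1)
* Employment dummy, 1 = employed
generate byte lfs      = runiform() < 0.7

* Let the sex gap vary by education, child age group, and month
logistic lfs i.sex##i.edu##i.loneyg##i.survmnth

* Marginal effect of sex in each month x education x child-age cell
margins survmnth#edu#loneyg, dydx(sex)

* Month on the x-axis, one panel per education level,
* one line per child age group
marginsplot, xdimension(survmnth) bydimension(edu) plotdimension(loneyg)
```

      With the real data, the generate lines would be dropped and the model run on the appended dataset, restricted to lone parents.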
