
  • Plotting predicted values by group

    Hello!

    I am hoping to get some insight into an analysis in which I plot predicted values by group, with a confidence interval (CI) shown for each group.

    I'm running the fixed-effects model below separately for each age group. Because the age-group variables are time-invariant, I cannot include them in the model as independent variables, hence no -margins- or -marginsplot-.

    Code:
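    foreach var of varlist onsetage_cat wrklimage_cat wrkprvage_cat tage_cat {
        forvalues i = 1/3 {
            * Fit the fixed-effects model separately within category `i' of `var'
            quietly reghdfe anyasset female i.race_eth educ married curr_emp dis hpov hhchildren rhnumper ///
                if `var' == `i' [pw=finpnl4], absorb(pid year) vce(cluster tehc_st)
            * Keep linear predictions only for the observations used in this fit
            predict yhat`i'all_`var' if e(sample), xb
        }
    }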
    This gives me a predicted value for each observation. I want to make a graph showing predicted values for each of the age-group variables (onsetage_cat, wrklimage_cat, wrkprvage_cat, tage_cat), each of which has three categories. So the y-axis is the predicted value and the x-axis is the three age groups. I can get upper/lower bounds for each observation, but I want to plot the mean predicted value and its CI for each age group, not for each observation. My sample data are shown below.

    I really appreciate any comments in advance.

    Thank you very much!
    Hyun Ju


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float anyasset byte female float race_eth byte(educ married curr_emp dis) float(hpov hhchildren) byte rhnumper float(onsetage_cat wrklimage_cat wrkprvage_cat) byte tage_cat double finpnl4 float(pid year) str2 tehc_st
    1 1 1 1 0 0 0 0 0 1 . . . 3  6665.257622628558 1 2014 "20"
    1 1 1 1 0 0 0 0 0 1 . . . 3  6665.257622628558 1 2015 "20"
    1 1 1 1 0 0 0 0 0 1 . . . 3  6665.257622628558 1 2016 "20"
    1 1 1 1 0 0 0 0 0 1 . . . 3  6665.257622628558 1 2017 "20"
    1 1 . 1 1 1 0 0 1 5 . . . 2 20963.994881006787 2 2014 "20"
    1 1 . 1 1 1 0 0 1 5 . . . 2 20963.994881006787 2 2015 "20"
    1 1 . 1 1 0 1 0 1 5 . . . 2 20963.994881006787 2 2016 "20"
    1 1 . 1 1 0 0 0 1 4 . . . 3 20963.994881006787 2 2017 "20"
    1 0 . 0 1 1 0 0 0 5 . . . 3 25169.738975914108 3 2014 "20"
    1 0 . 0 1 1 0 0 0 5 . . . 3 25169.738975914108 3 2015 "20"
    1 0 . 0 1 1 1 0 0 5 . . . 3 25169.738975914108 3 2016 "20"
    1 0 . 0 1 1 0 0 0 4 . . . 3 25169.738975914108 3 2017 "20"
    1 1 . 1 0 0 0 0 0 5 . . . 1  22782.40487469756 4 2014 "20"
    0 1 . 1 0 0 0 0 0 5 . . . 1  22782.40487469756 4 2015 "20"
    1 1 . 1 0 1 0 0 0 5 . . . 2  22782.40487469756 4 2016 "20"
    1 1 . 1 0 1 0 0 0 4 . . . 2  22782.40487469756 4 2017 "20"
    1 0 . 1 0 1 0 0 0 5 . . . 1                  0 5 2014 "20"
    0 0 . 1 0 1 0 0 0 5 . . . 1                  0 5 2015 "20"
    1 0 . 1 0 1 0 0 0 5 . . . 1                  0 5 2016 "20"
    0 0 . . . . 0 0 0 5 . . . 1  19311.30721139168 6 2014 "20"
    end
    label values anyasset yesno
    label values hhchildren yesno
    label def yesno 0 "no", modify
    label def yesno 1 "yes", modify
    label values female female
    label def female 0 "male", modify
    label def female 1 "female", modify
    label values race_eth race_eth
    label def race_eth 1 "Non-Hispanic White", modify
    label values educ educ
    label def educ 0 "Less than highschool", modify
    label def educ 1 "Highschool or higher", modify
    label values married married
    label def married 0 "Unmarried", modify
    label def married 1 "Married", modify
    label values curr_emp curr_emp
    label def curr_emp 0 "Not working", modify
    label def curr_emp 1 "Working", modify
    label values dis dis
    label def dis 0 "no", modify
    label def dis 1 "yes", modify
    label values hpov pov
    label def pov 0 "Not in poverty", modify
    label values onsetage_cat onsetage_cat
    label values wrklimage_cat wrklimage_cat
    label values wrkprvage_cat wrkprvage_cat
    label values tage_cat tage_cat
    label def tage_cat 1 "Before age 26", modify
    label def tage_cat 2 "Ages 26-45", modify
    label def tage_cat 3 "After age 46", modify



  • #2
    This doesn't make sense to me. Because age group is not part of your regression model (and cannot be in a fixed-effects model), any difference in the average predicted values across the three age groups will reflect nothing other than differences in the distributions of the other model variables across the three age-group categories. It is in no sense capturing any effect of age group, and I don't see how you could use a graph like the one you propose in ways that would not be misleading.

    If your research goals require you to capture effects of age group on your outcome variable, then you have to escape from the confines of fixed-effects modeling. Use the Mundlak correlated random effects model instead. If you're not sure how to implement that yourself, you can use -xthybrid-, available from SSC.
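
    For readers who want to see the shape of the Mundlak approach, here is a minimal sketch, not a definitive implementation. It assumes the dataset from #1 is in memory, that tage_cat is the age-group variable of interest, and that married, curr_emp, dis, hpov, hhchildren, and rhnumper are the time-varying covariates (that split is an assumption). The m_* variables are hypothetical names introduced here. The sampling weights and the tehc_st clustering from #1 are left out because -xtreg, re- does not accept weights.

    ```stata
    * Assumes the panel from #1 is in memory
    xtset pid year

    * Person-level means of the time-varying covariates (the Mundlak terms).
    * Which covariates vary over time is an assumption; adjust the list.
    local tv married curr_emp dis hpov hhchildren rhnumper
    local means
    foreach v of local tv {
        bysort pid: egen double m_`v' = mean(`v')
        local means `means' m_`v'
    }

    * Random-effects model with the person means added, so age group can enter
    xtreg anyasset i.tage_cat female i.race_eth educ `tv' `means' i.year, ///
        re vce(cluster pid)

    * Adjusted predictions by age group, with confidence intervals
    margins tage_cat
    marginsplot
    ```

    Strictly, the person means should be computed over the estimation sample, so observations with missing model variables should be dropped before the -egen- step.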



    • #3
      Thanks very much for your comments, Clyde Schechter. I really appreciate them. My purpose is just to see whether there are different patterns (or levels) in the predicted values of having any assets (anyasset in the regression) among the three age groups, and to show that in a graph. It is for demonstration purposes only, not to capture any effect of age group. Would it still be misleading to plot average predicted values for the three age groups when the main regression estimates 'anyasset' separately in each of those groups with a fixed-effects model?

      As for the Mundlak correlated random effects model, it looks great and interesting, but unfortunately I'm more or less stuck with the fixed-effects model in my field. But thank you for the suggestion.



      • #4
        Let me put it this way. To what question is your proposed graph the answer? It does not answer the question "what is the modeled effect of age (group) on anyasset?" The only question I can see it answering is "given that people of different ages also differ in their distributions of [the right hand side variables in your model] and given that those variables have effects on anyasset, what is the indirect effect of age group on anyasset?" But does anybody really want to know the answer to that question? It is, in effect, half of a mediation model! The direct effects of age (group) on anyasset are not shown in that graph because they are not, and cannot be, estimated in that fixed-effects model.

        If you can think of a way to show that graph with a title or notes that make it abundantly clear what it is, and that it is not what it appears to be, then go for it. But to my mind, anybody looking at a graph showing age groups on one axis and values plus error bars of anyasset on the other is going to leap to the conclusion that you are showing modeled effects of age group on anyasset, and it's going to be very hard to dispel that.

        If the real question has to do with the effects of age (group) on anyasset, and if you really feel constrained to modeling using fixed effects, then I would say give up the modeling part. Just do a bar graph of anyasset vs age group directly from the data. That will be a mixture of direct and indirect effects and it will be an honest representation of the crude (unadjusted) effects of age group.
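
        A minimal sketch of that crude bar graph, assuming the dataset from #1 is in memory. Applying the finpnl4 weights, the axis title, and the graph names bar_* are choices made here for illustration, not something specified above.

        ```stata
        * Unadjusted (crude) mean of anyasset within each category of each
        * age-group variable, taken directly from the data with no model
        foreach var of varlist onsetage_cat wrklimage_cat wrkprvage_cat tage_cat {
            graph bar (mean) anyasset [pw=finpnl4], over(`var') ///
                ytitle("Proportion with any asset (unadjusted)") ///
                title("anyasset by `var'") ///
                name(bar_`var', replace)
        }
        ```

        Because anyasset is coded 0/1, each bar is the weighted proportion of observations with any asset in that category.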



        • #5
          Clyde Schechter, thank you very much for your comments. You're right that the graph would misleadingly suggest modeled effects of age group on anyasset, which is not my intention. I'll think of other ways to present my findings. I appreciate your helpful suggestions.
