
  • Creating a "baseline" data group for comparison and linear graphs

    Dear Statalists,

    I'm trying to create a graph (eventually several graphs) that should look something like this:
    [Attached image: Example.JPG]

    The straight horizontal line would be my "baseline" group and the other line depicts how the other group changes over age in comparison.

    My dataset contains 4 variables (sex age meanwork minworkage) and 70 observations. It records the average share of children working at each age, by sex, together with the legal minimum working age of the region they live in.
    My goal is to set one of the regions as a "baseline" (the one where minworkage==14) and then analyse how the percentage of working children at each age in the other regions, which have a different minimum working age, deviates from that baseline (presumably the legal working age affects the number of children working).

    My first attempt was to create the graph directly, but I couldn't find any way to set one group as the "baseline".

    The next idea was to first create a new variable that simply contains the difference between the meanwork in region x and the meanwork in the baseline region (minworkage==14), per sex and age:

    replace diff_working_children = work - work[15] if age==10 & sex==1 /* observation [15] contains age==10 from minworkage==14 for sex==1 */

    The command works, but I have to manually change and repeat it for every age and gender for the whole table. I couldn’t find a way for the “work[15]” to automatically choose the correct comparison age from group (minworkage==14).
    Is there a more efficient way to create the new variable in one line of code?

    I added the data sample as an attachment (the variable "diff_working_children" already contains some nonsense data from my experiments).

    Thank you in advance for your help!
    Sincerely,
    Adrian

  • #2
    Code:
    by age sex, sort: egen base_work = max(cond(minworkage == 14, work, .))
    gen diff_working_children = work - base_work
    Note: This code assumes that for each age and sex combination there is only one observation having minworkage == 14. If there is more than one such, then that non-uniqueness precludes using the criterion minworkage == 14 to define the baseline.
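The uniqueness assumption in the note can be checked before computing the difference. The following is only a sketch on made-up toy values: the variable names follow the code above, while `n_base` is a hypothetical helper variable introduced here for illustration.

```stata
* Toy data with made-up values; variable names follow the code in this reply
clear
input sex age minworkage work
1 10 14 0.10
1 10 15 0.08
1 10 16 0.05
1 11 14 0.20
1 11 15 0.15
1 11 16 0.12
end

* n_base (hypothetical helper): number of baseline rows per age-sex cell
by age sex, sort: egen n_base = total(minworkage == 14)

* Stop with an error unless every cell has exactly one baseline row
assert n_base == 1

* Baseline value spread to every row of the cell, then the difference
by age sex, sort: egen base_work = max(cond(minworkage == 14, work, .))
gen diff_working_children = work - base_work
```

If the -assert- fails, some age and sex combination has either no baseline observation or more than one, and the baseline would have to be defined some other way before the difference is meaningful.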

    Please read the Forum FAQ for excellent advice about how to maximize your chances of getting timely and helpful responses here. Among the things you will learn there is that attachments are discouraged. Some Forum members simply will not risk downloading attachments from people they don't know. The best way to show example data, which you should nearly always do when asking for help with code, is by using the -dataex- command. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
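As a rough illustration of the -dataex- advice, a call might look like the sketch below. The toy values are made up and the variable names are assumed from the code in this reply; with real data already in memory, only the last line would be needed.

```stata
* Made-up toy values standing in for the real dataset
clear
input sex age minworkage work
1 10 14 0.10
1 10 15 0.08
1 11 14 0.20
end

* Print the listed variables for observations 1 to 3 as a block
* that can be pasted straight into a forum post
dataex sex age minworkage work in 1/3
```

The output can then be copied from the Results window into the post between code delimiters.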
    Last edited by Clyde Schechter; 09 Jan 2024, 14:23.



    • #3
      Thank you so much for the quick and helpful answer! The code works perfectly; it would have taken me ages to figure out (if I had managed at all).
      And thank you for the advice on attachments. I will keep that in mind for future questions and use -dataex- accordingly.
      Last edited by Adrian Beck; 09 Jan 2024, 14:40.
