Creating a "baseline" data group for comparison and linear graphs

Adrian Beck

Join Date: Jan 2024

Posts: 5
#1

09 Jan 2024, 13:58

Dear Statalists,

I'm trying to create a graph (eventually several graphs) that should look something like this:

The straight horizontal line would be my "baseline" group and the other line depicts how the other group changes over age in comparison.

My dataset contains 4 variables (sex age meanwork minworkage) and 70 observations. It records the average share of children working at each age, by sex, together with the legal minimum working age of the region they live in.
My goal is to set one of the regions as a "baseline" (the one where minworkage==14) and then analyse how the percentage of working children at each age deviates from it in the other regions, which have a different minimum working age (presumably the legal working age affects the number of children working).

My first attempt was to create the graph directly; however, I couldn't find any way to set one group as the "baseline".

The next idea was to first create a new variable that simply contains the difference between meanwork in region x and meanwork in the baseline region (minworkage==14), per sex and age:

replace diff_working_children = work-work[15] if age==10 & sex==1 /* observation [15] contains age==10 from minworkage==14 for sex==1 */

The command works, but I have to change and repeat it manually for every age and sex across the whole table. I couldn't find a way to make "work[15]" automatically pick the matching age from the baseline group (minworkage==14).
Is there a more efficient way to create the new variable in one line of code?

I added the data sample as an attachment (the variable "diff_working_children" already contains some nonsense data from my experiments).

Thank you in advance for your help!
Sincerely,
Adrian

Attached Files

240109_minworkage_preliminary table.dta (5.4 KB, 1 view)
Clyde Schechter

Join Date: Apr 2014

Posts: 29956
#2

09 Jan 2024, 14:20

Code:

by age sex, sort: egen base_work = max(cond(minworkage == 14, work, .))
gen diff_working_children = work - base_work
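For the graph described in #1, a rough sketch, assuming diff_working_children has been created as above and that one panel per sex and minimum working age is wanted (the panel layout is an assumption, since the example picture is not visible here). Because the difference is zero by construction in the baseline region, a horizontal line at 0 stands in for the baseline group:

```stata
* Sketch only: plot the deviation from the baseline (minworkage == 14) against age.
* yline(0) draws the baseline as a flat reference line in every panel.
* The by() layout is an assumption; adjust it to the panels actually wanted.
twoway line diff_working_children age if minworkage != 14, ///
    sort by(sex minworkage) yline(0)
```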

Note: This code assumes that for each age and sex combination there is only one observation having minworkage == 14. If there is more than one such, then that non-uniqueness precludes using the criterion minworkage == 14 to define the baseline.
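One possible way to verify that assumption before relying on the result is sketched below. The variable n_base is a hypothetical helper name, not something in the original data. The check is also slightly stricter than the note: it fails for an age and sex combination that has no baseline observation at all, where base_work would be missing.

```stata
* Sketch only: count the baseline observations within each age-sex cell.
* n_base is a hypothetical helper variable introduced for this check.
by age sex, sort: egen n_base = total(minworkage == 14)

* Stops with an error unless every age-sex cell has exactly one baseline row.
assert n_base == 1
drop n_base
```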

Please read the Forum FAQ for excellent advice about how to maximize your chances of getting timely and helpful responses here. Among the things you will learn there is that attachments are discouraged. Some Forum members simply will not risk downloading attachments from people they don't know. The best way to show example data, which you should nearly always do when asking for help with code, is by using the -dataex- command. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

Last edited by Clyde Schechter; 09 Jan 2024, 14:23.
Adrian Beck

Join Date: Jan 2024

Posts: 5
#3

09 Jan 2024, 14:29

Thank you so much for the quick and helpful answer! The code works absolutely perfectly; that would have taken me ages (if at all) to figure out.
And thank you for the advice on attachments. I will keep that in mind for future questions and use -dataex- accordingly.

Last edited by Adrian Beck; 09 Jan 2024, 14:40.