Need to fix scale so that data can be seen better in a graph

Alejandra Gutierrez

Join Date: Mar 2024

Posts: 4
#1


15 Mar 2024, 21:33

Hi everyone! I am trying to graph net international migration per year for 5 different cities. However, Vancouver's net international migration is WAY higher than the other cities', and I can't figure out how to scale my graph properly to show this. Someone recommended that I add another y-axis to show the VAN levels, but I am unsure 1) whether that's the best solution and 2) how to add another y-axis that would do that.

Thanks in advance!!

Attached Files

Append Net. Int Immigration.dta (6.4 KB, 1 view)
Paul Obermann

Join Date: Aug 2019

Posts: 7
#2

16 Mar 2024, 00:26

I am not sure what kind of graph you were picturing here. I agree with your intuition that a second y-axis is probably not the way to go. Overall, I think the problem may be what you are trying to graph rather than how to do it. If there is any way to convert your numbers to meaningful percentages, graphing will make a bit more sense. I came up with this; maybe it's a start for you:

Code:

levelsof ID, local(IDs)
foreach id of local IDs {
    graph hbar NetInternationalImmigration if ID == "`id'", ///
        over(Year) name(`id') ///
        title("Net International Immigration - `id'") ///
        nolabel nodraw ytitle("Net Immigration")
}
graph combine KLWN KMLPS PG VAN, title("Net International Immigration by ID")

Note that I simply looped through your individual IDs which I get from 'levelsof', but you could simply write 4 separate lines, too.
By drawing the graphs in separate windows and then combining, you are forcing the same widths on the display, but you would have to draw attention to the axis with the numbers. It's pretty apparent that they have very different scaling though.

I also couldn't figure out how to redefine the x-axis (which is the y-axis here because I am using 'hbar'). The graphs actually look nice when you draw them individually, but they get squished in the combine.

I know it's not much, but maybe a start.
Dirk Enzmann

Join Date: Apr 2014

Posts: 537
#3

16 Mar 2024, 03:50

Please note that (for obvious reasons, see FAQ #12.5) you should not post .dta files. So that others on the list can follow the exchange, please use dataex (FAQ #12.2) to show us your data.
1 like
Nick Cox

Join Date: Mar 2014

Posts: 35698
#4

16 Mar 2024, 06:31

I agree with Dirk Enzmann. We ask that you don't give us .dta attachments, and we explain why not in detail in the FAQ.

It's a fair speculation that net immigration can be negative as well as positive, and if so a logarithmic scale can't be the solution by itself, and solutions like cube roots or neglog

sign() * (abs())^(1/3)

sign() * log1p(abs())

are likely to be awkward too. From general geographical knowledge about British Columbia, as well as some Stata experience, I suspect you would be best off with

Code:

line immigration year, by(city, yrescale note(""))

where naturally you'll need to use the variable names you don't show us.
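For readers who have not met the two transformations named above, here is a minimal sketch of how they could be computed in Stata. The values and the variable name netmig are made up for illustration and are not from the attached dataset. I believe log1p() was added in Stata 16; on older versions ln(1 + abs(netmig)) is the equivalent.

```stata
* Sketch only: toy values standing in for net immigration (hypothetical data)
clear
input double netmig
-800
0
27
64000
end

* Signed cube root: keeps the sign and compresses large magnitudes
generate double cuberoot = sign(netmig) * abs(netmig)^(1/3)

* neglog: sign(x) * log(1 + |x|), defined for zero and negative values too
generate double neglog = sign(netmig) * log1p(abs(netmig))

list, clean
```

Both keep the sign and pull in the large values, but the axis is then in transformed units, which is part of why they tend to be awkward for this kind of graph.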
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#5

16 Mar 2024, 09:19

Originally posted by Alejandra Gutierrez

I am trying to graph the Net International Migration per Year for 5 different cities. However, Vancouver's Net Int Migration is WAY higher than the other cities and I'm facing the issue of not figuring out how to scale my graph properly to showcase this.

What is the principal point that you want your graph to make? Is it the contrast (or commonality) in the temporal pattern of international migration among these cities? Or is it the relative magnitude of Vancouver compared to the rest?

If the latter, then wouldn't a scalar, say, an interval average (in a bar plot), or just tabulating the values, do the job?
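A minimal sketch of that option, using made-up numbers rather than the attached dataset; the variable names year and netmig and all the values are hypothetical stand-ins.

```stata
* Sketch only: toy data, two cities and two years (hypothetical values)
clear
input str5 ID int year double netmig
"VAN" 2020 30000
"VAN" 2021 40000
"PG"  2020 100
"PG"  2021 300
end

* Tabulate the period summary per city
tabstat netmig, by(ID) statistics(mean min max)

* Or show the period averages as bars, one bar per city
graph bar (mean) netmig, over(ID) ytitle("Mean net international migration")
```

With only one number per city, the difference in magnitude is the whole message, and no rescaling of a time axis is needed.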

If the former, then you could scale the annual values by, say, the average over the period, and you can show the contemporaneous upticks and pandemic-related collapse and recovery, or any divergences from the pack by a city. I show that below using your attached dataset and the line colors you chose in your earlier thread.

Code:

version 18.0

clear *

use "Append Net. Int Immigration.dta"

quietly replace ID = strtrim(ID)

generate int year = real(substr(Year, -4, .))
label variable year "Year Ending"

sort ID year
by ID: egen double mean = mean(NetInternationalImmigration)
generate double flux = NetInternationalImmigration / mean
label variable flux "Annual Int'l Migration / Twenty-year Average"

local plot line flux year if ID ==
#delimit ;
graph twoway
    `plot' "VAN", lcolor(maroon) ||
    `plot' "KMLPS", lcolor(green) ||
    `plot' "KLWN", lcolor(pink) ||
    `plot' "PG", lcolor(yellow)
        yline(0, lcolor(black)) scheme(s2color)
        ylabel( , angle(horizontal) nogrid) xlabel(2003(3)2022, angle(45))
        legend(off);
#delimit cr

exit

If you want to show both the relative magnitude of Vancouver and the timecourse, then either something like Nick suggests above (separate line-plot graphs each with its own scale) or some manually or semi-manually created graph with a break in the y-axis. But with either of these, you'll need to accept that you're liable to weaken the ability of your audience to readily make comparisons of the features in the temporal profiles of the cities' international migration.
1 like
