
  • Graph not plotting all the data required

    Hi, can anyone help me figure out why my graph is not showing all the data I need? When I check the data table, it has all the numbers and everything matches the code. However, the graph shows lines for only 2 of the cities, not all 5.

    I am trying to graph population over time for the IDs "VAN", "BC", "PG", "KMLPS" and "KLWN".

    This is my code:
    twoway (line Population Year if ID=="VAN", color (maroon)) ///
    (line Population Year if ID== "BC", color (blue)) ///
    (line Population Year if ID== "KMLPS", color (green)) ///
    (line Population Year if ID== "KLWN", color (pink)) ///
    (line Population Year if ID== "PG", color (yellow)) ///

    Please help me; it's for a class, and I have tried many variations of the code and nothing seems to work.
    Thanks!


  • #2
    Some possibilities:

    1. You have missing values for population, year or both.

    2. Your city names are different in the data. For example, any leading or trailing spaces added to the names or any use of lower case would cause Stata to decide it has nothing to plot.

    3. Your example is clearly for British Columbia, Vancouver, Kamloops and so forth. The lines for the smaller places are going to be hard to tell apart unless you use a logarithmic scale.
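
    The first two possibilities can be checked directly before redrawing the graph. The following is a minimal diagnostic sketch, not a definitive procedure. It assumes the variable names ID, Population and Year from the question, that ID is a string variable, and a Stata version that has strtrim() and strupper() (Stata 13 or later; older versions use trim() and upper()). Run it from a do-file so that the // comments are accepted.

    ```stata
    * Diagnostic sketch: variable names are taken from the question above
    describe ID Population Year          // check that Population and Year are numeric
    tabulate ID, missing                 // every distinct spelling of ID, with counts
    count if missing(Population, Year)   // observations missing Population or Year
    list ID if ID != strtrim(ID)         // IDs with leading or trailing spaces
    list ID if ID != strupper(ID)        // IDs containing lower-case letters
    ```

    If tabulate shows more than five distinct values, or either list command returns observations, an if condition such as ID=="VAN" will not match those observations.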



    • #3
      Hi Nick,

      Thanks for answering! I have already checked your points 1 and 2: all the values are accounted for and the names all match.
      I think the issue might be that, because I'm looking at population, the values for Vancouver and BC are so much larger than those of the other cities that the graph doesn't show the other lines (not sure). However, when I graph only VAN, it just shows me an empty graph, even though there are values in every year, so I'm not sure what's going on.

      Someone suggested I use the bysort command to calculate the group averages per year. Any advice on how to use that command?
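
      For reference, a minimal sketch of what that suggestion might look like, assuming the variable names from the first post. The name mean_pop is hypothetical, chosen only for illustration.

      ```stata
      * Sketch only: mean_pop is a hypothetical new variable name
      * Mean of Population within each ID-Year group
      bysort ID Year: egen mean_pop = mean(Population)
      ```

      If the data already hold one observation per city per year, mean_pop simply equals Population, so this step would not change what the graph shows.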



      • #4
        Originally posted by Alejandra Gutierrez
        . . . not sure what's going on.
        You might be better off (that is, get to a resolution faster and without so much back-and-forth) by attaching your dataset, or at least that subset containing the Population, Year and ID variables. If the number of observations is fewer than a hundred, you could use dataex.

        Otherwise, you could try something like the following.
        Code:
        generate str id = strlower(strtrim(ID))
        foreach id in van bc pg kmlps klwn {
            summarize Year if id == "`id'" & !missing(Population), meanonly
            display in smcl as text "`id': " as result r(N) // inspect in order to make sure none are zero
            display in smcl as text "`id' range of years: " as result r(min), r(max) // and no unexpected discrepancies in census period
        }
        
        // If okay, then
        local stuff Population Year if id ==
        #delimit ;
        sort id Year;
        graph twoway
            line `stuff' "van", lcolor(maroon) ||
            line `stuff' "bc", lcolor(blue) ||
            line `stuff' "kmlps", lcolor(green) ||
            line `stuff' "klwn", lcolor(pink) ||
            line `stuff' "pg", lcolor(yellow) yscale(log);
        #delimit cr



        • #5
          Joseph Coveney gives excellent advice. Although you’re denying it, I still suspect some error in specifying observations.



          • #6
            What was the resolution here? You started another thread without summarizing the outcome here.
