  • Two-way graph for panel data

    Dear,

    I want to plot a variable in panel data.
    Below is an example of my data.

    Code:
    year province total district
    2003 A           62     1
    2003 B           65     2
    2003 C           70     3
    2004 A           45     1
    2004 B           50     2 
    2004 C           60     3
    2005 A           43     1
    2005 B           64     2
    2005 C           75     3
    I'd like to plot total against year as a line graph, with one line per province, all in a single graph and each in a different color.

    When I use the command
    Code:
    tsline apt year, by(district)
    it generates three separate graphs, one per district.

    Would you help me with this topic?

    Best,

  • #2
    Here is one technique. I used code that should be fairly easy to extend to your real problem, which is likely to have more than 3 districts and more than 3 years.


    Code:
    clear 
    input year str1 province total district
    2003 A           62     1
    2003 B           65     2
    2003 C           70     3
    2004 A           45     1
    2004 B           50     2 
    2004 C           60     3
    2005 A           43     1
    2005 B           64     2
    2005 C           75     3
    end 
    
    * you need a colour for each district 
    tokenize "blue orange black"
    
    * 3 is the number of districts; change if needed 
    forval d = 1 / 3 { 
        * 2005 is the last year; change if needed 
        local call `call' line total year if district == `d', lc(``d'') || scatter total year if district == `d' & year == 2005, ms(none) mla(district) mlabc(``d'') mlabsize(large) || 
    }
    
    set scheme s1color 
    
    `call' , yla(, ang(h)) legend(off) xla(2003/2005) xsc(r(. 2005.2))
    [Graph: manyline.png]
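
    The nested quotes in the loop can look odd at first sight. As a minimal sketch (not part of the code above): tokenize puts the colour names into the local macros 1, 2 and 3, so inside the loop ``d'' is evaluated from the inside out, first `d' (say 2) and then `2' (here orange). The display command is used only to show the result; it should print one line per district, pairing each district number with its colour.

    ```stata
    * put the colour names into local macros 1, 2, 3
    tokenize "blue orange black"

    forval d = 1/3 {
        * ``d'' becomes `1', `2', `3' and then the colour name
        display "district `d' is drawn in ``d''"
    }
    ```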





    • #3
      Dear,

      I used the code below.

      Code:
      tokenize "blue orange black yellow green sand cyan pink lime navy emerald maroon olive red purple eltblue elt green dimgray gold gray orange_red ebg teal bluishgray"
      
      
      forval d = 1 / 25{
          local call `call' line apt year if district == `d', lc(``d'')|| scatter apt year if district == `d' & year == 2020, ms(none) mla(district) mlabc(``d'') mlabsize(large) ||
      }
      
      set scheme s1color
      `call' , yla(, ang(h)) legend(off) xla(2003/2020) xsc(r(. 2020.2))
      It says

      Code:
      ' is not a valid command name
      I modified the sample code because my data has 25 districts, observed from 2003 to 2020.
      But it does not work.



      • #4
        Which command triggers the error message?



        • #5
          Can you reproduce #2?



          • #6
            Yes, I could reproduce #2.
            But I could not produce a graph when I used my dataset, which contains 25 regions for each year from 2003 to 2020.



            • #7
              The quotation marks should be

              Code:
              `  ' 
              or

              Code:
              `` '' 


              and the implication is that you have ' somewhere that ` should be. I can't see anything wrong with what is quoted in #3. In terms of ASCII characters

              Code:
              . mata : ascii("'")
                39
              
              . mata : ascii("`")
                96
              the punctuation should be

              96 stuff 39

              OR

              96 96 stuff 39 39






              • #8
                I used the correct punctuation, but it still does not work.

                The following is the error.

                Code:
                 `call' , yla(, ang(h)) legend(off) xla(2003/2020) xsc(r(. 2020.2))
                ' is not a valid command name



                • #9
                  Code:
                  tokenize "blue orange black yellow green sand cyan pink lime navy emerald maroon olive red purple eltblue elt green dimgray gold gray orange_red ebg teal bluishgray"
                  
                  
                  forval d = 1 / 25 {
                      local call `call' line apt year if district == `d', lc(``d'')|| scatter apt year if district == `d' & year == 2020, ms(none) mla(district) mlabc(``d'') mlabsize(large) ||
                  }
                  
                  set scheme s1color
                  
                  `call' , yla(, ang(h)) legend(off) xla(2003/2020) xsc(r(. 2020.2))
                  This is the code I used.



                  • #10
                    I finally plotted the data by using STATA SE. It did not work in STATA MP, but it worked in STATA SE.

                    Thanks!



                    • #11
                      I am pleased you got the graph you wanted, but I see no reason whatsoever why using SE rather than MP would make a difference. On the contrary, I used MP for #2.

                      The explanation will be different. You had a mistake somewhere in the code you used for MP and not in the code for SE.

                      See also https://www.statalist.org/forums/help#spelling
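
                      One check that may help narrow such a problem down, sketched here with the variable names from #2: display the contents of the local macro just before executing it. If nothing is shown, the macro was not defined where it is being used. Local macros exist only within the do-file, program or interactive session that defines them, so code of this kind needs to be run as one block rather than by selecting and running lines one at a time in the Do-file Editor. This is only a possibility worth ruling out, not a diagnosis of what happened above.

                      ```stata
                      * start from an empty macro so that earlier runs do not accumulate
                      local call

                      forval d = 1/3 {
                          local call `call' line total year if district == `d' ||
                      }

                      * show exactly what would be executed
                      display `"`call'"'
                      ```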



                      • #12
                        Dear Nick and colleagues, I have the same issue as Kim (although 10y apart). The difference is that my 'district' variables are string. So, how could the above code be adjusted for string categories?



                        • #13
                          Nick Cox, how about xtset and xtline, without any loops?


                          Code:
                          clear
                          input float year str1 province float(total district)
                          2003 "A" 62 1
                          2004 "A" 45 1
                          2005 "A" 43 1
                          2003 "B" 65 2
                          2004 "B" 50 2
                          2005 "B" 64 2
                          2003 "C" 70 3
                          2004 "C" 60 3
                          2005 "C" 75 3
                          end
                          
                          xtset district year
                          xtline total, overlay leg(off) xlab(2003/2005)
                          [Graph: Graph.png]



                          • #14
                            Originally posted by Mina Wu
                            Dear Nick and colleagues, I have the same issue as Kim (although 10y apart). The difference is that my 'district' variables are string. So, how could the above code be adjusted for string categories?
                            Here is my suggestion.

                            Code:
                            clear
                            input float year str1 province float total str1 district
                            2003 "A" 62 "1"
                            2004 "A" 45 "1"
                            2005 "A" 43 "1"
                            2003 "B" 65 "2"
                            2004 "B" 50 "2"
                            2005 "B" 64 "2"
                            2003 "C" 70 "3"
                            2004 "C" 60 "3"
                            2005 "C" 75 "3"
                            end
                            
                            
                            egen id=group(district) , label
                            
                            xtset id year
                            
                            xtline total, overlay leg(off)
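
                            If direct labels as in #2 are wanted, the loop can also be adapted to a string identifier. The following is a sketch under assumptions: it reuses the example data above, loops over the distinct values found by levelsof, and keeps a separate counter i (a name introduced here) to pick the colour for each district. String values must be compared inside double quotes.

                            ```stata
                            clear
                            input float year str1 province float total str1 district
                            2003 "A" 62 "1"
                            2004 "A" 45 "1"
                            2005 "A" 43 "1"
                            2003 "B" 65 "2"
                            2004 "B" 50 "2"
                            2005 "B" 64 "2"
                            2003 "C" 70 "3"
                            2004 "C" 60 "3"
                            2005 "C" 75 "3"
                            end

                            sort district year

                            * distinct values of the string identifier
                            levelsof district, local(levels)

                            * you need a colour for each district
                            tokenize "blue orange black"

                            * start with an empty macro and a counter for the colours
                            local call
                            local i = 0

                            foreach d of local levels {
                                local ++i
                                * 2005 is the last year; change if needed
                                local call `call' line total year if district == "`d'", lc(``i'') || scatter total year if district == "`d'" & year == 2005, ms(none) mla(district) mlabc(``i'') mlabsize(large) ||
                            }

                            twoway `call', yla(, ang(h)) legend(off) xla(2003/2005) xsc(r(. 2005.2))
                            ```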



                            • #15
                              It can on occasion be true that you just want to see a collective pattern and don't need or wish to see identifiers. That could be true if you had, say, series for individual people and either the identifiers have no meaning or they should not be shown in public anyway.

                              With, say, districts it seems to me that a researcher, or their readers, should care about which line is for which district.

                              Backing up to #2: not using xtline and using a loop was motivated by the example and seeing that direct labelling would work well. There is no inherent virtue in using a loop, or in avoiding one if it is needed.

                              Experience or a little thinking shows that if the number of districts were, say, 10, direct labelling might still work well and so be superior to a legend. If it were, say, 30, direct labelling would not work well unless you split the series into groups, but a legend would not work well either. With, say, 100, the only devices likely to work are one, two or even all three of

                              showing all the panels without a legend

                              highlighting a few of particular interest or importance

                              splitting the series into groups.
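
                              As a sketch of the second device, highlighting: every series is drawn in a muted grey and one series of interest is drawn on top in colour. It assumes the example data of #2 with numeric district; picking district 2 is arbitrary and only for illustration. The option c(L) joins points only while year is increasing, so a single line command can draw many panels once the data are sorted by district and year.

                              ```stata
                              clear
                              input year str1 province total district
                              2003 A 62 1
                              2003 B 65 2
                              2003 C 70 3
                              2004 A 45 1
                              2004 B 50 2
                              2004 C 60 3
                              2005 A 43 1
                              2005 B 64 2
                              2005 C 75 3
                              end

                              * c(L) relies on this sort order
                              sort district year

                              * backdrop: all other districts in grey, one line each
                              * foreground: district 2 (an arbitrary choice) in colour
                              twoway line total year if district != 2, c(L) lc(gs12) || line total year if district == 2, lc(blue) lw(thick) || , yla(, ang(h)) legend(off) xla(2003/2005)
                              ```

                              With many districts the same pattern applies unchanged; only the condition picking out the highlighted series needs editing.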
