
  • Bubble plot experimenting

    Hello,

    I am experimenting with bubble plots in a panel dataset.

    How can I merge all the years into one scatterplot as an average?

    Is it possible to color some countries based on a binary variable in the data?


    Current command:

    Code:
    twoway (scatter Innovation KnowledgeCapitalFem if year==2019 & KnowledgeCapitalFem>1, sort msize(20-pt) mlabel(country) mlabposition(0) mfcolor(blue%30) mlcolor(black) mlwidth(medium) mlalign(inside))
    [Attached image: Bubbleplot.png]





  • #2
    Hey Nae,

    The solution is to overlay multiple scatterplots, one per group, in a single twoway call. That would look something like this:

    Code:
    twoway (scatter Innovation KnowledgeCapitalFem if year==2019 & KnowledgeCapitalFem>1 & group_variable == 1, ...) (scatter Innovation KnowledgeCapitalFem if year==2019 & KnowledgeCapitalFem>1 & group_variable == 2, ...)
    Best,
    Sebastian



    • #3
      Hi Sebastian,

      Thanks for your response. Can you please elaborate? I am not sure I follow what you mean by "group_variable == 1, ..." in each scatter plot.

      My variables are innovation on the Y axis and Knowledge Capital on the X axis for years 2019-2022. Is it possible to average the four years into a scatterplot?



      • #4
        Hey Nae,

        I am sorry for the unclear response. My answer referred only to your second question, "Is it possible to color some countries based on a binary variable in the data?" You can do so by generating a group_variable for the observations that should be colored differently. If the circle for Austria should be red and the circle for Germany blue, you would code something like this (minimal example):

        Code:
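        * Assign each country to a color group (all other countries stay missing)
        gen group_variable = 1 if country=="Austria"
        replace group_variable = 2 if country=="Germany"
        * Overlay one scatter per group, each with its own fill color
        twoway (scatter Innovation KnowledgeCapitalFem if group_variable==1, mfcolor(red)) (scatter Innovation KnowledgeCapitalFem if group_variable==2, mfcolor(blue))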
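        If the dataset already contains a binary indicator, the same overlay idea can be applied to it directly, without generating group_variable by hand. The following is only a sketch: the variable name highlight and the toy values are hypothetical stand-ins for the real data, and the marker options are borrowed from the original command.

        ```stata
        * Toy data standing in for the real panel (all values are made up)
        clear
        input str10 country year Innovation KnowledgeCapitalFem highlight
        "Austria" 2019 48 2.1 1
        "Germany" 2019 55 2.6 0
        "France"  2019 51 1.8 0
        "Italy"   2019 44 1.5 1
        end

        * One scatter per value of the (hypothetical) binary variable highlight;
        * only the fill color differs between the two layers
        twoway (scatter Innovation KnowledgeCapitalFem if year==2019 & highlight==0, ///
                   msize(vlarge) mlabel(country) mlabposition(0) mfcolor(blue%30) mlcolor(black)) ///
               (scatter Innovation KnowledgeCapitalFem if year==2019 & highlight==1, ///
                   msize(vlarge) mlabel(country) mlabposition(0) mfcolor(red%30) mlcolor(black)), ///
               legend(order(1 "highlight = 0" 2 "highlight = 1"))
        ```

        The /// line continuations only work in a do-file, so run this from the Do-file Editor rather than pasting it into the Command window.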
        Regarding your first question: You can average the variables in your dataset for each country first and then scatter them.

        Code:
        bysort country: egen mean_KnowledgeCapitalFem = mean(KnowledgeCapitalFem)
        bysort country: egen mean_Innovation = mean(Innovation)
        preserve
        bysort country: keep if _n==1
        twoway (scatter mean_Innovation mean_KnowledgeCapitalFem ) (...)
        restore
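        An alternative to egen plus keep, if it suits your workflow, is collapse, which computes the country means and reduces the data to one row per country in a single step. This is a sketch on made-up toy data, not your actual dataset; the variable names follow the thread, and the averaged variables keep their original names after collapse.

        ```stata
        * Toy panel: two countries, two years each (all values are made up)
        clear
        input str10 country year Innovation KnowledgeCapitalFem
        "Austria" 2019 48 2.1
        "Austria" 2020 50 2.3
        "Germany" 2019 55 2.6
        "Germany" 2020 57 2.8
        end

        * collapse replaces the data in memory, so wrap it in preserve/restore
        preserve
        * One row per country holding the mean of each variable over all years
        collapse (mean) Innovation KnowledgeCapitalFem, by(country)
        twoway (scatter Innovation KnowledgeCapitalFem, ///
                   msize(vlarge) mlabel(country) mlabposition(0) mfcolor(blue%30) mlcolor(black))
        restore
        ```

        Any if condition such as KnowledgeCapitalFem>1 then applies to the country averages rather than to a single year, which may or may not be what you want.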
        Best,
        Sebastian



        • #5
          Both of them worked! Many thanks.

