  • Building a Scatterplot from Dummy Variables

    I am attempting to build a scatterplot using dummy variables. Of course, simply plotting the dummies as a scatterplot is not helpful. So, what I would like to do is plot the percentage of observations with a 1.

    I have panel data with unique household identifiers, the state each household lives in, the year of the survey, and whether the household takes up food stamps in that year (this is the dummy: 1 = yes, 0 = no). I'd like to plot the percentage of SNAP take-up by state and year (e.g., Idaho in 2003 is X% for one point, MD in 2003 is Y% for another, etc.). How should I go about doing this? I'm not sure how to calculate the percentage of the dummy from the panel data.

  • #2
    Maybe a dot chart or some other graph is a better option as one of your axes is categorical. If you give a data example using dataex, it should be possible to provide code.


    • #3
      The mean of an indicator is the proportion you want. Then percents are just a display issue. graph dot and graph bar and graph hbar all default to showing means.

      Here is a minimal example.

      Code:
      sysuse auto, clear
      
      graph dot foreign, over(rep78, descending) ///
          yla(0 "0" .25 "25" .5 "50" .75 "75" 1 "100") ///
          ytitle(% of foreign cars) ysc(alt r(-0.03 1.01)) ///
          l1title(Repair record 1978)
      [Image: dotchart.png]


      Extending that successfully to 50 US states plus DC (and Puerto Rico, etc.?), and possibly several years, is however a severe challenge. As Andrew Musau signals, we need more detail to give better answers.


      • #4
        Andrew Musau and Nick Cox Thank you very much. Please forgive my lack of details; I'm still fairly new to all this and part of the issue I have is I'm not always sure how to ask the question I have.

        What I want to do is create a scatterplot where the Y-axis is the percent of folks who take up food stamps (i.e., the % of 1's in the sample for each state) and the X-axis is the year (the data I have are for 2003, 2006, and 2010). So, there should be 51 (50 states + DC) points in each year bucket. You can find a rough drawing here.

        What I am trying to show is that a change in the legislation in several states made it easier for folks to take up food stamps, and thus we see an increase in their take up. We see this in the data, but I am unsure how to show it graphically.


        • #5

          There are two steps, then: calculate the mean (using egen, say) and then plot the means for each state against year.
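
          A minimal sketch of those two steps on a made-up dataset. The variable names snap, state, year and pct_snap are assumptions for illustration, not the poster's actual names. The mean of the 0/1 indicator times 100 is the percent, and egen's tag() marks one observation per state-year so that each point is drawn only once.

          ```stata
          * toy data: 3 hypothetical states x 3 survey years, 100 households each
          clear
          set obs 900
          set seed 12345
          gen state = mod(_n, 3) + 1
          gen year  = real(word("2003 2006 2010", mod(ceil(_n/3), 3) + 1))
          gen snap  = runiform() < 0.2        // 1 = takes up food stamps, 0 = not

          * step 1: percent of 1s within each state-year cell
          bysort state year: egen pct_snap = mean(100 * snap)

          * step 2: plot one point per state-year cell against year
          egen tag = tag(state year)
          scatter pct_snap year if tag, xlabel(2003 2006 2010) ///
              ytitle("% taking up SNAP")
          ```

          With real data, only the two egen lines and the scatter call would be needed, with your own variable names substituted.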


          • #6
            It will be a tall order to graph the 50 states + DC, as Nick noted in #3. If you can aggregate at a higher level than state, e.g., census region, so much the better. Here is an illustration of the challenge. Even reducing the size of the marker labels does not help much.

            Code:
            sysuse census, clear
            expand 3
            bys state: gen year=_n
            set seed 12032020
            gen foodstamps_pc=runiformint(0, 38)
            scatter foodstamps_pc year, mlabel(state) mlabsize(vsmall) msize(vsmall) ///
            xlab(1.15 "2003" 2.15 "2006" 3.15 "2010", noticks) xsc(r(0.75 3.5)) ytitle("") ///
            xtitle("") scheme(s1color) title("Percentage of population on food stamps")
            [Image: Graph.png]



            Aggregating at the census region level results in a more readable graph.

            Code:
            bys region year: egen foodstamps_rpc= mean(foodstamps_pc)
            scatter foodstamps_rpc year, mlabel(region) ///
            xlab(1.15 "2003" 2.15 "2006" 3.15 "2010", noticks) xsc(r(0.75 3.5)) ///
            ytitle("") xtitle("") scheme(s1color) title("Percentage of population on food stamps")
            [Image: Graph2.png]


            • #7
              Borrowing Andrew's example, here is another alternative. There is a discussion of this type of graph here: http://www.maartenbuis.nl/workshops/...l#slide51.smcl

              Code:
              sysuse census, clear
              set scheme s1color
              expand 3
              bys state: gen year=_n
              set seed 12032020
              gen foodstamps_pc=runiformint(0, 38)
              
              // order the states by average percentage on food stamps -----------------------
              bys state (year): egen av_foodstamps = mean(foodstamps_pc)
              //sort from highest to lowest
              replace av_foodstamps = -av_foodstamps
              
              // requires -egenmore- from SSC
              egen State = axis(av_foodstamps state), label(state)
              
              // create background graph -----------------------------------------------------
              tempfile temp
              save `temp'
              
              keep State year foodstamps_pc
              reshape wide foodstamps_pc, j(State) i(year)
              merge 1:m year using `temp'
              
              forvalues i = 1/50 {
                  local backgr `backgr' line foodstamps_pc`i' year, ///
                      lcolor(gs12) lpattern(solid) ||
              }
              
              // bring it together -----------------------------------------------------------
              sort State year
              twoway `backgr' line foodstamps_pc year,      ///
                   by(State , compact note("") legend(off)) ///
                   lpattern(solid) lwidth(*2)
              [Image: Graph.png]
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------


              • #8
                Some other ideas:

                Use two-letter codes such as OH WY MI

                Stripplots etc.

                Jittering
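
                A rough sketch combining the first and third ideas, reusing Andrew's simulated data from #6 (so year runs 1 to 3 and foodstamps_pc is random, not real). It assumes the state2 variable in census.dta holds the two-letter abbreviations; the jitter amount of 8 is an arbitrary choice to tune by eye.

                ```stata
                sysuse census, clear
                expand 3
                bysort state: gen year = _n
                set seed 12032020
                gen foodstamps_pc = runiformint(0, 38)

                * show the two-letter code in place of the marker, with jittered positions
                scatter foodstamps_pc year, msymbol(none) mlabel(state2) ///
                    mlabposition(0) mlabsize(vsmall) jitter(8)           ///
                    xlabel(1 "2003" 2 "2006" 3 "2010") xtitle("")        ///
                    ytitle("% on food stamps (simulated)")
                ```

                Because jitter() adds random noise, the plotted positions are approximate; that is the price of reducing overlap.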


                • #9
                  Hi all,

                  Thanks again for all your help and suggestions. I really appreciate it. However, after consulting with my dissertation advisor, I'm going in a different direction, so all this (while helpful) is getting put on the back burner.

                  Again, I appreciate everything y'all have done and I feel bad that things are getting delayed. But this was important and helpful for answering my question.
