  • statplot

    Code:
    statplot Gender-Asa-ethnicity-social_depriv-provider-robotic, bar(1,bfcolor(green*0.4)) over(procedure_type) recast(bar) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(Percent)
    Stata reports back with 'invalid name'.

    I have tried putting double quotes on either side of Percent.

    What am I doing wrong?

  • #2
    I just removed the hyphens and it worked! Otherwise Stata treated the hyphenated list as one whole variable name.
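
    A minimal sketch of the varlist point, using three made-up variables (the names and values are illustrative, not the poster's data). In a Stata varlist, a hyphen between two names means "every variable from the first to the last in dataset order", so a list of separate variables is written with spaces.

    ```stata
    clear
    input byte(gender asa provider)
    1 1 0
    2 2 1
    1 3 1
    end

    * space-separated: three individually named variables
    summarize gender asa provider

    * one hyphen: a range in dataset order, here gender, asa and provider
    summarize gender-provider
    ```

    The two summarize calls cover the same variables only because of the order in which the variables were created here.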


    • #3
      Questions re statplot, for Nick Cox

      Code:
      statplot gender asa ethnicity provider, bar(1, bfcolor(green*0.4)) over(procedure) recast(bar) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(Percent)

      1. I manually changed the labels '1 male or 2 female' to angled and vsmall. How can I add this to the code?

      2. Y axis: I have shown this as percent, but I would actually like to show the number of observations.
      How do I do this in the code? Do I just delete the section yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)), because otherwise it doesn't work?

      3. How do I make THR and TKR two different colours? Right now they're both green.

      4. With regard to ethnicity, as you can see I have categories from 1 to 8. I assume Stata is just totalling the number of observations in ethnicity?
      I am planning to show the proportions before propensity score matching, and then show how the THR and TKR populations become similar after matching.
      However, as ethnicity has more than two categories (1-8), I don't understand how Stata is interpreting and plotting it. Can you please confirm?

      Please note:
      Rather than the labels showing up as '1 male or 2 female', I will change the variable labels to something
      more readable, e.g. 'Gender'.


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(procedure gender asa Anesthesia ethnicity provider)
      1 1 1 0 1 0
      1 1 2 1 2 1
      1 2 2 1 3 1
      2 2 3 1 4 0
      2 1 2 0 5 1
      2 2 1 1 6 1
      2 1 1 0 8 0
      2 1 3 1 8 1
      end
      label values procedure type
      label def type 1 "THR", modify
      label def type 2 "TKR", modify
      label values gender sex
      label def sex 1 "Male", modify
      label def sex 2 "female", modify
      label values asa ana
      label def ana 1 "general", modify
      label def ana 2 "regional", modify
      label values ethnicity ethn
      label def ethn 1 "white", modify
      label def ethn 2 "mixed", modify
      label def ethn 3 "black", modify
      label def ethn 4 "african", modify
      label def ethn 5 "Indian", modify
      label def ethn 6 "black other", modify
      label def ethn 8 "pakistani", modify
      label values provider prov
      label def prov 0 "nhs", modify
      label def prov 1 "private", modify
      [Attached image: Capture.PNG]


      • #4
        statplot is from SSC, as you are asked to explain. (People are asked to say where community-contributed commands come from: FAQ Advice #12.)

        Your existing plot is showing the means of various categorical variables, as documented.

        statistic() specifies the summary statistic used to summarize and plot varlist. The default is mean. See collapse for a full list of accepted statistics.
        Note that only one statistic may be specified.
        but the mean of a categorical variable is in general only interesting or useful if that categorical variable is binary and coded (0, 1). Mean ethnicity is especially useless, as it depends arbitrarily on the coding.

        You don't want means, but if you want frequencies for all of

        THR or TKR by gender: a 2 x 2 table

        ditto by asa but a 2 x 3 table

        ditto by ethnicity at least a 2 x 7 table

        ditto by provider: a 2 x 2 table

        all on one graph, I think that is beyond statplot.
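
        A small sketch of the distinction between means and frequencies, assuming the example data from #3 (only four of its variables are re-entered here). tabstat and tabulate are official commands; nothing in this block comes from statplot itself.

        ```stata
        clear
        input float(procedure gender ethnicity provider)
        1 1 1 0
        1 1 2 1
        1 2 3 1
        2 2 4 0
        2 1 5 1
        2 2 6 1
        2 1 8 0
        2 1 8 1
        end

        * provider is coded 0/1, so its mean is the proportion with provider == 1
        tabstat provider, by(procedure) statistics(mean n)

        * ethnicity is coded 1 to 8, so its mean changes if the codes are relabelled
        tabstat ethnicity, by(procedure) statistics(mean n)

        * the frequencies themselves are cross-tabulations
        tabulate procedure gender
        tabulate procedure ethnicity
        ```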


        • #5
          I found this solution

          Code:
          statplot gender asa ethnicity provider, statistic(count) over(procedure) recast("bar")
          This addresses Q2, but I still have to change the labels manually. It would be great if someone could point out how to set the labels to vsmall within the code.

          Following propensity score matching, some of the observations will be eliminated, and thus the bins within each variable should approach each other.
          This shouldn't be an issue with the binary variables, which have only two values, 0 or 1.

          But for ethnicity, which has more than two categories, how can I make sure that only the observations used in matching are plotted?
          Perhaps I need to use fweights?
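
          One possible way to restrict a table or graph to the matched sample is an if qualifier on an indicator variable. The sketch below assumes a hypothetical 0/1 variable named matched; its name and its values here are made up for illustration, and what a matching command actually leaves behind (an indicator, a weight, or nothing) depends on the command used.

          ```stata
          clear
          input float(procedure ethnicity)
          1 1
          1 2
          1 3
          2 4
          2 5
          2 6
          2 8
          2 8
          end

          * hypothetical indicator of being retained after matching (arbitrary values)
          generate byte matched = inlist(_n, 1, 2, 4, 5)

          * frequencies before matching, then for the matched observations only
          tabulate procedure ethnicity
          tabulate procedure ethnicity if matched == 1

          * the same restriction on a bar chart of counts
          graph bar (count) if matched == 1, over(procedure) over(ethnicity) name(M1, replace)
          ```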

          Here in this paper:
          http://www.lindenconsulting.org/docu...ce_Article.pdf

          He states the following, which would be equivalent to frequency:
          In a histogram, the data are divided into non-overlapping intervals (bins), and the number of data points within each interval is counted. The graph depicts these frequency counts – the bar is centred at the midpoint of each interval – and its height reflects the average number of data points in the interval.


          • #6
            Code:
            statplot gender asa ethnicity provider, statistic(count) over(procedure) recast("bar")
            That's not a good solution to anything. It's just the category counts for procedure repeated regardless of the other variables.


            • #7
              Originally posted by Nick Cox
              Code:
              statplot gender asa ethnicity provider, statistic(count) over(procedure) recast("bar")
              That's not a good solution to anything. It's just the category counts for procedure repeated regardless of the other variables.
              Thanks for your input, Prof Cox. I suppose you wouldn’t have any other suggestions for using frequencies and presenting them in a histogram as described by Prof Linden…
              Last edited by Tara Boyle; 09 Mar 2023, 16:46.


              • #8
                I have not read Ariel's paper, but I looked at his Figures. All your data are categorical, so several of his graphs aren't pertinent.

                If by histogram you mean a bar chart of category counts, there are many ways to do it. Here are three:

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float(procedure gender asa Anesthesia ethnicity provider)
                1 1 1 0 1 0
                1 1 2 1 2 1
                1 2 2 1 3 1
                2 2 3 1 4 0
                2 1 2 0 5 1
                2 2 1 1 6 1
                2 1 1 0 8 0
                2 1 3 1 8 1
                end
                label values procedure type
                label def type 1 "THR", modify
                label def type 2 "TKR", modify
                label values gender sex
                label def sex 1 "male", modify
                label def sex 2 "female", modify
                
                set scheme s1color 
                graph bar (count), over(procedure) over(gender) name(G1, replace)
                
                * ssc install catplot 
                catplot procedure gender, recast(bar) name(G2, replace)
                
                * install from Stata Journal 
                tabplot procedure gender, showval name(G3, replace)
                In this case, but not always, catplot echoes graph bar (count).

                There are naturally many options to tune what is shown. You can also change variable roles.
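
                As a sketch of that tuning, the official graph bar command accepts label suboptions inside over() and, with asyvars, separate bar() options for each category of the first over() variable. The colours, the 45 degree angle and the graph names G4 and G5 below are arbitrary choices for illustration.

                ```stata
                clear
                input float(procedure gender)
                1 1
                1 1
                1 2
                2 2
                2 1
                2 2
                2 1
                2 1
                end
                label define type 1 "THR" 2 "TKR"
                label values procedure type
                label define sex 1 "male" 2 "female"
                label values gender sex

                * asyvars gives each procedure its own bar style; label() angles and shrinks the gender labels
                graph bar (count), over(procedure) over(gender, label(angle(45) labsize(vsmall))) ///
                    asyvars bar(1, color(green*0.4)) bar(2, color(orange*0.6)) ///
                    ytitle("Frequency") name(G4, replace)

                * swapping the over() variables changes their roles
                graph bar (count), over(gender) over(procedure) asyvars ///
                    ytitle("Frequency") name(G5, replace)
                ```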


                • #9
                  Yes, in this case I was referring to his presentation of categorical data, which is where my quotation from his article comes from; he plotted a histogram showing frequency counts at each interval.

                  I believe that in Stata a histogram is termed a bar chart. With regard to catplot and tabplot, what is the difference between the two and statplot? You are still using graph bar (count), and I thought you had not recommended this in post #6, although I may have interpreted this incorrectly.


                  • #10
                    Stata has histogram and twoway histogram and graph bar and twoway bar, and some others, and so those are the distinctions it makes.

                    Those distinctions aren't that restrictive. Only yesterday I discovered, or re-discovered, that Stata won't draw histograms with its own commands when you have analytic weights, so I cheerfully did the calculation myself and fired up twoway bar.

                    They don't necessarily bear on how users think about their graphs. Some people in statistical communities insist that a histogram is only a bar chart representation with touching bars and that (e.g.) a bar chart showing frequencies of a nominal variable is not a histogram. That's perhaps historic usage, but I am happy to think that a histogram is just a particular kind of bar chart and don't see any strong objection to any bar chart whatsoever showing frequencies, proportions, percents or densities being called a histogram. But reviewers and examiners might have prejudices on this detail.
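
                    The do-it-yourself route (calculate the bar heights, then call twoway bar) can be sketched as follows. This is a minimal unweighted version using two variables from the example data, not the analytic-weights calculation mentioned above; the 0.2 offsets, the 0.4 bar width and the names n and x are arbitrary.

                    ```stata
                    clear
                    input float(procedure gender)
                    1 1
                    1 1
                    1 2
                    2 2
                    2 1
                    2 2
                    2 1
                    2 1
                    end

                    * one observation per procedure-gender cell, with its frequency in n
                    contract procedure gender, freq(n)

                    * shift the two procedures either side of each gender code so the bars sit side by side
                    generate x = gender + cond(procedure == 1, -0.2, 0.2)

                    twoway (bar n x if procedure == 1, barwidth(0.4)) ///
                           (bar n x if procedure == 2, barwidth(0.4)), ///
                           xlabel(1 "male" 2 "female") xtitle("") ytitle("Frequency") ///
                           legend(order(1 "THR" 2 "TKR")) name(G6, replace)
                    ```

                    contract leaves out empty cells unless its zero option is specified, which matters if some category combinations do not occur.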

                    I wrote catplot in 2004 because Stata did not directly support what some years later was implemented as graph bar (count), but my syntax was necessarily different. catplot is however a wrapper for graph bar or graph hbar or graph dot depending what the user chooses, so many of its options are as documented for those official commands. I suspect that catplot is a little more versatile, and I consider that it is more transparent about how to work with percents, than the official command, but always distrust a programmer familiar with their own work. Also, I haven't used graph bar (count) that much. If it had existed in 2004 catplot would possibly not have been written.

                    tabplot is a wrapper for twoway bar and its syntax is again a mix of syntaxes. It mostly has different goals.

                    The point in #6 is quite different and was only that statplot (SSC) with your syntax doesn't do anything useful.

                    Some people prefer not to use community-contributed commands or are unable to install them because of workplace policy on downloads.

                    Otherwise the best way to find out about differences between these commands is to study the help files and run some of the examples.
