  • side by side boxplot

    Hi everyone,

    I am fairly new to Stata and could not find an answer to this in previous forum threads. I want to create multiple boxplots displaying categories of variables side by side. I have a dataset of 24 cytokine concentrations (individual variables) from an animal model, with each animal in a weight category (one BMI category variable with underweight, normal, overweight, obese). I would like to display boxplots of the cytokine concentrations in the different weight categories side by side. When I plot the cytokines and use over(bmicat), I get the boxplots of the cytokines grouped by BMI category. Like this:
    cytokine bargraph.pdf
    graph box ln_elf_ifny_2 ln_elf_il10_2 ln_elf_il12p70_2 ln_elf_il13_2 ln_elf_il1b_2, over(bmicat)

    I was hoping to have the cytokines grouped together with each grouping containing boxplots of the different category weights. So each group will have one cytokine and 4 different boxplots of cytokine concentrations broken down by weight categories side by side.

    Thank you for any help or suggestions.
    Last edited by Yue Qiu; 08 Sep 2022, 12:33.

  • #2
    This needs a change of data structure, temporary or otherwise, somewhat like this. In practice, some informative variable labels are needed, both here for what I have called a, b, c, d, e and in your case for the various ln_elf* variables.

    Code:
    * sandbox dataset 
    clear 
    set obs 100 
    set seed 2803 
    set scheme s1color 
    
    gen g = ceil(_n/25)
    label def g 1 Underweight 4 Obese 2 Normal 3 Overweight 
    label val g g 
    
    foreach v in a b c d e { 
        gen `v' = rnormal(0, 1)
    }
    
    graph box a b c d e, over(g)
    
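    * you start here using your variable names
    preserve
    * prefix each variable name with y: a becomes ya, and so on
    rename (a b c d e) (y=)
    gen long id = _n
    * long layout: one row per observation and variable, values in y, original name in which
    reshape long y, i(id) j(which) string
    * split y into y1 to y4, one variable per category of g
    separate y, by(g) veryshortlabel
    * one panel per original variable, one box per category
    graph box y?, by(which, compact note("") row(1))
    restore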
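
    With the variable names from #1, the same steps might look like the sketch below. It is untested against the real data and rests on some assumptions: bmicat is a categorical variable with a handful of values, the dataset has no existing variables named id or which, and no other variable name starts with y (reshape long y would otherwise try to pick it up, and the y? wildcard could match it).

    Code:
    preserve
    * assumption: no other variable name in the dataset starts with y
    rename (ln_elf_ifny_2 ln_elf_il10_2 ln_elf_il12p70_2 ln_elf_il13_2 ln_elf_il1b_2) (y=)
    * assumption: id and which are not already variable names
    gen long id = _n
    reshape long y, i(id) j(which) string
    separate y, by(bmicat) veryshortlabel
    graph box y?, by(which, compact note("") row(1))
    restore

    The panel titles would then be the original variable names, such as ln_elf_il10_2, which is where more informative labels would help.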


    • #3
      Thank you so much! Thank you for taking me through the process of how to do this.


      • #4
        Hello. I have the exact same boxplot question as Yue Qiu, but I don't really understand Nick's posted answer. I'm also quite new to Stata and am mostly using the drop-down windows to generate my graphs. Could you explain in words how the data should be restructured in order to have the boxplot categories placed side by side within each variable? Thank you.


        • #5
          I could try, but I won't, because you would get a lengthy explanation that would not be at all easy to grasp. I am not new to Stata but routinely find lengthy word explanations here hard to grasp. The way to break out of this dilemma is just to run the code step by step and at each stage look at the data using list and/or edit.

          For that purpose, a smaller sample than in #2 would do no harm.

          Code:
          * sandbox dataset 
          clear 
          set obs 20 
          set seed 2803 
          set scheme s1color 
          
          gen g = ceil(_n/5)
          label def g 1 Underweight 4 Obese 2 Normal 3 Overweight 
          label val g g 
          
          foreach v in a b c d e { 
              gen `v' = rnormal(0, 1)
          }
          
          graph box a b c d e, over(g)
          
          * you start here using your variable names 
          preserve 
          rename (a b c d e) (y=)
          gen long id = _n 
          reshape long y, i(id) j(which) string 
          separate y, by(g) veryshortlabel
          
          list 
          
          graph box y?, by(which, compact note("") row(1))
          restore
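
          One way to follow that advice is to list a couple of observations after each step. This is only a sketch using the same sandbox data, so run the sandbox lines above first. Observations 1 and 6 are an arbitrary choice; with 20 observations they fall in the first two groups.

          Code:
          preserve
          list g a b c d e if inlist(_n, 1, 6)

          rename (a b c d e) (y=)
          gen long id = _n
          * wide layout: one row per observation, one column per measure
          list id g ya-ye if inlist(id, 1, 6)

          reshape long y, i(id) j(which) string
          * long layout: five rows per observation, all values in the single column y
          list if inlist(id, 1, 6), sepby(id)

          separate y, by(g) veryshortlabel
          * each value of y is copied into one of y1 to y4 according to g; the other three are missing
          list id which g y1-y4 if inlist(id, 1, 6), sepby(id)
          restore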



          • #6
            Thank you. When I ran your code, I could see that it produces exactly the type of graph I'm looking for. I had more or less figured out the needed data structure after reading a number of other posts and tutorials, so I manipulated my data in Excel, then brought it back into Stata to get the graph I wanted. However, it would be much more elegant to do it all within Stata. I'll give it a try with my original dataset and your code.
