  • Counting observations in one column based on another column

    I have a dataset concerning music that has one column containing the number of plays the song has gotten, a column with the year, a column with the song name, and a column that records a "1" if the song name contains the word "today", "0" otherwise. If I wanted to count how many plays songs with the word "today" have gotten, how would I go about that? I have tried
    count plays if title_today>0
    but this does not work.

  • #2
    Please see the FAQ (http://www.statalist.org/forums/help) on how to provide a sample data set using -dataex-. Without knowing the format of the variables it's hard to help.

    If I wanted to count how many plays songs with the word "today" have gotten
    This depends on what you meant by "how many". Mean, median, mode, sum or others? You may try:

    Code:
    summarize plays if title_today > 0 & title_today < ., detail
    
    * To also see the total sum, type:
    display r(sum)
    Edit: My update also crossed with Clyde's (#3). I agree with his advice as well.
    Last edited by Ken Chui; 08 Feb 2022, 17:31.


    • #3
      I think O.P. means total (sum) of the number of plays. It also appears from his own attempt that a display of the result is wanted, not creation of a new variable, nor storage of the result in a macro. So
      Code:
      summ plays if title_today == 1, meanonly
      display `r(sum)'
      Notes:
      1. As Ken Chui suggested in #2, it would be helpful to use -dataex- and show example data. The code above will fail if certain assumptions about the nature of your data set which cannot be fully discerned from what you have shown are false. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      When asking for help with code, always show example data. When showing example data, always use -dataex-.
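As a rough illustration of what that looks like in practice, here is a minimal sketch. The three observations are made up; only the variable names plays and title_today come from the thread. It assumes -dataex- is available (built in on recent Stata versions, otherwise installed with -ssc install dataex-).

```stata
* Hypothetical toy data standing in for the real music data set
clear
input plays title_today
100 1
250 0
 40 1
end

* Post the output of this command to the forum:
* it lists the first three observations in a form others can paste and run
dataex plays title_today in 1/3
```

The output of -dataex- includes the storage types and formats of the variables, which is exactly the information that tables and screenshots leave out.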

      It is also a good idea to avoid saying that something "didn't work." There are any number of ways that code can go wrong, and it is helpful to those who want to help you if you give more precise information on just what happened. That means showing the code you used (which you did), and also any output you got from Stata in the Results window, including (especially) error messages or warnings. It is also usually a good idea, especially if there are no error messages or warnings, to show an example of the data as it was after the code was run and, unless it is blatantly obvious, point out in what way it is not what was wanted.

      2. I changed your title_today > 0 to title_today == 1. These will be, for present purposes, equivalent if there are no missing values. But title_today > 0 will be considered true for any observation where title_today is missing, which is probably not what you want. title_today == 1 is safer.
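The difference can be seen in a minimal sketch with a made-up four-song data set, one of which has a missing title_today. The values are hypothetical; only the variable names plays and title_today come from the thread.

```stata
* Hypothetical toy data: the last song has a missing title_today
clear
input plays title_today
100 1
250 0
 40 1
 75 .
end

* > 0 also matches the missing value, because Stata treats . as larger than any number
summarize plays if title_today > 0, meanonly
display r(sum)    // 215 = 100 + 40 + 75

* == 1 leaves the observation with missing title_today out
summarize plays if title_today == 1, meanonly
display r(sum)    // 140 = 100 + 40
```

Adding the condition title_today < . to the first command, as in #2, would exclude the missing observation in the same way.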
      Last edited by Clyde Schechter; 08 Feb 2022, 17:32.
