  • Working with variables

    Hi all,

    I'm hoping you may be able to help/point me to guidance for some rather basic questions.

    1) I have a list of ages, with a lot of missing data. To get the mean age is it best to use mean(AgeVar) or should I drop the missing values?

    2) I'd like to create a new var for someone who has read topicXbin and topicYbin - I have 10 different topic_bins, and I want to create a new var for anyone who has a '1' in any 2 of these topic_bins. What's the best way of doing this?

    3) Similarly, I have two binary variables and I would like to create a new one where someone is assigned a '1' if they have a '1' in either of the two variables, and a '0' if they have a '0' in both of them.

    4) Is it possible to sum across variables? For example, I have a 1/0 value in topic1bin, topic2bin, and topic3bin. Can I create a new var (topictotalbin) that would add all of the 1s in the three variables?

    Thanks in advance for any guidance, and please let me know if any of the above isn't clear.

  • #2
    It is possible to guess at what you want, but the standing advice in FAQ #12 to give data examples remains good, both for future questions and if any of these guesses looks puzzling.

    1) There is no difference. Every calculation of mean age, or of the mean of any other numeric variable, using inbuilt commands will ignore missing values anyway. Don't drop what Stata will ignore: doing so may compromise other analyses. (Conversely, what difference do you imagine there would be? How would the average of 1, 2, 3 and missing differ from the average of 1, 2, 3?)
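
    A minimal sketch of that point, using made-up toy values (the variable name age and the numbers are assumptions for illustration only): summarize reports the mean over the three nonmissing observations.

    ```stata
    * Toy data: three known ages and one missing value (hypothetical numbers)
    clear
    input age
    1
    2
    3
    .
    end

    * summarize uses only the nonmissing observations
    summarize age

    * r(N) counts nonmissing values; r(mean) is their average
    assert r(N) == 3
    assert r(mean) == 2
    ```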

    2) This sounds like

    Code:
    egen wanted = rowtotal(topic*bin)
    replace wanted = wanted >= 2
    assuming that "any 2" means "2 or more" -- or

    Code:
    replace wanted = wanted == 2
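    if it means "precisely 2". Use one replace or the other, not both in sequence: after the first, wanted holds only 0 and 1, so it can never equal 2.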
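
    One way to keep both readings side by side is to hold the row total in its own variable and generate each indicator from it. This is only a sketch on toy data: the names ntopics, atleast2 and exactly2 are hypothetical, and three topic variables stand in for the ten. Note that rowtotal() treats missing values as zero.

    ```stata
    * Toy data with three of the topic indicators (hypothetical values)
    clear
    input topic1bin topic2bin topic3bin
    1 1 0
    1 0 0
    1 1 1
    1 . 1
    end

    * Count the 1s in each row; missing values count as 0
    egen ntopics = rowtotal(topic*bin)

    * Two separate indicators, one for each reading of "any 2"
    gen atleast2 = ntopics >= 2
    gen exactly2 = ntopics == 2

    list

    assert atleast2 == 1 in 1
    assert atleast2 == 0 in 2
    assert exactly2 == 0 in 3
    assert exactly2 == 1 in 4
    ```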

    3) Something like

    Code:
    gen answer = max(q1, q2)
    will map two zeros to zero and one or two ones to one.
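
    If either variable can be missing, it may help to know that max() ignores missing arguments and returns missing only when all of its arguments are missing. A sketch on toy data, where the variable strict is a hypothetical alternative that stays missing unless both inputs are known:

    ```stata
    * Toy data covering the complete and the incomplete cases
    clear
    input q1 q2
    0 0
    1 0
    1 1
    0 .
    . .
    end

    * max() skips missing arguments
    gen answer = max(q1, q2)

    * Stricter version: missing unless both q1 and q2 are nonmissing
    gen strict = (q1 == 1 | q2 == 1) if !missing(q1, q2)

    list

    assert answer == 0 in 1
    assert answer == 1 in 2/3
    assert answer == 0 in 4
    assert missing(answer) in 5
    assert missing(strict) in 4/5
    ```

    Which behaviour is right depends on whether a 0 paired with a missing value should count as a known 0.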

    4) Already answered above: rowtotal() in 2) sums across variables. Consider also just plain addition such as

    Code:
    gen newvar = topic1bin + topic2bin + topic3bin
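
    The two approaches differ when values are missing: plain addition returns missing if any term is missing, while rowtotal() treats missing as zero. A sketch on toy data (byplus and byrowtotal are hypothetical names):

    ```stata
    * Toy data: one complete row and one row with a missing value
    clear
    input topic1bin topic2bin topic3bin
    1 1 1
    1 . 1
    end

    * Plain addition: any missing term makes the result missing
    gen byplus = topic1bin + topic2bin + topic3bin

    * rowtotal(): missing values are treated as 0
    egen byrowtotal = rowtotal(topic1bin topic2bin topic3bin)

    list

    assert byplus == 3 in 1
    assert missing(byplus) in 2
    assert byrowtotal == 2 in 2
    ```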

    See help egen and if need be https://www.stata-journal.com/sjpdf....iclenum=pr0046

    Last edited by Nick Cox; 16 Jun 2018, 04:26.



    • #3
      Hi Nick,

      Thank you for the quick reply! This is incredibly helpful.

      Everything works great, with the exception of summing the variables. It worked once, when I tested it, but now that I've applied it to create a few new variables, it doesn't seem to be catching all of the 1s in the other variables. Will have to try again tomorrow, and hope for better luck.

      Thanks again!



      • #4
        Any problem can't be a matter of luck, but to explain what's wrong, we have to see what's wrong. Data example and code, please. (But missing values might be an issue.)
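
        A possible first check for the missing-value explanation, sketched on toy data (the values and the name nmiss are assumptions, not your data): count the missing values in each row and look at the rows that have any.

        ```stata
        * Toy data standing in for the real topic variables
        clear
        input topic1bin topic2bin topic3bin
        1 1 1
        1 . 1
        0 0 .
        end

        * Number of missing values across the three variables, row by row
        egen nmiss = rowmiss(topic1bin topic2bin topic3bin)

        * How many rows have at least one missing value?
        count if nmiss > 0
        assert r(N) == 2

        * Inspect those rows
        list if nmiss > 0
        ```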
